choishingwan / PRSice

A software package for calculating, applying, evaluating and plotting the results of polygenic risk scores
http://prsice.info
GNU General Public License v3.0

error with --allow-inter #234

Closed dorks81 closed 3 years ago

dorks81 commented 4 years ago

Hi Sam,

I provided a list of bgen files when using PRSice v2.3.3. I noticed that your user guide suggests using --allow-inter to speed up the clumping analysis. However, when I add --allow-inter to the command, I get an error. Am I using the flag correctly?

Calculating allele frequencies: 0.00%
malloc(): unsorted double linked list corrupted
Error: Execution halted

Rscript PRSice/PRSice-2.3.3/PRSice.R --prsice /PRSice/PRSice-2.3.3/PRSice_linux --base baseQC.csv --type bgen --allow-inter --target-list target.list --extract quickQC.snps.list --keep phenotype.csv --pheno phenotype.csv --pheno-col binary --cov covar.csv --cov-col SEX --thread 2 --binary-target T --stat OR --out binary

choishingwan commented 4 years ago

Likely caused by insufficient memory.

see if this works https://www.dropbox.com/s/z72s11kasx71891/PRSice_linux?dl=0

(I vaguely remember I did some kinda fixing on this version, but feel like I need some more testing before I push this out. So there might be other errors. Unfortunately, am currently busy on other stuff, can't revisit until next week)

Sam

dorks81 commented 4 years ago

Thank you Sam,

I ran my script with the file you provided above and received a different error. I used the same bgen files with v2.2.13 and no error was thrown. I ran the bgen file through QCTools -sample-stats, which reported 80,000 samples. The sample file contains 80,002 lines, where the first two lines are "ID_1 ID_2 MISSING" and "0 0 0".

Loading Genotype info from target

Error: Number of sample in phenotype file does not match number of samples specified in bgen file. Please check you have the correct phenotype file input. Note: Phenotype file should have the same number of samples as the bgen file and they should appear in the same order

Loading Genotype info from target
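Editor's note: the sanity check described in this comment (80,002 lines in the sample file, two of them header lines, so 80,000 samples) can be reproduced with a few lines of Python. This is a minimal sketch, not part of PRSice: the function name `count_samples` and the file name in the usage comment are hypothetical, and it assumes the sample file follows the layout quoted above, with exactly two header lines before the per-sample rows.

```python
def count_samples(sample_path):
    """Count per-sample rows in a .sample file with two header lines.

    Assumes the layout described in the thread: line 1 holds column
    names (ID_1 ID_2 MISSING ...), line 2 holds column types (0 0 0 ...),
    and every following non-blank line is one sample.
    """
    with open(sample_path) as handle:
        # Ignore blank lines so a trailing newline does not inflate the count.
        rows = [line for line in handle if line.strip()]
    if len(rows) < 2:
        raise ValueError("expected at least two header lines in the sample file")
    return len(rows) - 2


# Example usage (hypothetical file name):
# n = count_samples("bgen.sample")
# print(f"{n} samples listed in the sample file")
```

The number returned should match the sample count reported for the bgen file itself; a mismatch would point to a sample file that belongs to a different data set.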

choishingwan commented 4 years ago

Have you got the command you used and the full log?


-- Dr Shing Wan Choi Postdoctoral Fellow Genetics and Genomic Sciences Icahn School of Medicine, Mount Sinai, NYC

dorks81 commented 4 years ago

Hi Sam,

Below is the script command followed by the log. Without the --allow-inter, the command works with v2.2.13

Rscript PRSice/PRSice-2.3.3/PRSice.R --prsice /PRSice/PRSice-2.3.3/PRSice_linux --base baseQC.csv --type bgen --allow-inter --target-list target.list --extract quickQC.snps.list --keep phenotype.csv --pheno phenotype.csv --pheno-col binary --cov covar.csv --cov-col SEX --thread 2 --binary-target T --stat OR --out binary # PRSice_linux is the new binary

PRSice 2.3.4 (2020-09-01)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2020-09-30 09:27:32
./PRSice_linux \
    --a1 A1 \
    --a2 A2 \
    --allow-inter \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base baseQC.csv \
    --binary-target T \
    --bp BP \
    --chr CHR \
    --clump-kb 250kb \
    --clump-p 1.000000 \
    --clump-r2 0.100000 \
    --cov chr21.covar.csv \
    --cov-col SEX \
    --extract quickQC.snps.list \
    --interval 5e-05 \
    --keep phenotype.csv \
    --lower 5e-08 \
    --num-auto 22 \
    --or \
    --out binary \
    --pheno phenotype.csv \
    --pheno-col binary \
    --pvalue P \
    --seed 781449197 \
    --snp SNP \
    --stat OR \
    --target-list target.list \
    --thread 2 \
    --type bgen \
    --upper 0.5

Initializing Genotype info from file: target.list (bgen)

Start processing chr21.baseQC

Only one column detected, will assume only SNP ID is provided

Base file: /group/research/mvptest/TestData/PRSice/v2.2.8/plink_list/QC/chr21.baseQC.csv
Header of file is: CHR BP SNP A1 A2 OR P

Reading 100.00%
133134 variant(s) observed in base file, with:
133134 total variant(s) included from base file

Loading Genotype info from target

Error: Number of sample in phenotype file does not match number of samples specified in bgen file. Please check you have the correct phenotype file input. Note: Phenotype file should have the same number of samples as the bgen file and they should appear in the same order

Error: Execution halted

choishingwan commented 4 years ago

You might want to provide the sample file in addition to the phenotype file. When the sample file isn't provided, PRSice doesn't know to read from it and relies completely on the phenotype file. --target-list target.list,

I should really make that error message clearer. Will put that in my to do list.
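Editor's note: the error message quoted above asks for the phenotype file to list the same samples as the bgen data, in the same order, when no sample file is supplied. Before rerunning, that condition can be checked offline. The sketch below is not PRSice code and does not reproduce PRSice's own matching logic. It assumes a sample file with two header lines, a phenotype file with one header line, and sample IDs in the first two columns of both; the function names, file names, and the delimiter default are all hypothetical and should be adjusted to the real files.

```python
def read_ids(path, header_lines, sep=None):
    """Return the first two fields of each data row as (id1, id2) tuples.

    sep=None splits on any whitespace; pass "," for comma-separated files.
    """
    ids = []
    with open(path) as handle:
        for index, line in enumerate(handle):
            # Skip the header lines and any blank lines.
            if index < header_lines or not line.strip():
                continue
            fields = line.strip().split(sep)
            ids.append(tuple(fields[:2]))
    return ids


def compare_sample_and_pheno(sample_path, pheno_path, pheno_sep=None):
    """Summarise whether the two files list the same samples in the same order."""
    sample_ids = read_ids(sample_path, header_lines=2)
    pheno_ids = read_ids(pheno_path, header_lines=1, sep=pheno_sep)
    return {
        "n_sample": len(sample_ids),
        "n_pheno": len(pheno_ids),
        "same_ids": set(sample_ids) == set(pheno_ids),
        "same_order": sample_ids == pheno_ids,
    }


# Example usage (hypothetical file names):
# print(compare_sample_and_pheno("bgen.sample", "phenotype.csv", pheno_sep=","))
```

If the counts or the order differ, supplying the sample file explicitly, as suggested in this comment, is the simpler route than reshaping the phenotype file.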


dorks81 commented 4 years ago

Hi Sam,

Thank you for the quick replies. v2.3.3 is now working with my list of bgen files and the --allow-inter flag. You were correct that I needed to use --target-list target.list,bgen.sample for the command to work. Below is my script command:

Rscript PRSice/PRSice-2.3.3/PRSice.R \
    --prsice /PRSice/PRSice-2.3.3/PRSice_linux \
    --base baseQC.csv \
    --type bgen \
    --allow-inter \
    --target-list target.list,bgen.sample \
    --extract quickQC.snps.list \
    --keep phenotype.csv \
    --pheno phenotype.csv \
    --pheno-col binary \
    --cov covar.csv \
    --cov-col SEX \
    --thread 2 \
    --binary-target T \
    --stat OR \
    --out binary

dorks81 commented 3 years ago

Hello Sam,

Thank you for the help. Your suggestions did solve my issues with my analysis. Please go ahead and close this issue.