choishingwan / PRSice

A software package for calculating, applying, evaluating and plotting the results of polygenic risk scores
http://prsice.info
GNU General Public License v3.0

Differences in 2.2.11 vs. 2.2.12 - Variants filtered due to "mismatch information" #185

Closed nannabarnkob closed 4 years ago

nannabarnkob commented 4 years ago

Hi PRSice-2 team.

I have some questions about differences between PRSice 2.2.12 (2020-02-20) and PRSice 2.2.11.b (2019-10-16). I looked through the update logs and the commits but didn't have much luck. I ran PRSice twice, once with each version, using mostly default settings, and in the new version all my variants are removed due to "mismatch information". Can you help me understand why that happens? I am also unsure what "mismatch" actually refers to.

Here are logs of my runs:

New version:

PRSice 2.2.12 (2020-02-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2020-02-26 17:29:46
/data/mvn373-scratch/prs-repo/code_nanna/PRSice2_2_12/PRSice_linux \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base WANG_WHRadj_combined.ma \
    --beta  \
    --binary-target F \
    --clump-kb 250kb \
    --clump-p 1.000000 \
    --clump-r2 0.100000 \
    --interval 5e-05 \
    --lower 5e-08 \
    --num-auto 22 \
    --out PRSice \
    --pvalue p \
    --seed 3542501148 \
    --stat b \
    --target UKBBK_females \
    --thread 1 \
    --upper 0.5

Initializing Genotype file: UKBBK_females (bed)

Start processing WANG_WHRadj_combined
==================================================

Reading 100.00%
Base file: WANG_WHRadj_combined.ma
5554549 variant(s) observed in base file, with:
861191 ambiguous variant(s) excluded
4693358 total variant(s) included from base file

Loading Genotype info from target
==================================================

1747 people (0 male(s), 1747 female(s)) observed
1747 founder(s) included

88402265 variant(s) not found in previous data
4693358 variant(s) with mismatch information
0 variant(s) included

Error: No vairant remained!

Old version:

https://github.com/choishingwan/PRSice
(C) 2016-2019 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2020-02-26 17:35:14
/data/mvn373-scratch/prs-repo/code_nanna//PRSice_linux \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base WANG_WHRadj_combined.ma \
    --beta  \
    --binary-target F \
    --clump-kb 250 \
    --clump-p 1.000000 \
    --clump-r2 0.100000 \
    --interval 5e-05 \
    --lower 5e-08 \
    --out PRSice \
    --pvalue p \
    --seed 1177416519 \
    --stat b \
    --target UKBBK_females \
    --thread 1 \
    --upper 0.5

Initializing Genotype file: UKBBK_females (bed)

Start processing WANG_WHRadj_combined
==================================================

Reading 100.00%
Base file: WANG_WHRadj_combined.ma
5554549 variant(s) observed in base file, with:
861191 ambiguous variant(s) excluded
4693358 total variant(s) included from base file

Loading Genotype info from target
==================================================

1747 people (0 male(s), 1747 female(s)) observed
1747 founder(s) included

88402265 variant(s) not found in previous data
4693358 variant(s) included

There are a total of 1 phenotype to process

Start performing clumping

1018721 MB RAM detected; reserving 4 MB for clumping

Allocated 4 MB successfully

Clumping Progress: 100.00%

Number of variant(s) after clumping : 134333

Processing the 1 th phenotype
Phenotype is a continuous phenotype
1747 sample(s) with valid phenotype

Preparing Output Files

Start Processing
Processing 100.00%
There are 1 region(s) with p-value between 0.1 and 1e-5
(may not be significant);

Thank you for your help!

choishingwan commented 4 years ago

Yes, that's a bug that occurs when --chr and --bp are not provided. I have solved this problem but am encountering another issue at the moment. I will do a quick release as soon as I have fixed that.

Dr Shing Wan Choi, Postdoctoral Fellow, Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, NYC
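For readers wondering what a "mismatch" count like the one in the 2.2.12 log could mean, here is a minimal Python sketch. It is not PRSice's actual code. It assumes that "mismatch information" refers to the chromosome or base-pair position of a variant disagreeing between the base and target files, which is what the mention of --chr and --bp suggests. The `Variant` class and both functions are hypothetical names made up for illustration. The sketch contrasts a check that skips missing coordinates with a faulty one that treats a missing coordinate as a disagreement, which would flag every variant when the base file has no chromosome or position columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical record for one variant; not a PRSice data structure."""
    rsid: str
    chrom: Optional[str] = None  # None when the file has no chromosome column
    bp: Optional[int] = None     # None when the file has no position column


def has_mismatch(base: Variant, target: Variant) -> bool:
    """Flag a mismatch only when both files supply a value and the values differ."""
    chrom_differs = (
        base.chrom is not None
        and target.chrom is not None
        and base.chrom != target.chrom
    )
    bp_differs = (
        base.bp is not None
        and target.bp is not None
        and base.bp != target.bp
    )
    return chrom_differs or bp_differs


def has_mismatch_faulty(base: Variant, target: Variant) -> bool:
    """Illustrative faulty check: a missing coordinate compares as unequal."""
    return base.chrom != target.chrom or base.bp != target.bp


# Base file without chromosome/position columns, target file with both.
base_variant = Variant("rs123")
target_variant = Variant("rs123", chrom="1", bp=12345)

print(has_mismatch(base_variant, target_variant))         # False
print(has_mismatch_faulty(base_variant, target_variant))  # True
```

Under these assumptions, the faulty check would report every base variant as mismatched, matching the pattern in the log where the mismatch count equals the number of variants included from the base file.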


nannabarnkob commented 4 years ago

All right, thank you for the quick response!

Nanna

choishingwan commented 4 years ago

Should be fixed in 2.2.13