Open LarsOstman opened 1 year ago
What's the header of your pc file?
On Fri, Aug 18, 2023, 2:58 AM LarsOstman wrote:
Hello, I am trying to calculate a PRS-score, with PRSice2, on a case-control-cohort based on summary statistics from a larger GWAS-study. I have calculated principal components and want to use the first 6 PCs as covariates for the analysis. However, when I run the analysis I get the following error message:
Error: All samples removed due to missingness in covariate file!
I have made sure there aren't any hidden spaces in the covariates-file, I have tried to delimit with both tabs and spaces, and I have checked (and re-checked) that the path and the file-name are correct. However the same error-message keeps showing up.
Any help would be greatly appreciated, I will paste in the whole process below.
Thanks for a great product, Lars
@.***:/fenix/users/laros/ALF/Genetics/scripts$ ./ALF_PRS_by_group.sh
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF. PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data. GigaScience 8, no. 7 (July 1, 2019)
2023-08-17 13:54:23
/home/laros/PRSice2/PRSice_linux
--a1 A1
--a2 A2
--bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1
--base /fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
--binary-target T
--clump-kb 250kb
--clump-p 1.000000
--clump-r2 0.100000
--cov /fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
--ignore-fid
--interval 5e-05
--keep-ambig
--ld /fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr
--ld-keep /fenix/users/laros/Elefanten_gene/LD-data/1000genomes/1000Genomes_EURListPhase3.txt
--lower 1e-11
--num-auto 22
--or
--out /fenix/users/laros/Elefanten_gene/results/ALF_gene_by_group
--pheno /fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
--pheno-col MDD
--pvalue P
--score std
--seed 3270214622
--snp MarkerName
--stat LogOR
--target /fenix/users/laros/ALF/Genetics/data/ALF_gene.QC
--thread 1
--upper 0.05
Warning: By selecting --keep-ambig, PRSice assume the base and target are reporting alleles on the same strand and will therefore only perform dosage flip for the ambiguous SNPs. If you are unsure of what the strand is, then you should not select the --keep-ambig option
Initializing Genotype file: /fenix/users/laros/ALF/Genetics/data/ALF_gene.QC (bed)
Start processing PGC_UKB_depression_genome-wide
Base file: /fenix/users/laros/Elefanten_gene/summary_stat/PGC_UKB_depression_genome-wide.txt
Header of file is:
MarkerName A1 A2 Freq LogOR StdErrLogOR P
Reading 100.00%
8483301 variant(s) observed in base file, with:
39487 NA stat/p-value observed
4210543 negative statistic observed. Maybe you have forgotten the --beta flag?
646120 ambiguous variant(s)
4233271 total variant(s) included from base file
Loading Genotype info from target
92 people (0 male(s), 0 female(s)) observed
92 founder(s) included
4112097 variant(s) not found in previous data
43 variant(s) with mismatch information
522636 ambiguous variant(s) kept
3460831 variant(s) included
Initializing Genotype file: /fenix/users/laros/Elefanten_gene/LD-data/1kg_phase3.AllChr (bed)
Loading Genotype info from reference
2504 people (0 male(s), 0 female(s)) observed
503 founder(s) included
10540328 variant(s) not found in previous data
149 variant(s) with mismatch information
469778 ambiguous variant(s) kept
3104546 variant(s) included
Phenotype file: /fenix/users/laros/ALF/Genetics/data/ALF_gene.pheno
Column Name of Sample ID: FID
Note: If the phenotype file does not contain a header, the column name will be displayed as the Sample ID which is expected.
There are a total of 1 phenotype to process
Start performing clumping
Clumping Progress: 100.00%
Number of variant(s) after clumping : 188356
Processing the 1 th phenotype
MDD is a binary phenotype
35 control(s)
57 case(s)
Processing the covariate file: /fenix/users/laros/ALF/Genetics/data/ALF_gene.PCs
Error: All samples removed due to missingness in covariate file!
Hi, Thank you for getting back to me!
The headers (and format) are as follows:
FID IID PC1 PC2 PC3 PC4 PC5 PC6
F1 F1 -0.0488942 -0.00648387 0.0119713 0.0394345 -0.0165522 0.0617235
F2 F2 -0.0499371 0.0127898 0.0426918 0.0412524 -0.0538963 0.0342523
F4 F4 0.0154813 0.0156588 0.0044783 -0.00596863 -0.023635 0.00985086
F5 F5 -0.0147007 0.00670695 0.0355421 0.00302993 -0.0671668 -0.00930397
F6 F6 -0.0259049 -0.0069673 -0.0347271 -0.0398622 0.015978 0.0781486
F8 F8 -0.0345881 0.0205085 -0.0136661 0.0191272 -0.0209368 0.0631035
F9 F9 -0.0259158 0.0119127 0.0224861 0.0451637 -0.0516346 0.0112552
The columns are tab-delimited in the file, but I've tried space-delimited as well and get the same error message.
Thanks again, Lars
Thought I'd add that it is just the .eigenvec output file from the PC analysis, which I haven't made any changes to.
Lars
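You used --ignore-fid, but your covariate file still has an FID column. With --ignore-fid, PRSice takes the first column as the sample ID by default. In addition, because you did not specify which covariates to use, PRSice uses all non-ID fields, which here includes the IID column, and those IID values are not numbers, so every sample counts as missing. The easy fix is --cov-col @PC[1-6]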
Sam
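As a sketch of the fix Sam describes, the original call could be rerun with the covariate columns named explicitly. All paths and options below are copied from the log earlier in the thread; the only addition is `--cov-col`. The variable names (`PRSICE`, `DATA`, `REF`, `COV_COL`) are made up for this example, and the pattern is quoted on the assumption that the shell should not be allowed to treat `[1-6]` as a filename pattern.

```shell
#!/bin/sh
# Sketch only: same options as in the log above, plus an explicit --cov-col.
# Variable names are illustrative, not part of PRSice.
PRSICE=/home/laros/PRSice2/PRSice_linux
DATA=/fenix/users/laros/ALF/Genetics/data
REF=/fenix/users/laros/Elefanten_gene

# Quote the pattern so the shell passes it to PRSice unchanged.
COV_COL='@PC[1-6]'

if [ -x "$PRSICE" ]; then
  "$PRSICE" \
    --a1 A1 --a2 A2 \
    --bar-levels 1e-05,5e-05,0.0001,0.0005,0.001,0.005,0.01,0.05,1 \
    --base "$REF/summary_stat/PGC_UKB_depression_genome-wide.txt" \
    --binary-target T \
    --clump-kb 250kb --clump-p 1.000000 --clump-r2 0.100000 \
    --cov "$DATA/ALF_gene.PCs" \
    --cov-col "$COV_COL" \
    --ignore-fid \
    --interval 5e-05 \
    --keep-ambig \
    --ld "$REF/LD-data/1kg_phase3.AllChr" \
    --ld-keep "$REF/LD-data/1000genomes/1000Genomes_EURListPhase3.txt" \
    --lower 1e-11 --num-auto 22 --or \
    --out "$REF/results/ALF_gene_by_group" \
    --pheno "$DATA/ALF_gene.pheno" --pheno-col MDD \
    --pvalue P --score std --seed 3270214622 \
    --snp MarkerName --stat LogOR \
    --target "$DATA/ALF_gene.QC" \
    --thread 1 --upper 0.05
else
  # Nothing is run when the binary is not available at this path.
  echo "PRSice binary not found at $PRSICE; nothing was run" >&2
fi
```

With `--ignore-fid` still set, the first column (FID) continues to serve as the sample ID; in this file FID and IID are identical, so the samples should still match.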
Thank you so much, and I apologize for taking up your time with such a simple question. I'll fix it straight away. Just to check that I understand: would another solution be to remove the FID column from the covariate file, since that would make IID the first column and thus the default ID?
Thank you once again!
Lars
Yes
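The alternative confirmed here, dropping the FID column so that IID becomes the first column, could be sketched as below. The function name `drop_first_column` is hypothetical, and the demonstration rows are a shortened copy (two PCs only) of the file shown earlier in the thread. The awk script accepts tab- or space-delimited input and writes tab-delimited output.

```shell
#!/bin/sh
# Sketch: remove the first (FID) column of a whitespace-delimited file.
# drop_first_column is an illustrative helper name, not a PRSice or PLINK tool.
drop_first_column() {
  awk 'BEGIN { OFS = "\t" }
       {
         out = $2                          # start from the second column (IID)
         for (i = 3; i <= NF; i++) {       # append the remaining columns
           out = out OFS $i
         }
         print out
       }' "$1"
}

# Demonstration on a shortened copy of the covariate file from the thread.
tmp=$(mktemp)
printf 'FID\tIID\tPC1\tPC2\n' > "$tmp"
printf 'F1\tF1\t-0.0488942\t-0.00648387\n' >> "$tmp"
printf 'F2\tF2\t-0.0499371\t0.0127898\n' >> "$tmp"
drop_first_column "$tmp"
rm -f "$tmp"
```

On the real data this would be run against the covariate file and redirected to a new file, which is then passed to `--cov`; the original file is best left in place.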