Closed: varsh19 closed this issue 1 year ago
Check whether sex is encoded as a numeric variable. If it is not, make sure it is included in --cov-factor to tell PRSice that it is not numeric. Otherwise, PRSice treats it as missing.
Sam
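For anyone hitting the same error, here is a minimal sketch of how one might check which covariate columns are non-numeric before running PRSice. It is not part of PRSice. The function name `non_numeric_columns`, the column names (`IID`, `Age`, `Sex`), the sample rows, and the set of strings treated as missing are all assumptions made for illustration; adjust them to match your own file.

```python
import io


def non_numeric_columns(handle, id_columns=("FID", "IID"), missing=("", "NA")):
    """Return names of covariate columns holding at least one non-numeric value.

    `id_columns` and `missing` are assumptions for this sketch, not PRSice settings.
    """
    # Split on any run of whitespace so tab- and space-delimited files both parse.
    rows = [line.split() for line in handle if line.strip()]
    header, body = rows[0], rows[1:]
    flagged = []
    for idx, name in enumerate(header):
        if name in id_columns:
            continue  # sample ID columns are not covariates
        for row in body:
            value = row[idx] if idx < len(row) else ""
            if value in missing:
                continue  # skip placeholders for missing values
            try:
                float(value)
            except ValueError:
                flagged.append(name)  # first non-numeric value flags the column
                break
    return flagged


# Hypothetical covariate file contents; replace with open("your_covariates.txt").
example = io.StringIO(
    "IID Age Sex\n"
    "1001 54 Male\n"
    "1002 61 Female\n"
    "1003 47 Female\n"
)
print(non_numeric_columns(example))
```

Any column reported by this check is a candidate for --cov-factor (for example `--cov-factor Sex` if the column is really named `Sex`), or it could be recoded to numbers in the covariate file instead.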
Thank you so much; it worked!
Describe the bug
I am running PRSice-2 with a phenotype file and a covariate file that contains age and sex as covariates. My data have no FID, so I am using the --ignore-fid flag.
Error Log
PRSice 2.3.5 (2021-09-20)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2023-06-26 13:45:33
./PRSice/PRSice_linux \
    --a1 A1 \
    --a2 A2 \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base PGS000785.txt \
    --beta \
    --binary-target T \
    --bp BP \
    --chr CHR \
    --cov cases_controls_new/covariates_white.txt \
    --ignore-fid \
    --interval 5e-05 \
    --lower 5e-08 \
    --no-clump \
    --num-auto 22 \
    --out PGS000785_results \
    --pheno /home/vsrinivasan75/cases_controls_new/pheno_white.txt \
    --pvalue P \
    --seed 464394053 \
    --snp SNP \
    --stat BETA \
    --target /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr#_v3,/home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr1_v3.sample \
    --thread 1 \
    --type bgen \
    --upper 0.5
Initializing Genotype file: /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr#_v3 (bgen)
With external fam file: /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr1_v3.sample
Start processing PGS000785 ==================================================
Base file: PGS000785.txt
Header of file is:
SNP CHR BP A1 A2 BETA allelefrequency_effect OR P
Reading 100.00%
103 variant(s) observed in base file, with:
9 ambiguous variant(s) excluded
94 total variant(s) included from base file
Loading Genotype info from target ==================================================
487409 people (222965 male(s), 264262 female(s)) observed
487409 founder(s) included
7402K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr1_v3.bgen
8129K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr2_v3.bgen
6696K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr3_v3.bgen
6555K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr4_v3.bgen
6070K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr5_v3.bgen
5751K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr6_v3.bgen
5405K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr7_v3.bgen
5282K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr8_v3.bgen
4066K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr9_v3.bgen
4562K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr10_v3.bgen
4628K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr11_v3.bgen
4431K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr12_v3.bgen
3270K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr13_v3.bgen
3037K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr14_v3.bgen
2767K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr15_v3.bgen
3089K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr16_v3.bgen
2660K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr17_v3.bgen
2599K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr18_v3.bgen
2087K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr19_v3.bgen
2082K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr20_v3.bgen
1261K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr21_v3.bgen
1255K SNPs processed in /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr22_v3.bgen
93095529 variant(s) not found in previous data
94 variant(s) included
Phenotype file: /home/vsrinivasan75/cases_controls_new/pheno_white.txt
Column Name of Sample ID: IID
Note: If the phenotype file does not contain a header, the column name will be displayed as the Sample ID which is expected.
There are a total of 1 phenotype to process
Processing the 1 th phenotype
PHENO is a binary phenotype
73947 sample(s) without phenotype
407733 control(s)
5729 case(s)
Processing the covariate file: cases_controls_new/covariates_white.txt ==============================
Error: All samples removed due to missingness in covariate file!
Error: Execution halted
To Reproduce
This is the command I used:
Rscript PRSice/PRSice.R \
    --base PGS000785.txt \
    --target /home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr#_v3,/home/sharedFolder/referenceData/ukb/imputed_genotypes/ukb_imp_chr1_v3.sample \
    --type bgen \
    --stat BETA \
    --binary-target T \
    --no-clump \
    --ignore-fid \
    --pheno /home/vsrinivasan75/cases_controls_new/pheno_white.txt \
    --quantile 100 \
    --quant-break 10,20,30,40,50,60,70,80,90,100 \
    --out PGS000785_results \
    --prsice PRSice/PRSice_linux \
    --cov cases_controls_new/covariates_white.txt
Additional context
I am performing the analysis only on cases of "white" ethnic background, so my phenotype file contains information only for white people with colorectal cancer. I tried two covariate files, one for white individuals and one for all individuals in the UK Biobank, but both yield the same error. When I run the tool without a covariate file, it works perfectly. Does the covariate file have to include PCs? Mine has just age and sex.