Error when inputting bgen files

chenpunk commented 5 months ago

Hi, I'm trying to generate a PRS from a pre-QC'd and clumped SNP list in UK Biobank data using PRSice-2, and I ran into a problem when reading the bgen files.

My log is here:

    PRSice 2.3.5 (2021-09-20)
    https://github.com/choishingwan/PRSice
    (C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
    GNU General Public License v3
    If you use PRSice in any published work, please cite:
    Choi SW, O'Reilly PF. PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data. GigaScience 8, no. 7 (July 1, 2019)
    2024-03-29 04:19:50
    ./PRSice \
        --a1 A1 \
        --a2 A2 \
        --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
        --base result.txt \
        --base-info INFO:0.8 \
        --beta \
        --binary-target F \
        --chr CHR \
        --clump-kb 250kb \
        --clump-p 1.000000 \
        --clump-r2 0.100000 \
        --interval 5e-05 \
        --lower 5e-08 \
        --num-auto 22 \
        --out PRSice \
        --pvalue P \
        --seed 573448311 \
        --snp SNP \
        --stat BETA \
        --thread 1 \
        --upper 0.5

Error: You must provide a target file or a file containing all target prefixs!

However, I believe I did provide the path to the bgen files. My original command reads:

    Rscript ./PRSice.R \
        --dir . \
        --prsice ./PRSice \
        --base result.txt \
        --base-info INFO:0.8 \
        --stat BETA \
        --type bgen \
        --target /mnt/project/Bulk/Imputation/'UKB imputation from genotype'/ukb22828_c#_b0_v3.bgen,/mnt/project/Bulk/Imputation/'UKB imputation from genotype'/ukb22828_c#_b0_v3.sample \
        --keep ./PRS_ID.txt \
        --remove ./rel.king.cutoff.in.id \
        --extract qc.clumps \
        --keep-ambig \
        --info 0.8 \
        --pheno PRS_phe.txt \
        --cov PRS_cov.txt \
        --cov-factor sex,income \
        --thread max \
        --no-clump \
        --quantile 10 \
        --out BD

I'm not sure what is causing this. I would really appreciate it if you could give me some advice. Thank you!

zhilongjia commented 2 months ago

It seems PRSice does not replace the # with the actual chromosome number in the name of the bgen sample file, although it does replace the # in the bgen file name given in the target parameter.

The help text only describes this for PLINK binary target files: "If the binary file is separated into individual chromosomes, then an # can be used to specify the location of the chromosome number in the file name. PRSice will automatically substitute # with 1-22"

--type bgen --target $HOME/ukb2/UKB_genetic/bgen/ukb_c#_b0_v3,$HOME/ukb2/UKB_genetic/bgen/ukb_c#_b0_v3.sample

Initializing Genotype file: ~/ukb2/UKB_genetic/bgen/ukb_c#_b0_v3 (bgen)
With external fam file: ~/ukb2/UKB_genetic/bgen/ukb_c#_b0_v3.sample

Error: Cannot open file: ~/ukb2/UKB_genetic/bgen/ukb_c#_b0_v3.sample

choishingwan commented 2 months ago

It only replaces the # in the bgen file name, not in the sample file name.

Sam
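A minimal sketch of what this implies for the --target argument, reusing the paths from the comment above: keep the # in the bgen prefix and name one concrete sample file. The variable names are only illustrative, and the choice of the chromosome 1 sample file is an assumption.

```shell
# Directory taken from the example above; adjust to your own layout.
bgen_dir="$HOME/ukb2/UKB_genetic/bgen"

# Keep '#' in the bgen prefix: PRSice substitutes the chromosome number here.
bgen_prefix="${bgen_dir}/ukb_c#_b0_v3"

# '#' is NOT substituted in the sample file name, so point at a real file.
# Using the chromosome 1 sample file is an assumption for illustration.
sample_file="${bgen_dir}/ukb_c1_b0_v3.sample"

# The target is "<bgen prefix>,<sample file>", as in the command above.
target="${bgen_prefix},${sample_file}"
echo "--type bgen --target ${target}"
```

The quotes around each path keep the shell from treating # or spaces specially.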

zhilongjia commented 2 months ago

The thing is, there is one sample file per bgen file. How can I calculate the PRS across all chromosomes? Or, if the PRS has to be calculated per chromosome, how do I merge the per-chromosome results? Thank you.

choishingwan commented 2 months ago

Each sample file should be identical, so a single sample file can be supplied for all chromosomes.

Sam
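If you want to confirm that the per-chromosome sample files really are identical before picking one, a quick check along these lines might help. This is a sketch only: the function name is hypothetical, and it relies on nothing beyond the standard cmp utility.

```shell
# Hypothetical helper: returns 0 if every file given is byte-identical
# to the first one, and 1 (naming the offender on stderr) otherwise.
all_samples_identical() {
    first="$1"
    shift
    for f in "$@"; do
        if ! cmp -s "$first" "$f"; then
            echo "differs from $first: $f" >&2
            return 1
        fi
    done
    return 0
}

# Example call (paths follow the naming used in this thread):
# all_samples_identical "$HOME"/ukb2/UKB_genetic/bgen/ukb_c*_b0_v3.sample
```

If the check passes, any one of the files can be given after the comma in --target.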

chenpunk commented 2 months ago

I have bypassed the issue by using bed files after QC and clumping. Thank you guys for the advice and clarification. I think replacing the # in ukb22828_c#_b0_v3.sample with a number between 1 and 22 should solve my problem. Thanks again!
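For anyone else hitting this, the substitution described above could be sketched as follows. The variable names are illustrative, and choosing chromosome 1 is an assumption; any number between 1 and 22 should work if the sample files are identical.

```shell
# Sample-file pattern from the original command; the directory name
# contains spaces, so the whole path is kept inside double quotes.
sample_pattern="/mnt/project/Bulk/Imputation/UKB imputation from genotype/ukb22828_c#_b0_v3.sample"

# Replace the '#' placeholder with a concrete chromosome number.
chrom=1
sample_file=$(printf '%s\n' "$sample_pattern" | sed "s/#/${chrom}/")

echo "$sample_file"
```

The resulting path is what would go after the comma in --target, while the bgen part keeps its #.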

choishingwan / PRSice

Error when inputting bgen files #354