Closed hershwin closed 3 years ago
Hi hershwin,
This is very likely a memory or walltime problem. If you are running on a server, what resources (memory, run time, threads) did you give PRSice?
If giving it more run time, more threads and more memory doesn't solve the problem, could you download the latest code and compile it on your server with -march=native to see if that improves the situation?
(You will need a g++ that supports at least C++17.)
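One rough way to tell a resource kill from an ordinary error is to look at the exit status of the PRSice command. The sketch below is only an illustration: `describe_exit_status` is a hypothetical helper, not part of PRSice. It relies on the convention in bash and most sh implementations that a command killed by signal N is reported with status 128 + N.

```shell
#!/bin/sh
# Hypothetical helper (not part of PRSice): describe a shell exit status.
# Assumes the common convention that a command killed by signal N
# is reported by the shell as status 128 + N.
describe_exit_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "finished normally"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with error code $status"
    fi
}

# Usage sketch, run right after your own PRSice command:
#   ./PRSice_linux <your options>; describe_exit_status $?
# Signal 9 (SIGKILL) is what the Linux out-of-memory killer sends, and
# batch schedulers commonly send SIGTERM (15) and then SIGKILL when a
# memory or time limit is exceeded.
describe_exit_status 137
```

If the status points to signal 9 or 15, the job was probably stopped from outside (memory or walltime limit) rather than by PRSice itself. The scheduler's own accounting output is the more reliable source if you have access to it.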
git clone https://github.com/choishingwan/PRSice.git
cd PRSice
mkdir build
cd build
cmake -D march=ON ../
make
Please let me know if that solves the problem. Otherwise, I might have to revisit the full code to see what's going on, as I don't think PRSice should fail with your sample size and number of SNPs, given that you've already given it more than 200G of memory (based on previous threads).
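Before running the build commands above, it may be worth checking that the compiler on the server is recent enough. The following is a minimal sketch: `gxx_major_at_least` is a hypothetical helper, and the cut-off of GCC 7 is an assumption (that release is generally taken as the first with full C++17 language support). The PRSice build may need something newer.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the major part of a GCC version string
# is at least the required major version.
gxx_major_at_least() {
    version=$1
    required=$2
    major=${version%%.*}   # "9.3.0" -> "9"; a bare "9" stays "9"
    [ "$major" -ge "$required" ]
}

# On a real system the version string would come from the compiler:
#   gxx_major_at_least "$(g++ -dumpversion)" 7
# The cut-off 7 is an assumption, not a documented PRSice requirement.
if gxx_major_at_least "9.3.0" 7; then
    echo "9.3.0: new enough"
fi
if ! gxx_major_at_least "4.8.5" 7; then
    echo "4.8.5: too old"
fi
```

If the default g++ is too old, many clusters provide newer compilers through an environment module system. The module names are site specific.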
Hi Sam, I have been struggling to use PRSice on BGEN files for a while now - I asked you some questions in previous threads (https://github.com/choishingwan/PRSice/issues/214 and https://github.com/choishingwan/PRSice/issues/221).
You recommended that I first convert bgen files to plink files before running PRSice. However, I am now getting the error 'execution halted' during the clumping process:
My log file: (I am crossing out the directory path)
PRSice 2.3.3 (2020-08-05)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2020-11-04 21:13:01
./PRSice_linux \
    --a1 ALLELE1 \
    --a2 ALLELE0 \
    --allow-inter \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base /n/groups/xx/xx/xx/xx/fev1_inst1.stats.bgen \
    --base-info INFO:0.9 \
    --binary-target F \
    --bp BP \
    --chr CHR \
    --clump-kb 250kb \
    --clump-p 1.000000 \
    --clump-r2 0.100000 \
    --cov /n/groups/xx/xx/xx/xx/phenotypes_all.tab \
    --cov-col @pc[1-40],f.50.0.0,agesq,heightsq,assesment_center,f.31.0.0 \
    --cov-factor assesment_center,f.31.0.0 \
    --extract /n/groups/xx/xx/xx/xx/PRSice/chrall_bgen.valid \
    --interval 5e-05 \
    --lower 5e-08 \
    --num-auto 22 \
    --out /n/xx/xx/xx/xx/xx/bgen \
    --pheno /n/groups/xx/xx/xx/xx/phenotypes_all.tab \
    --pheno-col fev1_inst1 \
    --pvalue P_BOLT_LMM_INF \
    --seed 3853770559 \
    --snp SNP \
    --stat BETA \
    --target /n/xx/xx/xx/xx/bgen_converted/ukb# \
    --thread 22 \
    --upper 0.5
Warning: Intermediate not required. Will not generate intermediate file
Initializing Genotype file: /n/xx/xx/xx/xx/bgen_converted/ukb# (bed)
Start processing fev1_inst1.stats
==================================================
SNP extraction/exclusion list contains 5 columns, will assume first column contains the SNP ID
Base file: /n/groups/xx/xx/xx/xx/fev1_inst1.stats.bgen
Header of file is:
SNP CHR BP GENPOS ALLELE1 ALLELE0 A1FREQ INFO CHISQ_LINREG P_LINREG BETA SE CHISQ_BOLT_LMM_INF P_BOLT_LMM_INF
19400443 variant(s) observed in base file, with:
8480089 variant(s) excluded based on user input
10920354 total variant(s) included from base file
Loading Genotype info from target
==================================================
487409 people (223006 male(s), 264318 female(s)) observed
487409 founder(s) included
82165946 variant(s) not found in previous data
9323 variant(s) with mismatch information
10920354 variant(s) included
Phenotype file: /n/groups/xx/xx/xx/xx/phenotypes_all.tab
Column Name of Sample ID: FID+IID
Note: If the phenotype file does not contain a header, the column name will be displayed as the Sample ID which is expected.
There are a total of 1 phenotype to process
Start performing clumping
Clumping Progress: 0.01%
Error: Execution halted
Do you know why this may be? Thanks for your help!