Closed mpkol59 closed 3 weeks ago
Hi - I'd try restricting the job to a single node and a single thread. Long running time is often caused by your job interfering with other jobs running on the same node.
export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1
The running time is independent of the GWAS sample size. For chromosome 22, the computation should be finished in around 30 min.
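For reference, a minimal single-node, single-thread SLURM header along these lines might look like the sketch below. This is only an illustrative config fragment: the account name and time limit are placeholders, not values from the original post.

```shell
#!/bin/bash
#SBATCH -A depot          # account (placeholder)
#SBATCH -N 1              # a single node
#SBATCH -n 1              # a single task
#SBATCH -c 1              # a single CPU core
#SBATCH --time=02:00:00   # chr22 should finish in ~30 min; 2 h gives headroom

# Pin all numerical libraries to a single thread so the job
# does not oversubscribe cores shared with other jobs.
export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1
```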
Thank you for your swift response.
Hello @getian107, thank you very much for your great work on this software. I have been experiencing long running times with PRS-CSx. I am using African and European GWAS summary statistics, with AFR as the LD reference and an AFR genotype dataset for the bim file. The EUR sample size is large, but I limited the run to chromosome 22 alone, thinking this would make the job finish faster. However, it has now been running for days and is stuck at the MCMC iterations. Do you think I am missing something in terms of optimizing the usage? My code is below; I would appreciate and look forward to your response.
#!/bin/bash
#SBATCH -A depot
#SBATCH -N 2
#SBATCH -n 50
#SBATCH --time=14-00:00:00
#SBATCH --job-name afrPRScsx

export MKL_NUM_THREADS=10
export NUMEXPR_NUM_THREADS=10
export OMP_NUM_THREADS=10

cd /path/path

module load anaconda
module load use.own
module use PRScsx

for i in {22..22}; do
    python PRScsx/PRScsx.py \
        --ref_dir=/path/ldblk_ukbb_afr \
        --bim_prefix=/path/cafr_chr$i \
        --sst_file=eur_chr22.txt,afr_chr22.txt \
        --n_gwas=233204,31317 \
        --pop=EUR,AFR \
        --chrom=$i \
        --phi=1e-2 \
        --seed=999 \
        --out_dir=/path/path \
        --out_name=AFR_EURCSX22chr$i
done
Best wishes.