Rosemeis / pcangsd

Framework for analyzing low depth NGS data in heterogeneous populations using PCA.
GNU General Public License v3.0

How to resolve an apparent memory issue in PCAngsd #63

Open BKrek89 opened 2 years ago

BKrek89 commented 2 years ago

Hello,

I'm trying to use PCAngsd for PCA and admixture analysis in a low-coverage WGS project. However, I keep running into the error message below.

61648 Bus error python3 /home/rek89/miniconda3/pkgs/pcangsd-0.98.2-py36h39e3cac_1/pcangsd/pcangsd.py -threads 10 -beagle sodalissnp.beagle.gz -admix -admix_save -admix_alpha 1 -o sodalissnps.admix

slurmstepd: error: Detected 1 oom-kill event(s) in step 20689728.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

The dataset is pretty large, containing 314 samples at ~3x coverage. The same command works fine when I run it on a smaller dataset from a separate project. The HPC I work on has a max of 250000 for compute nodes, and that's what I have it set to. However, I'm new to computing, so there could easily be something I'm missing.

Is there any way to get around this issue?

Thanks! -B

Rosemeis commented 2 years ago

Hi,

Sorry for the late reply. Have you performed any filtering on minor allele frequencies (in ANGSD) prior to running PCAngsd? And the 250000 you are referring to, is it the max memory in megabytes? If so, then it should not be a problem at all to run PCAngsd. :-)

Best, Jonas
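To put the 250000 MB figure in context, a back-of-the-envelope estimate of the genotype-likelihood matrix size can help. The sketch below is not taken from PCAngsd. It assumes three likelihood values per sample per site (as in the Beagle input) held as 32-bit floats, and the function name `estimate_gl_memory_gib` is hypothetical. Actual usage depends on the PCAngsd version and will be higher once working arrays are included.

```python
def estimate_gl_memory_gib(n_sites, n_samples, values_per_sample=3, bytes_per_value=4):
    """Rough size in GiB of a dense genotype-likelihood matrix.

    Assumption (not confirmed by the thread): every site stores
    `values_per_sample` likelihoods per sample, each taking
    `bytes_per_value` bytes (4 bytes = 32-bit float).
    """
    total_bytes = n_sites * n_samples * values_per_sample * bytes_per_value
    return total_bytes / 2**30
```

For example, `estimate_gl_memory_gib(10_000_000, 314)` gives roughly 35 GiB for the 314 samples mentioned above and a hypothetical 10 million sites. The real site count has to come from the Beagle file itself.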

bishopia commented 2 years ago

I'm having a similar problem. I applied a MAF filter prior to running PCAngsd. Is that a problem?

Rosemeis commented 2 years ago

Sorry for the late reply. No, that should not be a problem. How many samples and sites do you have? :-)
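One way to answer the samples-and-sites question is to read the counts straight from the Beagle file. The sketch below assumes the usual ANGSD Beagle genotype-likelihood layout: a tab-separated header with three leading columns (marker, allele1, allele2) followed by three likelihood columns per sample, and one line per site. The function name `beagle_dimensions` is hypothetical and not part of PCAngsd.

```python
import gzip


def beagle_dimensions(path):
    """Return (n_samples, n_sites) for a gzipped Beagle likelihood file."""
    with gzip.open(path, "rt") as handle:
        header = handle.readline().rstrip("\n").split("\t")
        # Assumed layout: 3 leading columns, then 3 likelihood columns per sample.
        n_samples = (len(header) - 3) // 3
        # Every remaining non-empty line is one site; stream them to keep memory low.
        n_sites = sum(1 for line in handle if line.strip())
    return n_samples, n_sites
```

Calling `beagle_dimensions("sodalissnp.beagle.gz")` on the file from the original command would return the two numbers asked for here. The file is streamed line by line, so the check itself needs very little memory.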