Open adeflamingh opened 2 years ago
I'd try running it again with other seeds and, if that does not work, set --min_epsilon 1e-9
.
Thank you very much for your suggestions! I tried a different seed, and also tried --min_epsilon 1e-9 but I get the same error. Any other fixes?
What I used:
"ngsF-HMM --n_ind 55 --n_sites 20563944 --geno ../gl_55.beagle.gz --lkl --pos ../gl_55.nohead.pos --min_epsilon 1e-9 --out output/file --n_threads 8 --seed 8376"
What I get:
Header found! Skipping line... ==> Input Arguments: geno: ../gl_55.beagle.gz pos: ../gl_55.nohead.pos lkl: true loglkl: false n_ind: 55 n_sites: 20563944 call_geno: false freq: r freq_est: 1 e_prob: 1 indF: 0.01-0.001 indF_fixed: false alpha_fixed: false out: output/file log: 0 log_bin: false min_iters: 10 max_iters: 100 min_epsilon: 0.0000000010 n_threads: 8 verbose: 1 seed: 8376 version: 1.1.0 (Mar 14 2022 @ 11:09:52)
==> GZIP input file (not BINARY) ==> Reading data
Sites coordinates GENO data ==> Setting initial inbreeding values to: 0.01-0.001 ==> Using random initial frequency values. ==> Calculating initial emission probabilities
Iteration 1: ==> Forward Recursion ==> Backward Recursion Ind 0: -20723821.353790059685707 -20723821.352281708270311 (0.001508351415396)
: Numerical result out of range
Then it looks like a precision issues due to large datasets, which seems to be the case since you have over 20M snps.
You can try to run on just a single chromosome, to check if that is the case.
Alternatively, you can try a "gentle" LD prunning (e.g. using ngsLD
) just to remove sites that are in highly linked.
Let me know if it helped.
Hello, I'm trying to run ngs-HMM using GL files that were previously generated with ANGSD (this was the ANGSD command):
angsd -GL 1 -out gl_55 -nThreads 10 -doGlf 2 -doMajorMinor 1 -doMaf 2 -SNP_pval 1e-2 -minInd 28 -b bam.filelist -ref refXX.fasta
I have a beagle.gz and mafs.gz file, and a total number of 20563944 sites.
I saved the first two columns from the mafs file and removed the header to create the .pos file needed (cat X.mafs.gz | awk '{print $1,$3}' > X.pos, then tail -n +1 X.pos > X.nohead.pos)
I checked that the beagle and the no.head.pos files have 20563944 respectively using zcat X.beagle.gz | wc -l (this was 20563945, because of the header). and cat X.nohead.pos | wc -l)
Then I tried to use ngsF-HMM to calculate stats (assigning 260G memory and 8 threads), using:
ngsF-HMM --n_ind 55 --n_sites 20563944 --geno ../gl_55.beagle.gz --lkl --pos ../gl_55.nohead.pos --min_epsilon 1e-7 --out output/file --n_threads 8
I get following error:
Any recommendations on what I might be doing wrong? Thank you very much in advance!