Open aguilar-gomez opened 1 year ago
Can you send the ngsLD command you used?
ngsLD --n_threads 10 --geno CMScaffoldsofInterestMaf5.beagle.gz --n_ind 63 --n_sites 1145 --outH CMLDMaf5probs_thre --pos scaffoldsOfInterestmaf5.sites --max_kb_dist 0 --max_snp_dist 0 --extend_out --probs --call_geno --call_thresh .9 --min_maf 0.1 --ignore_miss_data
The problem is that site P_RNA_scaffold_10726:58826 has a lower freq (when calculated through the haplotypes) than --min_maf 0.1. The idea was to exclude LD from sites with very low freq, but I guess it might be better to leave that filtering to the user.
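To illustrate how a very low haplotype-based frequency can lead to nan values, here is a minimal sketch of the textbook D, D' and r2 definitions computed from four haplotype frequencies. It is not ngsLD's actual code, and the function and argument names are made up for this example. It does not reproduce the exact case above; it only shows that when the haplotype-derived allele frequency at one site reaches 0, the denominators of D' and r2 vanish.

```python
import math

def ld_from_haplotypes(p_AB, p_Ab, p_aB, p_ab):
    """Textbook D, D' and r2 from four haplotype frequencies that sum to 1.

    Illustrative only: names are hypothetical and this is not taken from ngsLD.
    """
    # Allele frequencies implied by the haplotype frequencies
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB

    D = p_AB - p_A * p_B

    # r2 is undefined when either site is monomorphic (denominator is 0)
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else math.nan

    # D' normalises D by its maximum possible magnitude given the allele freqs
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else math.nan

    return D, d_prime, r2

def hap_maf(p):
    """Minor allele frequency for an allele frequency p."""
    return min(p, 1 - p)

# Site A has no minor-allele haplotypes in this pair: D' and r2 are nan
print(ld_from_haplotypes(0.0, 0.0, 0.3, 0.7))
```

Because the haplotype frequencies are estimated separately for every pair of sites, the allele frequency they imply can differ from the per-site value used by --min_maf.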
Why does the maf calculated by angsd differ from the haplotype-based calculation? I think it would be useful if that position were completely excluded from the output; having the nan just makes plotting and interpreting harder. Or maybe there could also be a column with the maf calculated from the haplotypes, to understand what is happening.
The positions where maf is lower than --min_maf are excluded, but there is no filter on haplotype freqs because these are calculated pairwise between all pairs of SNPs.
And you do have the haplotype freqs in the extended output (hap_maf1 and hap_maf2).
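Since the filtering on haplotype freqs is left to the user, one way to do it after the run is sketched below. It assumes the output is tab-separated with no header line, and the column indices for r2, hap_maf1 and hap_maf2 are placeholders passed in by the caller (check them against the ngsLD documentation for your version). The function names are hypothetical.

```python
import math

def keep_row(fields, r2_col, hap_maf_cols, min_hap_maf):
    """Return True if r2 is a number and both haplotype MAFs pass the threshold."""
    # float() parses "nan" and "-nan", so math.isnan catches both spellings
    if math.isnan(float(fields[r2_col])):
        return False
    return all(float(fields[c]) >= min_hap_maf for c in hap_maf_cols)

def filter_ld_lines(lines, r2_col, hap_maf_cols, min_hap_maf=0.1):
    """Keep only the lines that pass keep_row.

    r2_col and hap_maf_cols are 0-based column indices supplied by the caller;
    they are NOT taken from the ngsLD documentation.
    """
    kept = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if keep_row(fields, r2_col, hap_maf_cols, min_hap_maf):
            kept.append(line)
    return kept

# Toy rows with a made-up layout: site1, site2, r2, hap_maf1, hap_maf2
rows = [
    "s1:10\ts1:20\t0.80\t0.25\t0.30\n",
    "s1:10\ts1:30\tnan\t0.02\t0.30\n",
    "s1:20\ts1:30\t0.10\t0.05\t0.30\n",
]
for row in filter_ld_lines(rows, r2_col=2, hap_maf_cols=(3, 4)):
    print(row, end="")
```

With the toy rows above, only the first row is kept: the second has nan for r2 and the third has a haplotype MAF below 0.1.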
Hi,
I ran ngsLD and am interested in the r2 from the EM algorithm. When I look at the values for some positions, D is negative and Dprime and r2 are nan. Do you know what is happening or how to fix it? I printed the extended output to see if that helps.