snunzi closed this issue 7 years ago
Could you give me some details, such as which allele module you are using (short, long, etc.), the estimated number of individuals, chromosomes, and loci, and the fraction of monomorphic sites? If you only have two alleles, the binary module could help; if you have mostly rare variants, the mutant module could help. To switch between modules, you can save the population and load it in another module.
Also, you could potentially skip the filtering process by exporting only the genotypes at specified loci...
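For anyone reading along, the module switch mentioned above might look roughly like the sketch below. It assumes the population was already written to disk from the original script with `pop.save(...)`; the function name and the default argument are made up for illustration. The allele module is selected through `simuOpt.setOptions` before `simuPOP` is imported, so this has to run in a fresh Python process rather than in the session that created the population.

```python
def load_population_in_module(path, allele_type="binary"):
    """Load a saved simuPOP population using the requested allele module.

    Hypothetical helper: call it in a fresh Python process, before simuPOP
    has been imported, because the allele module is fixed at import time.
    """
    import simuOpt
    # Select the allele module (e.g. 'binary' or 'mutant') before the import.
    simuOpt.setOptions(alleleType=allele_type, quiet=True)
    import simuPOP as sim
    # Read the population previously written with pop.save(path).
    return sim.loadPopulation(path)
```

In the original script you would call something like `pop.save('mypop.pop')` (the file name is only an example), then start a new script and call `load_population_in_module('mypop.pop')` there.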
Binary mode helped a lot, thank you!
Hi Bo, I am currently filtering monomorphic sites out of my simulated sequences before export, and it is extremely slow because of the large number of individuals and the very large amount of sequence data. I have been using the code below, which works for small data sets but not for my larger ones. Do you have any suggestions for filtering out monomorphic sites more efficiently? Many thanks.
-Schyler
    thresh_hi = 0.999999
    thresh_lo = 0.000001
    lociToRemove = [l for l in xrange(pop.totNumLoci())
                    if pop.dvars().alleleFreq[l][0] > thresh_hi
                    or pop.dvars().alleleFreq[l][0] < thresh_lo]
    pop.removeLoci(lociToRemove)
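One small change that may be worth trying, independent of the allele module: the snippet above calls `pop.dvars()` up to twice per locus, so looking up the frequency dictionary once and reusing it avoids that repeated work. The sketch below keeps the same threshold logic but takes the frequencies as a plain argument; the function name and the toy frequencies are invented for illustration, and whether this lookup is the actual bottleneck has not been measured here.

```python
def monomorphic_loci(allele_freq, num_loci, thresh_lo=0.000001, thresh_hi=0.999999):
    """Return indexes of loci whose allele-0 frequency is above thresh_hi
    or below thresh_lo (same test as the original list comprehension)."""
    loci = []
    for locus in range(num_loci):
        # Read the frequency once per locus; a missing allele 0 counts as 0.
        freq0 = allele_freq[locus].get(0, 0.0)
        if freq0 > thresh_hi or freq0 < thresh_lo:
            loci.append(locus)
    return loci


# Toy frequencies standing in for pop.dvars().alleleFreq (made-up values).
toy_freq = {
    0: {0: 1.0},            # fixed for allele 0
    1: {0: 0.4, 1: 0.6},    # polymorphic
    2: {1: 1.0},            # fixed for allele 1
    3: {0: 0.75, 1: 0.25},  # polymorphic
}
to_remove = monomorphic_loci(toy_freq, num_loci=4)
```

With a real population, and assuming allele frequencies have already been computed (for example with `sim.stat(pop, alleleFreq=sim.ALL_AVAIL)`), the call would be `pop.removeLoci(monomorphic_loci(pop.dvars().alleleFreq, pop.totNumLoci()))`. Note that testing only allele 0 matches the two-allele case; with more than two alleles a locus can have a near-zero allele-0 frequency and still be polymorphic.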