Open vlrieg opened 3 years ago
Hi Val! Thanks for the suggestion. I'd love to implement Tajima's D at some point. Consider it "on the list". The issue with the scikit-allel function is that it likely uses its native implementation of pi (which is a component of the numerator of D), which we know can be inaccurate in many cases (see pixy paper). So we'd probably have to come up with something on our own, and then validate it using theory/sims. This might be a good starter project for a bioinformatically inclined student, actually. I'll leave this issue open as a reminder 👍
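For anyone following along, here is a minimal sketch of the classical Tajima's D (Tajima 1989) for complete data, to show where pi enters the numerator. It is not pixy's code, not scikit-allel's, and not a missing-data-aware method; the function names `pairwise_pi` and `tajimas_d` are made up for illustration, and it assumes every site is genotyped in all `n` sampled chromosomes.

```python
import math


def pairwise_pi(alt_counts, n):
    """Sum of per-site pairwise differences for biallelic sites.

    alt_counts: alternate-allele count at each site (hypothetical input format).
    n: number of sampled chromosomes; assumed identical at every site,
       i.e. no missing data.
    """
    pairs = n * (n - 1) / 2
    return sum(k * (n - k) for k in alt_counts) / pairs


def tajimas_d(pi, num_seg_sites, n):
    """Classical Tajima's D from pi (summed over sites), S, and sample size n."""
    s = num_seg_sites
    if s == 0 or n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        return float("nan")

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Numerator: pi minus Watterson's theta (S / a1).
    numerator = pi - s / a1
    variance = e1 * s + e2 * s * (s - 1)
    return numerator / math.sqrt(variance)


if __name__ == "__main__":
    # Toy data: 5 segregating sites in 8 chromosomes, no missing genotypes.
    counts = [1, 1, 2, 4, 7]
    n = 8
    pi = pairwise_pi(counts, n)
    print(tajimas_d(pi, len(counts), n))
```

Because pi sits directly in the numerator, any bias in pi from missing genotypes carries straight through to D, which is why the sketch above would need to be replaced by something that tracks per-site sample sizes before it could be used on real VCF data.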
Hi @ksamuk, I implemented a version of Tajima's D that accounts for missing data. I tested it with the same scripts you used for the pixy publication, so hopefully it is a robust demonstration. The method uses the equations of Ferretti et al. 2012 (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3416018/). I have not integrated this into the pixy scripts, so there is no PR at the moment. If you think the method is OK, I can work on that. Third_pass_tajimas_d.pdf
Hi,
I was wondering whether there has been any update on this. Are there any plans to include Tajima's D in future releases?
Hi @MilesLuca, I haven't made an effort to integrate the code into the pixy code base. I am waiting to hear from @ksamuk whether this addition meets the standard and is wanted.
Hi Kieran,
Inspired by this previous feature request for Hudson's FST, I'd love to see Tajima's D implemented in pixy so I can do all my summary stat calculations with the same (excellent) tool. It looks like scikit-allel has this worked out for both windowed and single-region calculations: https://scikit-allel.readthedocs.io/en/stable/stats/diversity.html?#allel.tajima_d
Thanks for considering! Val