Open isaacovercast opened 8 years ago
Cool! I've been wrestling with whether or not we should use dask, and also hdf5, but if scikit-allel uses both and we're going to use that then I'll feel free to use either in the code. I had been thinking that both might be particularly useful for reconstructing all of the coverage data into a vcf file in step 7.
Having popgen stats functions would be great. Some ideas that come to mind:
An ipyrad.analysis module that uses scikit-allel to calculate RAD-specific values. This would be nice because it provides simpler stats functions for users who do not want to spend time learning scikit-allel, numpy arrays, hdf5, etc.
Hi guys, just wondering if this was ever implemented. Thanks! iPyrad rocks!!
Hello @vwishingrad, yeah we haven't implemented this yet. It's still a good idea though! It just hasn't happened yet.... Thanks for the positive feedback! Glad you like ipyrad. Tell your friends! Hopefully we will implement the sumstats analysis tool some time in the near future.... Thanks again for your feedback.
Well, now there's an ipyrad.analysis.popgen tool. It's not perfect but it's a start. This will give you a bunch of information, still working on organizing it better:
workdir
per population for per site pi values within each locus
TODO:
Hi, is it possible to use this tool at the moment? If so, how would one go about doing it? Thanks! This would be an amazing tool once fully implemented!
Over the past couple of days I've gotten pretty handy with scikit-allel, a nice Python package for generating simple popgen summary stats. I think it would be relatively straightforward to include a block in step 6 to write these out to a file/dir. If we have population assignment files then we can even do Fst and Dxy. Many people I have talked to have requested this feature, so I think it could add a lot of value for folks, with lots of bang for the buck.
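For anyone wondering what the per-locus numbers would look like, here is a rough pure-Python sketch of per-site pi and Dxy averaged over one locus. It is not the ipyrad.analysis.popgen code and it does not call scikit-allel. The input layout (a dict of sample name to aligned sequence, plus a dict of population name to sample names standing in for a population assignment file) and all function names are made up for illustration. Each sequence is treated as a single haploid copy, and "N" and "-" count as missing data. scikit-allel has array-based functions for the same kind of statistic (for example allel.mean_pairwise_difference and allel.mean_pairwise_difference_between), which would be the better choice on real data.

```python
import math

# Assumption: these two characters mark missing data in the alignment.
MISSING = {"N", "-"}


def allele_counts(column):
    """Count the non-missing bases observed at one site."""
    counts = {}
    for base in column:
        if base not in MISSING:
            counts[base] = counts.get(base, 0) + 1
    return counts


def site_pi(column):
    """Fraction of sample pairs within one population that differ at a site."""
    counts = allele_counts(column)
    n = sum(counts.values())
    if n < 2:
        return float("nan")  # not enough data to form a pair
    n_pairs = n * (n - 1) / 2
    n_same = sum(c * (c - 1) / 2 for c in counts.values())
    return (n_pairs - n_same) / n_pairs


def site_dxy(column_a, column_b):
    """Fraction of between-population sample pairs that differ at a site."""
    counts_a = allele_counts(column_a)
    counts_b = allele_counts(column_b)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    if n_a == 0 or n_b == 0:
        return float("nan")
    n_same = sum(c * counts_b.get(base, 0) for base, c in counts_a.items())
    return 1 - n_same / (n_a * n_b)


def nanmean(values):
    """Mean over sites, skipping sites that had too little data."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")


def locus_stats(seqs, pops, pop_a, pop_b):
    """Per-locus pi for two populations and Dxy between them."""
    cols_a = list(zip(*(seqs[name] for name in pops[pop_a])))
    cols_b = list(zip(*(seqs[name] for name in pops[pop_b])))
    return {
        "pi_" + pop_a: nanmean([site_pi(c) for c in cols_a]),
        "pi_" + pop_b: nanmean([site_pi(c) for c in cols_b]),
        "dxy": nanmean([site_dxy(a, b) for a, b in zip(cols_a, cols_b)]),
    }


# Toy locus: four sites, one of them variable (hypothetical data).
seqs = {
    "a1": "ACGT", "a2": "ACGA", "a3": "ACGT", "a4": "ACGA",
    "b1": "ACGA", "b2": "ACGA",
}
pops = {"north": ["a1", "a2", "a3", "a4"], "south": ["b1", "b2"]}
stats = locus_stats(seqs, pops, "north", "south")
print(stats)
```

Averaging over every site in the locus, invariant sites included, is one choice among several. Dividing by the number of variable sites only would give different values, so whatever gets written to file should say which denominator was used.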