Open LeeBergstrand opened 7 months ago
There should be some of my code that uses it here: https://github.com/Micromeda/pygenprop/tree/master
@jmtsuji Thoughts?
@LeeBergstrand Thanks for this suggestion! I've used scikit-bio before for multivariate stats (e.g., PCoA), but I didn't realize that it had sequence manipulation built in as well. I'll take a look to see if this could replace biopython in this repo.
When I was at Waterloo, I moved my code from Biopython to scikit-bio. I found it much more performant because it stores sequences in C-backed NumPy arrays under the hood (the same approach pandas takes) instead of raw Python objects like Biopython does. So, it runs faster and takes much less memory. I used it in Micromeda and pygenprop to extract sequences from FASTA files and write new FASTA files.
People also find the code is much more stable: https://www.reddit.com/r/bioinformatics/comments/75xugl/scikitbio_why_does_it_exist/
@jmtsuji Do you have any interest in moving your code to scikit-bio? I think it should be quite easy to port over.