scikit-bio / scikit-bio

scikit-bio: a community-driven Python library for bioinformatics, providing versatile data structures, algorithms and educational resources.
https://scikit.bio
BSD 3-Clause "New" or "Revised" License

what does QIIME import from PyCogent? these are priorities for clean-up and port to bipy, imo #39

Closed: gregcaporaso closed this issue 10 years ago

gregcaporaso commented 10 years ago

@rob-knight put these notes together. We should use this document to prioritize porting functionality. Please request write access to the doc if you'd like to add notes, etc.

gregcaporaso commented 10 years ago

And the most common imports from cogent in QIIME are (may be a little messy):

wasade commented 10 years ago

When MinimalFastqParser is moved to this repo, I believe we should update it to handle CASAVA so that the fastq data it yields are sane as early as possible. It currently yields the raw ASCII-encoded quality string, which means you have to think a bit more about how to get the data into a format you can actually use, and this has already led to qual data from fastq being handled differently in different places. MinimalFastqParser also differs in this respect from MinimalQualParser, which yields a numpy array. Decoding in the parser adds a trivial amount of overhead but will likely not impact performance noticeably. I think the conceptual benefits outweigh any performance concerns here, and if there are concerns, we can push the parser to Cython.
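As a rough illustration of the idea above (not the actual MinimalFastqParser code), here is a minimal sketch of a parser that decodes quality strings into numpy arrays before yielding them. The function name `parse_fastq` and the `phred_offset` parameter are hypothetical. The sketch assumes strict four-line records, with an offset of 33 for CASAVA 1.8 and later and 64 for the earlier Illumina 1.3 to 1.7 pipelines.

```python
import numpy as np


def parse_fastq(lines, phred_offset=33):
    """Yield (seq_id, sequence, phred_scores) from four-line FASTQ records.

    Hypothetical sketch: phred_offset is 33 for CASAVA >= 1.8 output and
    64 for older Illumina 1.3-1.7 output. Multi-line records are not handled.
    """
    record = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record.append(line)
        if len(record) < 4:
            continue

        header, seq, plus, qual = record
        record = []
        if not header.startswith('@') or not plus.startswith('+'):
            raise ValueError("Malformed FASTQ record: %r" % header)
        if len(seq) != len(qual):
            raise ValueError("Sequence/quality length mismatch: %r" % header)

        # Decode the ASCII quality string into integer Phred scores.
        raw = np.frombuffer(qual.encode('ascii'), dtype=np.uint8)
        scores = raw.astype(np.int16) - phred_offset
        if scores.size and scores.min() < 0:
            raise ValueError("Negative Phred score; wrong offset? %r" % header)

        yield header[1:], seq, scores

    if record:
        raise ValueError("Truncated FASTQ record at end of input")


# Usage: '!' -> 0, 'I' -> 40, '5' -> 20, '#' -> 2 under Phred+33.
example = ["@read1", "ACGT", "+", "!I5#"]
for seq_id, seq, scores in parse_fastq(example):
    print(seq_id, seq, scores.tolist())  # read1 ACGT [0, 40, 20, 2]
```

With this shape, downstream code gets the same array type from fastq as it would from a qual file, at the cost of one small array allocation per record.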

gregcaporaso commented 10 years ago

This discussion has shifted to the Boulder sprint spreadsheet.

mortonjt commented 8 years ago

Is there still interest in having UPGMA?

ebolyen commented 8 years ago

No, SciPy already implements it as scipy.cluster.hierarchy.average. Additionally, the PyCogent UPGMA that QIIME uses is actually subtly incorrect; we explored this early on in skbio: https://github.com/biocore/qiime/issues/1541.

mortonjt commented 8 years ago

Ahhh, I did not see TreeNode.from_linkage_matrix. Thanks!!!
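For readers landing on this thread, a minimal sketch of the SciPy route mentioned above: UPGMA is average-linkage clustering, so `scipy.cluster.hierarchy.average` applied to a condensed distance matrix gives the merge order and heights. The taxon names and distances below are made up for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import average
from scipy.spatial.distance import squareform

# Made-up symmetric distance matrix for four taxa (illustrative values only).
ids = ['a', 'b', 'c', 'd']
dm = np.array([
    [0.0,  2.0,  6.0, 10.0],
    [2.0,  0.0,  6.0, 10.0],
    [6.0,  6.0,  0.0,  8.0],
    [10.0, 10.0, 8.0,  0.0],
])

# SciPy's linkage functions expect the condensed (upper-triangle) form.
condensed = squareform(dm)

# Each row of the linkage matrix is
# [cluster_i, cluster_j, merge_distance, size_of_new_cluster].
linkage_matrix = average(condensed)
print(linkage_matrix)
```

Here `a` and `b` merge first at distance 2, `c` joins them at 6, and `d` joins last at 28/3 (the mean of 10, 10 and 8). If scikit-bio is installed, the result can be turned into a tree with `TreeNode.from_linkage_matrix(linkage_matrix, ids)`.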