biocore / qiime

Official QIIME 1 software repository. QIIME 2 (https://qiime2.org) has succeeded QIIME 1 as of January 2018.
GNU General Public License v2.0

implementation and integration of core data classes #1329

Open jairideout opened 10 years ago

jairideout commented 10 years ago

We need classes for many of the core types of data dealt with in QIIME. A lot of these classes may end up in scikit-bio or biom-format, but it'll be useful to get a list started here.

Once implemented, these new classes should first be used in the core QIIME scripts that are becoming pyqi-ized (see #1327) for the 1.9.0 release.

This issue takes precedence over #1327.

Initial list:

jairideout commented 10 years ago

Related to #1322.

josenavas commented 10 years ago

#1169 is related to the ObservationMap item. Designing the in-memory and on-disk data structures at the same time can help avoid performance issues.

As I pointed out in previous discussions, I think the ObservationMap should be a new standard format, like biom. That way, new clusterers can adopt it as their default output, which reduces our overhead for including them in QIIME. We can also provide C/C++ and Python parsers to make it easier for the community to use. For example, we can design it to support parallel access, most importantly on disk, but also in memory.
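To make the idea concrete, here is a minimal sketch of what an in-memory ObservationMap with a matching on-disk form could look like. Everything in it is an assumption for illustration: the class layout, the method names (`add`, `read`, `write`), and the tab-delimited layout (observation ID followed by its member sequence IDs, one cluster per line, modeled on the existing QIIME OTU map files). It is not a proposed final design and does not address parallel access.

```python
import io
from collections import OrderedDict


class ObservationMap(object):
    """Hypothetical mapping of observation (e.g. OTU) IDs to sequence IDs."""

    def __init__(self):
        # Preserve insertion order so a read/write round trip is stable.
        self._clusters = OrderedDict()

    def add(self, obs_id, seq_ids):
        """Register a new observation and the sequences assigned to it."""
        if obs_id in self._clusters:
            raise ValueError("Duplicate observation ID: %s" % obs_id)
        self._clusters[obs_id] = list(seq_ids)

    def __getitem__(self, obs_id):
        return self._clusters[obs_id]

    def __len__(self):
        return len(self._clusters)

    @classmethod
    def read(cls, fh):
        """Parse the assumed layout: obs_id<TAB>seq_id<TAB>seq_id..."""
        obs_map = cls()
        for line in fh:
            line = line.rstrip('\r\n')
            if not line:
                continue  # skip blank lines
            fields = line.split('\t')
            obs_map.add(fields[0], fields[1:])
        return obs_map

    def write(self, fh):
        """Serialize one observation per line in the same assumed layout."""
        for obs_id, seq_ids in self._clusters.items():
            fh.write('\t'.join([obs_id] + seq_ids) + '\n')


# Round trip through an in-memory file object.
text = "otu0\ts1_1\ts2_3\notu1\ts1_2\n"
obs_map = ObservationMap.read(io.StringIO(text))
out = io.StringIO()
obs_map.write(out)
```

Keeping `read` and `write` on the same class is one way to design the in-memory and on-disk structures together, so that changes to one are checked against the other.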

What do others think? I'd like to work on this and start meeting with some people to design it.

jairideout commented 10 years ago

Also related to #1163.