biocore / qiime

Official QIIME 1 software repository. QIIME 2 (https://qiime2.org) has succeeded QIIME 1 as of January 2018.
GNU General Public License v2.0

compute_core_microbiome.py error #2081

Open mortonjt opened 8 years ago

mortonjt commented 8 years ago

There was a concern on the QIIME forums that compute_core_microbiome.py fails when no observation metadata (i.e. taxonomy) is present.

Should having this metadata be a strict requirement to run? If so, would it be desirable to have this script send an informative error if there is no observation metadata present?

colinbrislawn commented 8 years ago

No. Metadata embedded in the .biom file, including taxonomy, should not be a requirement.

The issue is that the script uses the value of the --otu_md flag whether or not it is passed. When the table has no metadata, --otu_md still defaults to 'taxonomy', and the following lines attempt to pull this non-existent taxonomy from the OTU table: https://github.com/biocore/qiime/blob/master/scripts/compute_core_microbiome.py#L150-L153

Before this loop, the script should check whether the requested metadata column is actually part of the table's observation metadata.
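A minimal sketch of what such a check could look like, not the actual patch. It assumes the observation metadata arrives either as None (no metadata at all) or as a sequence with one dict per observation, which is roughly how a BIOM table exposes it; the function names `has_observation_metadata` and `taxonomy_strings` are hypothetical and do not exist in QIIME.

```python
def has_observation_metadata(observation_metadata, key):
    """Return True only if every observation carries `key` in its metadata.

    Assumption: `observation_metadata` is None when the table has no
    metadata, otherwise a sequence with one dict (or None) per observation.
    """
    if observation_metadata is None:
        return False
    return all(md is not None and key in md for md in observation_metadata)


def taxonomy_strings(observation_ids, observation_metadata, key="taxonomy"):
    """Build one output line per observation.

    Falls back to the bare observation ID when `key` is not available,
    instead of raising on the missing metadata.
    """
    use_md = has_observation_metadata(observation_metadata, key)
    lines = []
    for i, obs_id in enumerate(observation_ids):
        if use_md:
            value = observation_metadata[i][key]
            # Taxonomy is often stored as a list of ranks; join it for output.
            if isinstance(value, (list, tuple)):
                value = "; ".join(value)
            lines.append("%s\t%s" % (obs_id, value))
        else:
            lines.append(str(obs_id))
    return lines
```

With this shape, the default value of --otu_md is only used when the table really has that column; a table without taxonomy produces ID-only output rather than an error.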

Someone assign this to me. I'll fix this and write a unit test to make sure that empty taxonomy works.

Colin

gregcaporaso commented 8 years ago

@colinbrislawn, I'll assign this one to you - just sent you an invitation to join the qiime-developers team on GitHub and I can assign after you accept.

I agree with your idea here. Note that this fix would only end up going into a release if we end up doing a 1.9.2 release, and that will likely only happen if we end up finding a major bug. Just wanted to let you know before you spend time on it.

colinbrislawn commented 8 years ago

Thanks for inviting me to biocore! I'm honored.

I acknowledge this small patch may not see a release because qiime 1.9.1 is in LTS. I'm fine with that. I'll patch this if others have issues.

gregcaporaso commented 8 years ago

Thanks! Assigned to you.

gregcaporaso commented 8 years ago

Note: this also fails when trying to write empty biom tables (e.g., at levels where no core microbiome exists). In general this functionality should be refactored for QIIME 2, and it would be very useful if it also collapsed taxa if present (so it could tell you what the core phyla, classes, etc. are in a single run, along with the core OTUs).
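One way to avoid the empty-table failure is to skip writing at levels where nothing qualifies as core. The sketch below is illustrative only and is not QIIME code: it assumes presence data is a plain dict mapping each observation ID to the set of samples it was seen in, and the names `core_observations` and `core_by_level` are hypothetical. A real fix would apply the same guard before writing each BIOM table.

```python
def core_observations(presence, n_samples, min_fraction):
    """Return sorted IDs of observations seen in >= min_fraction of samples."""
    return sorted(
        obs_id
        for obs_id, samples in presence.items()
        if len(samples) / float(n_samples) >= min_fraction
    )


def core_by_level(presence, n_samples, fractions):
    """Compute the core at each fraction, recording empty levels separately.

    Returns (results, skipped): `results` maps fraction -> core IDs for
    non-empty levels; `skipped` lists fractions with no core microbiome,
    for which no table should be written.
    """
    results = {}
    skipped = []
    for fraction in fractions:
        core = core_observations(presence, n_samples, fraction)
        if core:
            results[fraction] = core
        else:
            skipped.append(fraction)
    return results, skipped


presence = {
    "OTU1": {"s1", "s2", "s3"},
    "OTU2": {"s1", "s2"},
    "OTU3": {"s1"},
}
results, skipped = core_by_level(presence, n_samples=4, fractions=[0.5, 0.75, 1.0])
```

Reporting the skipped levels (rather than silently dropping them) lets the caller tell the user that no core microbiome exists at those thresholds.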