Closed (anmwinter closed 10 years ago)
Ara,
I don't have any way to reproduce the errors you're describing. There are lots of internal checks when phyloseq produces a new phyloseq data object, especially that OTU and sample indices match. You are ultimately responsible for checking that your data appears the way that it should, so you should check that the labels and sums of counts make sense. There are many functions in phyloseq to help you explore this, too many to list here. See the index of available functions in phyloseq. I typically start with things like
taxa_sums
sample_sums
sample_variables
rank_names
But there are many others.
I'm sorry that the sample data that you added to the biom file was not deemed valid during import. This has been an ongoing issue with QIIME output to the biom format, and there's not much I can do about it. As far as I can tell there is no problem with the biom-format importer for R. Furthermore, since QIIME didn't include the sample data automatically, you might as well skip the step where you use a Python script to add the sample data "after the fact". Just import the sample mapping file with phyloseq, as you already did above.
The taxonomy warnings are from incomplete, missing, or wrong taxonomy entries in some of your data. The importer expects to find greengenes-formatted taxonomy entries, and complains if it doesn't. This doesn't mean there is anything wrong if you expect to have some missing entries. In some cases, people use the wrong parsing function to process the taxonomy entries, and so warnings of this kind during import are useful. For example, if the number of warnings equaled the number of OTUs, you would know for sure you had a problem.
It looks like you mainly wanted confirmation that you have done things "correctly". It looks fine to me so far.
joey
This is an old one, but I think my comment might be helpful for others.
I had the same error when I imported a biom file into phyloseq using the following command:
import_biom(BIOMfilename=BIOMfilename, treefilename = treefilename, parseFunction=parse_taxonomy_greengenes)
I got the following warning:
There were 47 warnings (use warnings() to see them)
Then I quickly checked the biom file with the following bash command:
sed -re 's/\{/\n{/g' otu_table_filtered.biom |grep 'taxonomy'|grep 'Unassigned'|wc -l
The output of that command was 47, the same as the number of warnings. So it must be those Unassigned entries that are triggering the warnings during import.
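For anyone repeating this check, here is a minimal sketch of a variant that avoids splitting on `{`. It assumes the biom file is the JSON (BIOM 1.0-style) flavor, that each OTU row carries a "taxonomy" array, and that unclassified OTUs are labeled "Unassigned"; the tiny file written below is hypothetical and only stands in for a real table. As far as I know, the sed one-liner above relies on GNU sed treating `\n` in the replacement as a newline, so this version sticks to plain grep. It also prints the total number of taxonomy entries, which is useful for Joey's rule of thumb: if the warning count equals the total OTU count, the parsing function is probably wrong.

```shell
# Hypothetical miniature of a JSON (BIOM 1.0-style) table; real files are far larger.
biom_file=$(mktemp)
cat > "$biom_file" <<'EOF'
{"rows": [{"id": "OTU_1", "metadata": {"taxonomy": ["k__Bacteria", "p__Firmicutes"]}},{"id": "OTU_2", "metadata": {"taxonomy": ["Unassigned"]}},{"id": "OTU_3", "metadata": {"taxonomy": ["Unassigned"]}}]}
EOF

# grep -o prints each matched taxonomy array on its own line, so this works
# even when the whole table sits on a single line.
taxonomy_entries=$(grep -o '"taxonomy": *\[[^]]*]' "$biom_file")

# Count all taxonomy entries, then only those containing "Unassigned".
total_count=$(printf '%s\n' "$taxonomy_entries" | grep -c '"taxonomy"')
unassigned_count=$(printf '%s\n' "$taxonomy_entries" | grep -c 'Unassigned')

echo "OTUs with a taxonomy field: $total_count"
echo "OTUs marked Unassigned:     $unassigned_count"
rm -f "$biom_file"
```

To use it on a real table, point `biom_file` at your own file (for example otu_table_filtered.biom) and drop the heredoc and the `rm` line. It will not work on HDF5-format biom files, which are not plain text.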
Joey and Michelle,
Thanks again for all the suggestions!
I went through the examples with GlobalPatterns and everything turned out fine. I am now using the import_biom function, and two things are coming up.
1) import_biom isn't pulling in my metadata from the rich biom file I made with QIIME. I created this file using the following command in MacQIIME:
biom add-metadata -i otu_table.biom -o rich_otu_table.biom --sample-metadata-fp VLmapping.csv
I checked the biom file in Sublime Text to verify that the sample data attached correctly. To work around this, I imported the mapping file separately and then merged it. Not a big deal, just an extra line of code.
2) I am getting errors from the import_biom function. I used warnings() and intersect(), but I admit I am not sure what intersect is doing. As you can see from the code below, I can still call the villaluz phyloseq object and everything shows up. I can still plot alpha diversity and so on. Are these errors going to cause me any trouble?
OS X 10.7.x, R 3.x, RStudio 0.9x
QIIME 1.8. I created two biom files, one with metadata and one without. The same error shows up with both. The mapping file is the standard mapping file. Taxonomy is assigned using gg_13_8. Both biom files have the gg taxonomy in them (verified in Sublime Text).
Downstream, plot_richness and plot_bar by phylum work fine.
Thanks! ara