joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Importing Biom Table generated by MEGAN #611

Open fconstancias opened 8 years ago

fconstancias commented 8 years ago

Dear All, I want to try your amazing phyloseq package on a BIOM table generated by MEGAN. The format of my BIOM table is "Biological Observation Matrix 1.0.0" and the type is "Taxon table". I searched but did not find anything about using a MEGAN-exported BIOM table with phyloseq.

Since MEGAN6 is not an OTU-based approach, I do not know if it is really suitable. It also seems that my BIOM export at the species level only contains the taxonomy of reads successfully assigned at the species level and omits reads that match only at higher taxonomic levels. But that is a matter of how MEGAN6 exports, and it is fine as long as you are aware of it.

I just tried a basic import :

> x = import_biom("data/Comparison-Taxonomy.biom")
> print(x)
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 422 taxa and 30 samples ]
tax_table()   Taxonomy Table:    [ 422 taxa by 8 taxonomic ranks ]
> rank_names(x)
[1] "Rank1" "Rank2" "Rank3" "Rank4" "Rank5" "Rank6" "Rank7" "Rank8"
> plot_bar(x, fill="Rank8")

It gave me an expected plot, all black but with the number of reads per species that I exported from MEGAN.

Do you have any experience loading BIOM tables generated by MEGAN6, or do I have to create the phyloseq data manually?

joey711 commented 8 years ago

I don't understand what the problem is. It looks like you imported the data just fine. What version are you using? Do you have an MRE that describes the problem you're trying to solve? You already imported the data and it's represented as a phyloseq object. Why would you "create data manually"?

cjfields commented 6 years ago

In general I found that phyloseq imports MEGAN-generated BIOM files fine; the exported counts are for the nodes you select prior to exporting. I have run into an issue where some ranks seem to be mis-assigned (species ranks under family, etc.) with BIOMs from phyloseq(), though this may be an idiosyncrasy with MEGAN and the default NCBI Taxonomy, which may be missing some intermediate ranks. If I can work out the source of the issue, I'll file an issue in the appropriate place.

AndreaQ7 commented 3 years ago

I have tried the same, but the BIOM that came from MEGAN is not structured exactly as a phyloseq object expects. Look at my taxonomy table after import:

> head(tax_table(bbp))
Taxonomy Table:     [6 taxa by 7 taxonomic ranks]:
        Rank1         Rank2                            Rank3                        
2       "d__Bacteria" NA                               NA                           
95818   "d__Bacteria" "p__Candidatus Saccharibacteria" NA                           
1331051 "d__Bacteria" "p__Candidatus Saccharibacteria" "g__Candidatus Saccharimonas"
1332188 "d__Bacteria" "p__Candidatus Saccharibacteria" "g__Candidatus Saccharimonas"
976     "d__Bacteria" "p__Bacteroidetes"               NA                           
200643  "d__Bacteria" "p__Bacteroidetes"               "c__Bacteroidia"             
        Rank4                                      Rank5 Rank6 Rank7
2       NA                                         NA    NA    NA   
95818   NA                                         NA    NA    NA   
1331051 NA                                         NA    NA    NA   
1332188 "s__Candidatus Saccharimonas aalborgensis" NA    NA    NA   
976     NA                                         NA    NA    NA   
200643  NA                                         NA    NA    NA   

As you can see, there is a species ("s__") entry at Rank4, and I don't know how to change it.

cjfields commented 3 years ago

@AndreaQ7 that looks like a MEGAN bug or a setting within MEGAN for expected output. It's been a while since I've tested this, but it normally prompts for and outputs KPCOFGS (try doing this with tab-delim output to see if that pops up). Yours seem to be missing the 'OFG' bit.

AndreaQ7 commented 3 years ago

Unfortunately, it's the same even with tab-delimited output.

cjfields commented 3 years ago

@AndreaQ7 I would file an issue with the MEGAN developers:

http://megan.informatik.uni-tuebingen.de
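
Until the export itself is fixed, one possible workaround for the misplaced ranks in the tax_table output above is to realign each entry by its prefix ("d__", "p__", "g__", "s__", ...) after import. The following is only a minimal base-R sketch: `realign_ranks` and its prefix-to-rank mapping are hypothetical (not part of phyloseq or MEGAN), and it assumes every non-missing cell carries a one-letter prefix followed by "__", as in the output shown in this thread.

```r
# Hypothetical helper (not a phyloseq function): move each taxonomy entry into
# the column that matches its rank prefix, leaving missing ranks as NA.
# Assumption: "d__" and "k__" both denote the top rank and share one column.
realign_ranks <- function(tax,
                          prefixes = c(d = "Domain", k = "Domain", p = "Phylum",
                                       c = "Class", o = "Order", f = "Family",
                                       g = "Genus", s = "Species")) {
  ranks <- unique(unname(prefixes))
  out <- matrix(NA_character_, nrow = nrow(tax), ncol = length(ranks),
                dimnames = list(rownames(tax), ranks))
  for (i in seq_len(nrow(tax))) {
    vals <- tax[i, ]
    vals <- vals[!is.na(vals)]
    key <- sub("__.*$", "", vals)        # text before the first "__"
    known <- key %in% names(prefixes)    # entries with unknown prefixes are dropped
    if (any(known)) {
      out[i, prefixes[key[known]]] <- vals[known]
    }
  }
  out
}

# Two rows copied from the tax_table output in this thread (first four ranks).
tax <- rbind(
  "1332188" = c("d__Bacteria", "p__Candidatus Saccharibacteria",
                "g__Candidatus Saccharimonas",
                "s__Candidatus Saccharimonas aalborgensis"),
  "200643"  = c("d__Bacteria", "p__Bacteroidetes", "c__Bacteroidia", NA)
)
colnames(tax) <- paste0("Rank", 1:4)

fixed <- realign_ranks(tax)
print(fixed)
```

With phyloseq loaded, something along the lines of `tax_table(bbp) <- tax_table(realign_ranks(as(tax_table(bbp), "matrix")))` should write the realigned matrix back into the object, but that step is untested here and is best checked against your own data.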