joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Problems met with ordination plot stacked barplot and using phyloseq package in R #1094

Open yjk555 opened 5 years ago

yjk555 commented 5 years ago

Dear friends: I am stuck on a problem while using phyloseq to draw an ordination plot. Here are the code and the messages:

physeq <- qza_to_phyloseq("table.qza", "rooted-tree.qza", "taxonomy.qza", "sample_metadata.tsv", tmp = "C:\Users\46324\Desktop\bacteria")
physeq

ord = ordinate(physeq, formula=~Treatments, "PCoA", "unifrac")
Warning message:
In matrix(tree$edge[order(tree$edge[, 1]), ][, 2], byrow = TRUE, :
  data length [4453] is not a sub-multiple or multiple of the number of rows [2227]

g <- plot_ordination(physeq, ord, type="samples", color="Treatments")
d <- g + stat_ellipse(geom = "polygon", type = "t", level = 0.95, alpha = 0.5, aes(fill = Treatments))

Despite this warning I still got an ordination plot, but I think the warning does affect the plotted result.

Apart from this, I ran into another tough problem when making a stacked barplot with phyloseq. The code and the error are as follows:

physeq1 = phyloseq(otu_table(physeq), tax_table(physeq), sample_data(physeq), phy_tree(physeq))
physeq2 = filter_taxa(physeq1, function(x) mean(x) > 0.1, TRUE)
physeq3 = transform_sample_counts(physeq2, function(x) 100 * x / sum(x))
head(otu_table(physeq3))
glom <- tax_glom(physeq3, taxrank="Phylum")
Error in validObject(.Object) :
  invalid class “otu_table” object: OTU abundance data must have non-zero dimensions.

I would really appreciate any help with these problems. Thanks a lot in advance for any suggestions.

Best Regards,
Jiangkun

mikemc commented 5 years ago

I can't comment on the warning; it sounds like it has something to do with the phylogenetic tree and the unifrac calculation. Note, I don't think the formula option does anything with the PCoA method.
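One possible reading of the UniFrac warning, offered as a guess rather than a diagnosis: a rooted, fully bifurcating tree with n tips has n - 1 internal nodes, but physeq is reported later in this thread with 2229 tips and only 2225 internal nodes. That gives 2229 + 2225 - 1 = 4453 edges, the same number as the "data length" in the warning, which suggests the tree contains polytomies. The sketch below uses the ape package and a made-up toy tree to show how such a tree could be detected and resolved; that phyloseq's UniFrac code expects a binary tree is an assumption here.

```r
library(ape)

# Toy rooted tree (made up for illustration) with one polytomy:
# the node (a,b,c) has three children.
tree <- read.tree(text = "((a,b,c),d);")

Ntip(tree)       # number of tips
Nnode(tree)      # number of internal nodes; fewer than Ntip - 1 means polytomies
is.binary(tree)  # FALSE for this toy tree

# multi2di() resolves each polytomy into a series of bifurcations,
# in random order by default.
resolved <- multi2di(tree)
is.binary(resolved)
```

On the real data the equivalent would be something like phy_tree(physeq) <- ape::multi2di(phy_tree(physeq)) before calling ordinate(). Whether this removes the warning for this dataset has not been tested.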

But for the error with tax_glom, it sounds like the error message is telling you that the number of taxa in physeq2 is 0, or will become zero after agglomerating. Either filter_taxa is removing all of your taxa (you should check whether physeq2 has any taxa left), or tax_glom is giving 0 taxa because no taxa have a defined Phylum to glom on.
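The two checks suggested above can be sketched as follows. The toy object is made up for illustration (three taxa, two samples, every Phylum set to NA); with real data the same two checks would be run on physeq2. tax_glom() has an NArm argument that defaults to TRUE, so taxa with NA at the chosen rank are dropped before agglomeration.

```r
library(phyloseq)

# Toy data, made up for illustration: 3 taxa x 2 samples, Phylum all NA.
taxa_ids <- paste0("otu", 1:3)
otu <- otu_table(
  matrix(c(10, 0, 5, 3, 8, 1), nrow = 3,
         dimnames = list(taxa_ids, c("s1", "s2"))),
  taxa_are_rows = TRUE
)
tax <- tax_table(
  matrix(c("Bacteria", "Bacteria", "Unassigned", NA, NA, NA), nrow = 3,
         dimnames = list(taxa_ids, c("Kingdom", "Phylum")))
)
ps <- phyloseq(otu, tax)

# Check 1: did filtering leave any taxa at all?
ntaxa(ps)

# Check 2: how many taxa have a defined Phylum to glom on?
phylum <- as(tax_table(ps), "matrix")[, "Phylum"]
sum(!is.na(phylum))
```

If the first number is positive but the second is 0, the taxa survive filtering and are then all discarded at the agglomeration step.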

yjk555 commented 5 years ago

Thanks a lot for your reply. There are a lot of taxa in physeq2, as you can see below, so the problem does not seem to be the one you mentioned.

> physeq2
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 1570 taxa and 20 samples ]
sample_data() Sample Data:       [ 20 samples by 3 sample variables ]
tax_table()   Taxonomy Table:    [ 1570 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 1570 tips and 1566 internal nodes ]

Thank you again for your help.

Kind Regards,
Jiangkun

mikemc commented 5 years ago

It will be hard for anyone to figure out the error without more information; we can only guess at common issues. If you can upload an .Rds file of your "physeq" or "physeq1" objects then I can take a look at both problems.

yjk555 commented 5 years ago

Fine, here are the files used to build the physeq object: "table.qza", "rooted-tree.qza", "taxonomy.qza", and "sample_metadata.tsv". Thank you very much for your help; I look forward to hearing more from you.

physeq <- qza_to_phyloseq("table.qza", "rooted-tree.qza", "taxonomy.qza", "sample_metadata.tsv", tmp = "C:\Users\46324\Desktop\bacteria")

physeq
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 2229 taxa and 20 samples ]
sample_data() Sample Data:       [ 20 samples by 3 sample variables ]
tax_table()   Taxonomy Table:    [ 2229 taxa by 7 taxonomic ranks ]
phy_tree()    Phylogenetic Tree: [ 2229 tips and 2225 internal nodes ]

Just let me know what else you need to solve my problem. I really appreciate your help.

Jiangkun

objects in physeq.zip

mikemc commented 5 years ago

The unzip program says this file isn't a valid zip file. How did you create it?

Can you instead run

saveRDS(physeq, "physeq.Rds")

and send the saved Rds file?

Edit: Never mind, I was able to extract it with 7-Zip. It seems to be a RAR5 file rather than a Zip file.

yjk555 commented 5 years ago

The file type you suggested is not supported, so I zipped my files again. Here is the zip file, and it should work this time. Many thanks,

Files.zip

Jiangkun

mikemc commented 5 years ago

Ok, I was able to load the files in after installing the qiime2R package. The tax_table is filled with NAs and this is resulting in 0 taxa after you use tax_glom, as I suggested above. You can confirm by running

head(tax_table(physeq))

yjk555 commented 5 years ago

Yes, there are some NAs, but I am quite sure the tax_table is not filled with NAs. If you run tax_table(physeq), the top OTUs may show NAs, but not all of them do; many OTUs are assigned and have taxonomic information. Thank you again for your suggestion.

Jiangkun

mikemc commented 5 years ago

Please examine tax_table(physeq) or tax_table(physeq3). On my computer, when I import the data as you have done using qiime2R::qza_to_phyloseq(), the taxonomic assignments are not parsed correctly:

tax_table(physeq3)[1:10, "Kingdom"]
#> Taxonomy Table:     [10 taxa by 1 taxonomic ranks]:
#>                                  Kingdom                                                                                                                                          
#> 0ccfe77fad591c9fe34fad902450e417 "Unassigned"                                                                                                                                     
#> 0b52dae95ddd1cfb4b6ebff9ba7e3f3d "Unassigned"                                                                                                                                     
#> d555ff0d241d773168bb0a874e52038b "Unassigned"                                                                                                                                     
#> 633833bc75a18cc63862f50c8a718f91 "Unassigned"                                                                                                                                     
#> 5d835f6df94b789811ac1faaf7c56361 "Unassigned"                                                                                                                                     
#> 342a010262f35cb7aa5a97a0ef1008b1 "D_0__Bacteria"                                                                                                                                  
#> 293090ba0a4eaf5c9f11839a755556ea "D_0__Bacteria;D_1__Synergistetes;D_2__Synergistia;D_3__Synergistales;D_4__Synergistaceae;D_5__Pyramidobacter;D_6__uncultured rumen bacterium"   
#> 73f8fd64d763ebfd3a85a42a8c13917f "D_0__Bacteria;D_1__Synergistetes;D_2__Synergistia;D_3__Synergistales;D_4__Synergistaceae;D_5__Pyramidobacter;D_6__uncultured rumen bacterium"   
#> 4fec150f29cd374a2a705a7fb5ab319a "D_0__Bacteria;D_1__Synergistetes;D_2__Synergistia;D_3__Synergistales;D_4__Synergistaceae;D_5__Pyramidobacter;D_6__uncultured Pyramidobacter sp."
#> 683a125bc3ef36944132fa5bd4eac93b "D_0__Bacteria"                                                                                                                                  

and so all Phylum are NA:

taxmat <- as(tax_table(physeq3), "matrix")
table(taxmat[,"Phylum"], useNA =  "always")
#> .
#> <NA> 
#> 1570 

This leads exactly to the error that you have shown, because the tax_glom function by default discards all taxa since all taxa have Phylum as NA. You will need to get the taxonomy strings correctly parsed first. The import function you are using is not part of phyloseq, and if this is what is happening to you, this is not a phyloseq bug but a problem with how you are exporting and importing the taxonomy data. You might look into the data import tutorial here https://joey711.github.io/phyloseq/import-data.html and consider if you can use a different qiime output format as suggested there.
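As a rough illustration of what "correctly parsed" could look like, the base R sketch below splits semicolon-delimited strings of the kind shown above into one column per rank. The helper parse_silva_taxonomy() is hypothetical (it is not part of phyloseq or qiime2R), and the rank names are an assumption based on the seven ranks reported for this object.

```r
# Rank names are an assumption: seven ranks, matching the
# "7 taxonomic ranks" reported for this object.
ranks <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species")

# Hypothetical helper, not part of phyloseq or qiime2R.
parse_silva_taxonomy <- function(tax_strings, ranks) {
  parts <- strsplit(tax_strings, ";", fixed = TRUE)
  mat <- t(vapply(parts, function(p) {
    p <- sub("^D_[0-9]+__", "", trimws(p))    # drop the "D_0__" style prefix
    p[p %in% "Unassigned"] <- NA_character_   # treat "Unassigned" as missing
    length(p) <- length(ranks)                # pad missing ranks with NA
    p
  }, character(length(ranks))))
  dimnames(mat) <- list(names(tax_strings), ranks)
  mat
}

# Three of the strings shown earlier in this thread.
tax_strings <- c(
  "342a010262f35cb7aa5a97a0ef1008b1" = "D_0__Bacteria",
  "293090ba0a4eaf5c9f11839a755556ea" = "D_0__Bacteria;D_1__Synergistetes;D_2__Synergistia;D_3__Synergistales;D_4__Synergistaceae;D_5__Pyramidobacter;D_6__uncultured rumen bacterium",
  "0ccfe77fad591c9fe34fad902450e417" = "Unassigned"
)
parsed <- parse_silva_taxonomy(tax_strings, ranks)
parsed[, c("Kingdom", "Phylum")]
```

A character matrix like this, with taxa as row names and ranks as column names, is the shape that phyloseq's tax_table() constructor accepts, so it could in principle replace the unparsed table. Fixing the import or export step, as suggested above, is likely the cleaner route.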