david-barnett / microViz

R package for microbiome data visualization and statistics. Uses phyloseq, vegan and the tidyverse. Docker image available.
https://david-barnett.github.io/microViz/
GNU General Public License v3.0

tax.fix(), and "Problematic Genus values detected in tax_table" #150

Closed by Nekovirus 1 month ago

Nekovirus commented 2 months ago

Hello!

I am having an issue with the tax.fix() command. Using the command like this:

phyloseq object %>%
  tax_fix(
    minlength = 4,
    unknowns = c("."), # Unknowns in taxon table are represented by "."; I also tried unknowns = "."
    sep = "",
    anon_unique = TRUE,
    suffix_rank = "current"
  )

yields absolutely no results. I have to instead do:

new_tax_table %>%
  tax_fix(
    tax_table(oldphyloseq)
    minlength = 4,
    unknowns = c("."), # Unknowns in taxon table are represented by "."
    sep = "",
    anon_unique = TRUE,
    suffix_rank = "current"
  )

and then make a new phyloseq object with the new taxon table. The automatically generated commands do not work. I am replacing the info with the proper object. Double-checked it.

Now, when I made a new phyloseq object with said info, I tried ord_explore(new_object) and tried to make an NMDS plot with genera selected. It apparently has a vendetta against a particular fungal taxon name, "GS11":

"Problematic Genus values detected in tax_table: GS11_Genus"

There are other taxa that begin with "GS"; I am not sure if those are problematic to it as well. Not sure what I am supposed to do here. It recommends

yourData %>% tax_fix(unknowns = c("GS11_Genus"))

which, again, doesn't work. And GS11 doesn't represent an unknown variable.

I have multiple OTUs with the same taxonomic name, is that the problem? I have no idea how to troubleshoot this.

EDIT: I tried going through different taxonomic levels. Since GS11 is an order, from class level it works.

EDIT2: I started from scratch and wanted to see what would happen with all OTUs, not just fungi. Now it's having problems with genera that share the same name but don't share other higher taxonomic names. Some OTUs have the same genus and family name, but order and class are different (some protists). GS11, however, is all from the same higher taxonomic groups. Again, the recommended command yields 0 results and I have to make a new taxonomic table and then a new phyloseq object.

EDIT3: So I guess I have to rename the ones that have the same name at a lower rank but don't share names at higher ranks. Ok, fine. However, I am still running into problems with the GS11 taxon and some other GS taxa. Every one of their names is the same, but still.
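
The check described in EDIT2 and EDIT3 (the same genus name sitting under different higher ranks) can be sketched in base R. This is only an illustration under stated assumptions: the data frame and its taxon names are made up, and `shared_names` is a hypothetical variable name, not part of microViz or phyloseq.

```r
# Made-up taxonomy table for illustration (not data from this issue)
tax <- data.frame(
  Class = c("ClassA", "ClassB", "ClassC", "ClassC"),
  Genus = c("Incertae_sedis", "Incertae_sedis", "GS11", "GS11"),
  stringsAsFactors = FALSE
)

# Collect the Class labels observed for each Genus name
classes_per_genus <- split(tax$Class, tax$Genus)

# Count how many distinct classes each genus name appears under
n_classes <- vapply(classes_per_genus, function(x) length(unique(x)), integer(1))

# Genus names that occur under more than one class are candidates for renaming
shared_names <- names(n_classes)[n_classes > 1]
shared_names
```

In this toy table only "Incertae_sedis" is flagged, because "GS11" always appears under the same class.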

Nekovirus commented 2 months ago

So I figured out the dealio with the problematic names regarding GS11. Apparently, when the OTU table was constructed for me, there was a problem with certain parts of the extended class name having an upper case letter in some OTUs and a lower case one in other OTUs. I fixed the tax table, however the tax.fix() command doesn't do anything. It prints out the fixed version, but doesn't actually change it if I look at the Phyloseq object's tax table.
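
A case-only mismatch like the one described here can be spotted with a few lines of base R. This is a rough sketch, assuming the rank labels are available as a character vector; the labels below are made up, and `case_conflicts` is a hypothetical name used only for illustration.

```r
# Made-up class labels for illustration; two spellings differ only in letter case
class_names <- c("GS11_class", "GS11_Class", "Agaricomycetes", "GS11_class")

# Group the distinct spellings by their lower-cased form
spellings <- unique(class_names)
groups <- split(spellings, tolower(spellings))

# Keep only the groups where more than one spelling exists
case_conflicts <- groups[lengths(groups) > 1]
case_conflicts
```

Each element of `case_conflicts` lists the spellings that collapse to the same lower-cased name, which are the ones to harmonise before building the phyloseq object.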

david-barnett commented 2 months ago

In your last comment you said this:

the tax.fix() command doesn't do anything. It prints out the fixed version, but doesn't actually change it if I look at the Phyloseq object's tax table.

If I understand correctly, you did not assign the result of tax_fix to a new object, you just printed the result? e.g. something like this:

your_phyloseq %>% tax_fix()

There may be something more complicated going on, but first I want to double-check you know you typically need to assign the result of an R function to an object, like this:

your_phyloseq <- your_phyloseq %>% tax_fix()

or:

fixed_phyloseq <- your_phyloseq %>% tax_fix()
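
The same behaviour can be reproduced without a phyloseq object. The following is a minimal base R sketch, not microViz code: `fix_genus()` is a hypothetical stand-in for tax_fix(), and the toy table is made up for illustration.

```r
# Toy taxonomy table; "." marks an unknown genus (made-up data)
tax <- data.frame(
  Family = c("Lachnospiraceae", "Mortierellaceae"),
  Genus = c(".", "Mortierella"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for tax_fix(): returns a modified copy of its input
fix_genus <- function(x) {
  unknown <- x$Genus == "."
  x$Genus[unknown] <- paste(x$Family[unknown], "Family")
  x
}

# Without assignment, the fixed copy is only printed; `tax` is unchanged
fix_genus(tax)
tax$Genus[1]        # still "."

# Assigning the result is what keeps the fix
tax_fixed <- fix_genus(tax)
tax_fixed$Genus[1]  # "Lachnospiraceae Family"
```

R functions generally return a new object rather than modifying their input in place, which is why the printed output looks fixed while the original object stays the same.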

david-barnett commented 1 month ago

Assuming this is resolved; feel free to open a new issue if not :)