LebeerLab / tidytacos

Functions to manipulate and visualize microbial community data
https://lebeerlab.github.io/tidytacos/
GNU General Public License v3.0

at least one of the taxonomic rank names should be present in the taxon table [BUG] #27

Closed slambrechts closed 3 months ago

slambrechts commented 4 months ago

When I run:

max_taxa <- 144
used_rank <- "class"
tidy_physeq %>%
  remove_empty_samples() %>%
  tidytacos::set_rank_names(
    rank_names = phyloseq::rank_names(physeq_18SP_no_singletons)
  ) %>%
  aggregate_taxa(rank = used_rank) %>%
  tidytacos::add_prevalence() %>%
  tidytacos::mutate_taxa(
    keep = min_rank(desc(occurrence)) < max_taxa
  ) %>%
  filter_taxa(
    keep,
    !is.na(class)
  ) %>%
  tidytacos::everything() %>%
  mutate(count = as.integer(count)) %>%
  select(taxon_id, sample_id, count, sample, Cmon_PlotID, Diepte,
         Landgebruik_MBAG, class, occurrence) %>%
  filter(
    complete.cases(.)
  )

I get:

Error in `aggregate_taxa()`:
! at least one of the taxonomic rank names should be present in the taxon table
Backtrace:
  1. ... %>% filter(complete.cases(.))
 20. tidytacos::aggregate_taxa(., rank = used_rank)
Execution halted

I think this might be related to the fact that we used the PR2 database for taxonomic assignment, which has an unusual taxonomic structure with 9 levels:

Domain / Supergroup / Division / Subdivision / Class / Order / Family / Genus / Species

When I use the code above for other primer sets that were classified using a database with the traditional taxonomic structure:

Phylum / Class / Order / Family / Genus / Species

I don't have this problem.

In the help file of tidytacos::aggregate_taxa I read:

  • If the rank you are interested in is in the standard list, just supply it as an argument.
  • If not, delete all taxon variables except taxon_id and the ranks you are still interested in prior to calling this function.

But I'm not sure what you mean by the standard list?

TheOafidian commented 4 months ago

Hey @slambrechts thanks for the detailed situation sketch of the bug!

We mostly use the standard Phylum / Class / Order / Family / Genus / Species taxonomic structure ourselves, which is what the help file refers to as the 'standard list'.

At that point in the code, the aggregate_taxa function normally checks whether the rank_names variable of the tidytacos object, which contains the list of taxonomic ranks, can be found in the taxa table. I see you've set that manually (which should do the trick).

Could you perchance have a look at those two to confirm that these are as we expect them to be?

tidy_physeq %>%
  remove_empty_samples() %>%
  tidytacos::set_rank_names(
    rank_names = phyloseq::rank_names(physeq_18SP_no_singletons)
  ) -> tidy_physeq_test

# check that these are the same (i.e. both lowercase) and contain "class"
rank_names(tidy_physeq_test)
colnames(tidy_physeq_test$taxa)

If there's no misalignment of the taxonomy labels there, could you provide me with a minimal reproducible example so I can look around and find out what goes wrong here? E.g., a very small subset of this data, or generated data with the same structure.
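
The comparison suggested above can also be done programmatically. The following is a minimal base-R sketch, not the actual tidytacos implementation: `missing_ranks()` is a hypothetical helper and the toy vectors are made up for illustration. It assumes the check is a case-sensitive match between rank names and taxa table column names.

```r
# Hypothetical helper: which rank names have no exactly matching column?
# setdiff() compares strings case-sensitively.
missing_ranks <- function(rank_names, taxa_columns) {
  setdiff(rank_names, taxa_columns)
}

# Toy inputs (made up for illustration, not taken from real data)
ranks <- c("Phylum", "Class")
taxa_columns <- c("taxon_id", "phylum", "class")

# Both ranks are reported as missing because the capitalization differs
missing_ranks(ranks, taxa_columns)
```

In a real session, the two arguments would be the outputs of `rank_names()` and `colnames()` on the taxa table, as in the snippet above.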

slambrechts commented 3 months ago

Hi @TheOafidian thank you for your response!

I get:

> rank_names(tidy_physeq)
[1] "Domain"      "Supergroup"  "Division"    "Subdivision" "Class"      
[6] "Order"       "Family"      "Genus"       "Species"    
> colnames(tidy_physeq$taxa)
 [1] "taxon"       "taxon_id"    "domain"      "supergroup"  "division"   
 [6] "subdivision" "class"       "order"       "family"      "genus"      
[11] "species"  

Changing the code to the following seems to solve it:

tidytacos::set_rank_names(
  rank_names = tolower(phyloseq::rank_names(physeq_18SP_no_singletons))
)

TheOafidian commented 3 months ago

Hey Sam, glad to hear you found a way around the issue using set_rank_names! The from_phyloseq function in tidytacos should also convert the phyloseq rank names to lowercase when you convert a phyloseq object to a tidytacos object.
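
For readers who hit the same error, here is a short base-R sketch of why lowercasing resolves it. It uses the rank names and column names reported earlier in this thread and assumes a case-sensitive match; it does not call tidytacos itself.

```r
# Rank names and taxa table columns as reported earlier in this thread
rank_names <- c("Domain", "Supergroup", "Division", "Subdivision", "Class",
                "Order", "Family", "Genus", "Species")
taxa_columns <- c("taxon", "taxon_id", "domain", "supergroup", "division",
                  "subdivision", "class", "order", "family", "genus", "species")

# Case-sensitive match: none of the capitalized rank names is a column
any(rank_names %in% taxa_columns)           # FALSE

# After lowercasing, every rank name matches a column
all(tolower(rank_names) %in% taxa_columns)  # TRUE
```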