grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
http://grunwaldlab.github.io/metacoder_documentation

Labeling just a determined rank #275

Closed ThiagoRBM closed 4 years ago

ThiagoRBM commented 4 years ago

Hello all,

Is it possible to make a heat tree where only some specific ranks are shown? I was messing around with "filter_taxa" but wasn't able to figure out how to make it work...

Thanks for your help!

zachary-foster commented 4 years ago

Sure is! See examples below and let me know if you have questions.

library(metacoder)
#> Loading required package: taxa
#> This is metacoder verison 0.3.3 (stable)
x <- parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                   class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                   class_regex = "^(.+)__(.+)$")
x %>% 
  heat_tree(node_label = taxon_names, node_size = n_obs, node_color = n_obs)

x %>% 
  filter_taxa(taxon_ranks == 'o', supertaxa = TRUE) %>%
  heat_tree(node_label = taxon_names, node_size = n_obs, node_color = n_obs)

x %>% 
  filter_taxa(taxon_ranks %in% c('r', 'p', 'f')) %>%
  heat_tree(node_label = taxon_names, node_size = n_obs, node_color = n_obs)

Created on 2019-11-08 by the reprex package (v0.3.0)

ThiagoRBM commented 4 years ago

Thanks!! But I'd like to keep all the nodes and hide only some of the labels. So I think the command to do so may be inside the "heat_tree" command itself. To be even more specific and using the examples you just gave, I'd like to do the following in the same heat tree:

  1. Keep all the labels down to family for Proteobacteria,
  2. Keep just the nodes (without labels) for Bacteroidetes.

I tried using ifelse inside node_label but for some reason nothing happens and no error message is given.

Thanks again :)

zachary-foster commented 4 years ago

Ahh ok, that's a bit more complicated. How about this?

library(metacoder)
#> Loading required package: taxa
#> This is metacoder verison 0.3.3 (stable)
x <- parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                    class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                    class_regex = "^(.+)__(.+)$")

dont_label <- unlist(subtaxa(x)[taxon_names(x) == 'Proteobacteria'])
dont_label <- dont_label[taxon_ranks(x)[dont_label] == 'g']
dont_label <- c(dont_label, unlist(subtaxa(x, include_input = TRUE)[taxon_names(x) == 'Bacteroidetes']))

heat_tree(x,
          node_label = ifelse(taxon_indexes %in% dont_label, '', taxon_names),
          node_size = n_obs, 
          node_color = n_obs)

Created on 2019-11-09 by the reprex package (v0.3.0)
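
The masking step in the `node_label` argument above can be tried on its own with plain vectors. The following is a minimal base-R sketch, assuming a made-up five-taxon tree; the vectors here are stand-ins named after the metacoder variables, not real metacoder objects.

```r
# Toy stand-ins for the per-taxon vectors (hypothetical data, not hmp_otus)
taxon_names   <- c("Root", "Proteobacteria", "Escherichia",
                   "Bacteroidetes", "Bacteroides")
taxon_ranks   <- c("r", "p", "g", "p", "g")
taxon_indexes <- seq_along(taxon_names)

# Indexes of the taxa whose labels should be hidden
dont_label <- c(3L, 4L, 5L)

# Blank out the label wherever the index is in dont_label
labels <- ifelse(taxon_indexes %in% dont_label, "", taxon_names)

# The same pattern keyed on rank instead of index: label phyla only
phylum_labels <- ifelse(taxon_ranks == "p", taxon_names, "")

print(labels)
print(phylum_labels)
```

Every node keeps its position in the vector; only the label text becomes an empty string, which is why the tree keeps all nodes. Inside `heat_tree` the rank-based variant would presumably be written as `ifelse(taxon_ranks == 'o', taxon_names, '')`, assuming `taxon_ranks` is resolved there the same way it is in `filter_taxa`.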

ThiagoRBM commented 4 years ago

Thanks!

Sorry for all the questions. My tree is BIG, so I'm still trying to make it more "palatable" and I'm testing many different configurations. I used your example to create my tree, with some modifications. I think I'm going to show some labels and hide others within the same taxonomic rank, i.e., show some families and hide others in the same order. But I think it is better to also show the labels of the higher ranks, and I couldn't figure out how to do this.

Please see the example below, which I made with your data. In this example, I'd like to show the labels for all the higher ranks of Bacteroidaceae ("Bacteroidales", "Bacteroidia", "Bacteroidetes", "Root"), and likewise for the higher ranks of all the other chosen labels.

Note: English is not my primary language, so sorry if my question isn't clear! Just ask and I'll try to be clearer :)

I hope this is the last time I bother you with this hehehe...

library(metacoder)
library(dplyr)  # select(), sample_frac()
library(tidyr)  # separate()

tax= hmp_otus %>% 
  select(lineage) %>% 
  separate(lineage, sep=";", c("R", "Filo", "Classe", "Ordem", "Familia", "Genero")) 

ppp=  as.data.frame(lapply(tax, function(y) gsub("^.{0,3}", "", y)))

x <- parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                    class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                    class_regex = "^(.+)__(.+)$")

set.seed(10)
sample_frac(ppp[ppp$Classe=="Bacteroidia",], 0.1)

Bacteroidia= c("Bacteroidia")

dont_label.Bacteroidia <- unlist(subtaxa(x)[taxon_names(x) %in% Bacteroidia])
dont_label.Bacteroidia <- dont_label.Bacteroidia[taxon_ranks(x)[dont_label.Bacteroidia] == 'f']

set.seed(10)
sample_frac(ppp[ppp$Classe=="Clostridia",], 0.1)

Clostridiales.Fam= c("Clostridiaceae", "Veillonellaceae", "Ruminococcaceae", "Lachnospiraceae")

dont_label.Clostridiales <- unlist(subtaxa(x)[taxon_names(x) %in% Clostridiales.Fam])
dont_label.Clostridiales <- dont_label.Clostridiales[taxon_ranks(x)[dont_label.Clostridiales] == 'g']

unique(ppp[ppp$Ordem=="Actinomycetales",])

Actinomycetales= c("Propionibacteriaceae", "Micrococcaceae", "Mycobacteriaceae")

dont_label.Actinomycetales <- unlist(subtaxa(x)[taxon_names(x) %in% Actinomycetales])
dont_label.Actinomycetales <- dont_label.Actinomycetales[taxon_ranks(x)[dont_label.Actinomycetales] == 'g']

unique(ppp[ppp$Classe=="Gammaproteobacteria",])

Gammaproteobacteria= c("Pasteurellales", "Pseudomonadales", "Enterobacteriales")

dont_label.Gammaproteobacteria <- unlist(subtaxa(x)[taxon_names(x) %in% Gammaproteobacteria])
dont_label.Gammaproteobacteria <- dont_label.Gammaproteobacteria[taxon_ranks(x)[dont_label.Gammaproteobacteria] == 'f']

set.seed(10)
sample_frac(ppp[ppp$Filo=="Proteobacteria",], 0.1)

Proteobacteria= c("Betaproteobacteria", "Alphaproteobacteria")

dont_label.Proteobacteria <- unlist(subtaxa(x)[taxon_names(x) %in% Proteobacteria])
dont_label.Proteobacteria <- dont_label.Proteobacteria[taxon_ranks(x)[dont_label.Proteobacteria] == 'o']

heat_tree(x,
          node_label = ifelse(!taxon_indexes %in% dont_label.Actinomycetales &
                              !taxon_indexes %in% dont_label.Clostridiales &
                              !taxon_indexes %in% dont_label.Bacteroidia &
                              !taxon_indexes %in% dont_label.Gammaproteobacteria &
                              !taxon_indexes %in% dont_label.Proteobacteria, '', taxon_names),
          node_size = n_obs, 
          node_color = n_obs)

zachary-foster commented 4 years ago

No problem. Is this the kind of thing you want?

library(metacoder)
#> Loading required package: taxa
#> This is metacoder verison 0.3.3 (stable)
x <- parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                    class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                    class_regex = "^(.+)__(.+)$")

taxa_to_label <- c('Bacteroidaceae', 'Actinomycetales', 'Streptococcus')
to_label <- unlist(supertaxa(x, include_input = TRUE)[taxon_names(x) %in% taxa_to_label])

heat_tree(x,
          node_label = ifelse(taxon_indexes %in% to_label, taxon_names, ''),
          node_size = n_obs, 
          node_color = n_obs)

Created on 2019-11-10 by the reprex package (v0.3.0)
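
To make it clearer what selecting the supertaxa "including the input" means, here is a conceptual base-R sketch that walks up a toy tree by hand. It is not how metacoder implements `supertaxa`; the `parent` vector and the `lineage_indexes` helper are hypothetical and exist only for this illustration.

```r
# Toy tree (hypothetical): parent[i] is the index of taxon i's parent, NA for the root
taxon_names <- c("Root", "Bacteroidetes", "Bacteroidia", "Bacteroidales",
                 "Bacteroidaceae", "Firmicutes", "Bacilli")
parent <- c(NA, 1L, 2L, 3L, 4L, 1L, 6L)

# Hypothetical helper: collect a taxon's index and the indexes of all its ancestors
lineage_indexes <- function(i, parent) {
  out <- integer(0)
  while (!is.na(i)) {
    out <- c(out, i)  # keep the current taxon
    i <- parent[i]    # step up to its parent
  }
  out
}

taxa_to_label <- c("Bacteroidaceae")
targets  <- which(taxon_names %in% taxa_to_label)
to_label <- unique(unlist(lapply(targets, lineage_indexes, parent = parent)))

# Label the chosen taxa and everything above them; blank out the rest
labels <- ifelse(seq_along(taxon_names) %in% to_label, taxon_names, "")
print(labels)
```

In this toy tree the whole lineage from "Root" down to "Bacteroidaceae" keeps its labels, while "Firmicutes" and "Bacilli" stay in the vector with empty labels.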

ThiagoRBM commented 4 years ago

Great, I think this is exactly what I was looking for! Many thanks!

zachary-foster commented 4 years ago

No problem!