grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
http://grunwaldlab.github.io/metacoder_documentation

Customize sample order heat_tree_matrix? #323

Open LunavdL opened 2 years ago

LunavdL commented 2 years ago

Hi,

I am comparing abundance between 5 different salinity ranges (0-10, 10-15, 15-20, 20-25, and 25-36). I am making a heat tree matrix with heat_tree_matrix() as follows:

obj %>%
  metacoder::filter_taxa(taxon_ranks == "f", supertaxa = TRUE, reassign_obs = FALSE) %>%
  mutate_obs("cleaned_names", gsub(taxon_names, pattern = "\\[|\\]", replacement = "")) %>%
  metacoder::filter_taxa(grepl(cleaned_names, pattern = "^[a-zA-Z]+$"), reassign_obs = FALSE) %>%
  heat_tree_matrix(data = "diff_table",
                   node_label = cleaned_names,
                   node_size = n_obs, # number of OTUs
                   node_color = log2_median_ratio, # difference between groups
                   node_color_trans = "linear",
                   node_color_interval = c(-3, 3), # symmetric interval
                   edge_color_interval = c(-3, 3), # symmetric interval
                   node_color_range = diverging_palette(), # diverging colors
                   node_size_axis_label = "OTU count",
                   node_color_axis_label = "Log 2 ratio of median counts",
                   layout = "da", initial_layout = "re",
                   key_size = 0.67,
                   seed = 2)

And this is the output: image

To get a better insight into the gradient, however, I would like the sample order to be as follows: image

Is there a way to customize the order of the samples in the matrix? I already tried reordering the levels of the factor, but that didn't influence the matrix.

Best regards, Luna

zachary-foster commented 2 years ago

Sorry for the delay. This would be a good option to have. I might have a solution that uses the factor level order, but I need to test it more.

zachary-foster commented 2 years ago

I think I have a solution for you. I modified compare_groups to take into account the order of factors given to the groups argument. Install the dev version:

install.packages("devtools")
devtools::install_github("grunwaldlab/metacoder")

and try running the code again, this time using an ordered factor for your grouping variable.

Let me know if that works or not. Thanks!
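For the salinity ranges in this thread, building such an ordered factor could look like the base R sketch below. The `sample_data` data frame and its `salinity` column are hypothetical stand-ins for the real sample metadata; only the level names are taken from the discussion.

```r
# Hypothetical sample metadata; replace with the real table.
sample_data <- data.frame(
  sample_id = paste0("s", 1:5),
  salinity  = c("20_25", "0_10", "25_36", "10_15", "15_20"),
  stringsAsFactors = FALSE
)

# Desired order of the groups, from lowest to highest salinity.
salinity_order <- c("0_10", "10_15", "15_20", "20_25", "25_36")

# Ordered factor: the levels carry the order, the rows stay where they are.
sample_data$salinity <- factor(sample_data$salinity,
                               levels = salinity_order,
                               ordered = TRUE)

# Inspect the resulting level order.
levels(sample_data$salinity)
```

The resulting factor would then be the value passed to the groups argument of compare_groups before the plot is redrawn.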

LunavdL commented 2 years ago

Thank you for your reply! Unfortunately, it didn't work with my plots.

I had a look at obj$data$diff_table, however, and if I order the rows by the column "treatment_1", it does give me the order I want. It does not work when I use the following code:

obj$data$diff_table$treatment_1 <- factor(obj$data$diff_table$treatment_1, levels = c("0_10", "10_15", "15_20", "20_25", "25_36"))

But it does work if I do the following:

# `diff` is presumably a copy of obj$data$diff_table (its creation is not shown in this post)
diff2 <- diff[order(diff$treatment_1), ]
obj$data$diff_table <- diff2

zachary-foster commented 2 years ago

Thanks for the update. Hmm, it is not intended to work that way. It might not be returning the right result. I will look into that.
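A base R sketch of why the two attempts above could behave differently: assigning factor levels to a column leaves the row order of the table untouched, whereas indexing with order() physically rearranges the rows. The toy `tbl` below is invented for illustration, and the idea that the plot layout follows row order is only an inference from the behaviour reported in this thread, not something confirmed about heat_tree_matrix.

```r
# Toy stand-in for a diff table; the values are invented.
tbl <- data.frame(
  treatment_1 = c("20_25", "0_10", "15_20", "10_15"),
  treatment_2 = c("25_36", "25_36", "25_36", "25_36"),
  stringsAsFactors = FALSE
)

# Attempt 1: set factor levels. The rows stay in their original order.
as_factor <- tbl
as_factor$treatment_1 <- factor(
  as_factor$treatment_1,
  levels = c("0_10", "10_15", "15_20", "20_25", "25_36")
)
as.character(as_factor$treatment_1)

# Attempt 2: sort the rows. The row order itself changes.
sorted <- tbl[order(tbl$treatment_1), ]
sorted$treatment_1
```

Sorting the rows only changes where each comparison sits in the table; it does not recompute any of the values in it.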

The changes I made are to the compare_groups function, so you would have to order the metadata used for the groups argument of compare_groups like so:

library(metacoder)

# Parse the example HMP data set that ships with metacoder
x <- parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                    class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                    class_regex = "^(.+)__(.+)$")

# Convert counts to proportions
x$data$otu_table <- calc_obs_props(x, data = "tax_data", cols = hmp_samples$sample_id)

# Get per-taxon counts
x$data$tax_table <- calc_taxon_abund(x, data = "otu_table", cols = hmp_samples$sample_id)

# Reorder metadata for plotting
hmp_samples <- dplyr::ungroup(hmp_samples)
fct_order <- c("Saliva", "Skin", "Stool", "Throat", "Nose")
hmp_samples$body_site <- factor(as.character(hmp_samples$body_site), 
                                levels = fct_order,
                                ordered = TRUE)

# Calculate difference between treatments
x$data$diff_table <- compare_groups(x, data = "tax_table",
                                    cols = hmp_samples$sample_id,
                                    groups = hmp_samples$body_site)

# Plot
heat_tree_matrix(x,
                 data = "diff_table",
                 node_size = n_obs,
                 node_label = taxon_names,
                 node_color = log2_median_ratio,
                 node_color_range = diverging_palette(),
                 node_color_trans = "linear",
                 node_color_interval = c(-3, 3),
                 edge_color_interval = c(-3, 3),
                 node_size_axis_label = "Number of OTUs",
                 node_color_axis_label = "Log2 ratio median proportions")

LunavdL commented 2 years ago

Thank you again! With levels = fct_order, customizing the sample order does work!

I compared the two graphs made with your adjusted code and what I did previously (diff2 <- diff[order(diff$treatment_1),]), and they are indeed different. The 4 graphs to the right are almost completely the same, but the other graphs differ.

With diff2 <- diff[order(diff$treatment_1),]: image

And with your new code: image
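One thing that might be worth checking when two such plots disagree is the orientation of each pairwise comparison. The sketch below assumes that log2_median_ratio is the log2 of the first group's median over the second group's median, which is how the column name and axis label read but is not spelled out in this thread. Under that assumption, listing a pair of groups in the opposite order flips the sign of the ratio. The medians are invented numbers.

```r
# Invented median abundances for one taxon in two groups.
median_a <- 8
median_b <- 2

# Same pair of groups, listed in the two possible orders.
ratio_ab <- log2(median_a / median_b)  # group A first
ratio_ba <- log2(median_b / median_a)  # group B first

ratio_ab
ratio_ba
```

If that assumption holds, a table whose rows were only sorted keeps its original signs, while a table recomputed with reordered groups may not, which could be one reason some panels differ.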

Thank you again for your fast reply. It has been a great help.