grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
http://grunwaldlab.github.io/metacoder_documentation
Other
134 stars 28 forks source link

metacoder with MetagenomeSeq #335

Open acpaulo opened 2 years ago

acpaulo commented 2 years ago

Hi, I would like to use metacoder with the results from metagenomeSeq. Is that possible? I don't know where to look for comparisons (in metagenomeSeq). Do you have any idea where to start? Sorry for the blanket question but I would really like to combine the procedures.

thank you

zachary-foster commented 2 years ago

I have not used metagenomeSeq, but it is generally easy enough to combine tools. Can you give me an example of the data from a metagenomeSeq workflow at the point that you want to use it in metacoder? This can be a subset of your data or a point in an online example that I can replicate. If I can see an example of the format of the data you want to use in metacoder I should be able to suggest a way.

Giozanna commented 2 years ago

Hi, I'm interested in the same as well. I would like to use results from differential analysis with metagenomeseq (Fitzig model) in metacoder to use your visualization tool. Do you have any suggestion on how to do it? At the end dataset obtained is not different from Deseq so I guess there is this possibility.

Thank you

zachary-foster commented 2 years ago

Can you supply an example of the data you want to use with metacoder? You can use saveRDS to share the object with the data you want to plot. Without an example of the data you want to use, I cant give any advice without learning how to use metagenomeSeq, which unfortunately I don't have time to do right now. I am happy to help if you can provide an example, either a subset of your data or a link to example data with the same format.

Giozanna commented 2 years ago

Dear Zachary, sorry for being late.

Resuming... from the application of the Fitzig model I obtain a result table through the following script res <- topTable(fit3,coef=1,adjust="BH",n=Inf)

Which is organized as follow (just to make an example)

  logFC AveExpr t P.Value adj.P.Val
Bacteria_Actinobacteria_Actinobacteria_Corynebacteriales_Mycobacteriaceae_Mycobacterium -0.81254 4.385866 -1.88892 0.066302 0.469023
Bacteria_Actinobacteria_Actinobacteria_Glycomycetales_Glycomycetaceae_Glycomyces -0.61204 4.911639 -0.49065 0.626296 0.920174
Bacteria_Actinobacteria_Actinobacteria_Micrococcales_Microbacteriaceae_Agromyces -0.51106 5.429627 -1.39663 0.169216 0.678904
Bacteria_Actinobacteria_Actinobacteria_Micrococcales_Promicromonosporaceae_Promicromonospora -1.63045 7.350711 -1.61666 0.112923 0.599119
Bacteria_Actinobacteria_Actinobacteria_Propionibacteriales_Nocardioidaceae_Aeromicrobium -0.15033 3.155604 -0.18877 0.85152 0.995337

From my experience with metacoder, when I perform the Wilcox rank test before plotting I obtain the Log 2 ratio of median counts that is used for plotting in the tibble.

obj$data$diff_table <- compare_groups(obj, data = "tax_abund", cols = sample_data$sample, groups = sample_data$Species) obj <- mutate_obs(obj, "diff_table", wilcox_p_value = p.adjust(wilcox_p_value, method = "fdr"))

obj$data$diff_table$log2_median_ratio[obj$data$diff_table$wilcox_p_value > 0.05] <- 0

Then this for plotting

obj %>% heat_tree_matrix(data = "diff_table", node_label = taxon_names, node_size = n_obs, # number of OTUs node_color = log2_median_ratio, # difference between groups node_color_trans = "linear", node_color_interval = c(-3, 3), # symmetric interval edge_color_interval = c(-3, 3), # symmetric interval node_color_range = diverging_palette(), # diverging colors node_size_axis_label = "OTU count", node_color_axis_label = "Log 2 ratio of median counts", layout = "da", initial_layout = "re", key_size = 0.67)

So I was wondering how to use the log fold change obtained from metagenome seq instead. Do you think is possible?

Thank you again, Giovanna

zachary-foster commented 2 years ago

Thanks for the info!

Are you trying to make a single heat tree comparing two groups or a heat tree matrix for 3+ groups?

If its just two groups, it should be somewhat easy to parse that table and plot the data. If that taxonomy data was a column called "taxonomy" instead of row names, you could parse it with something like:

parse_tax_data(res, class_cols = "taxonomy", class_sep = "_")