Abundance heatmaps - Githubissues

antagomir commented 3 years ago

Can we add a function to facilitate the plotting of abundance heatmaps? This is a need we often encounter in studies.

A starting point could be, for instance, the existing, similar plotting functionality in the microbiome pkg (see below for an example), or another source.

The function would visualize an abundance matrix (assay data) as a heatmap.

Some data scalings/transformations, and sample/taxon sorting could be provided (as is already done in plotAbundance).
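As a rough illustration of the idea (not a proposed implementation), the sketch below transforms a toy taxa x samples matrix, sorts taxa and samples, and draws the result with ggplot2. The matrix, the taxon and sample names, the pseudocount and the colour scale are all made up for this example.

```r
library(ggplot2)

# Toy counts matrix (taxa x samples); all names are made up for illustration
set.seed(1)
counts <- matrix(rpois(5 * 8, lambda = 20), nrow = 5,
                 dimnames = list(paste0("Taxon", 1:5), paste0("Sample", 1:8)))

# Relative abundance per sample, then log10 with an arbitrary pseudocount
rel <- sweep(counts, 2, colSums(counts), "/")
logrel <- log10(rel + 1e-4)

# Sort taxa by mean relative abundance, and samples by the abundance of one taxon
taxa_order <- names(sort(rowMeans(rel)))
sample_order <- names(sort(rel["Taxon1", ]))

# Long format for ggplot2: one row per taxon/sample pair
df <- as.data.frame(as.table(logrel))
colnames(df) <- c("Taxon", "Sample", "Value")
df$Taxon <- factor(df$Taxon, levels = taxa_order)
df$Sample <- factor(df$Sample, levels = sample_order)

# Heatmap: one tile per taxon/sample pair, filled by the transformed abundance
p <- ggplot(df, aes(x = Sample, y = Taxon, fill = Value)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "darkred") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
```

Printing `p` draws the heatmap; the axis ordering follows the factor levels, so a different sorting scheme only needs to change `taxa_order` and `sample_order`.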

Once the basic function is implemented, we can consider adding more advanced sorting schemes (neatsort, neatmap) as these are useful in the context of heatmap visualization (see Rajaram & Oono 2010).
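To give a feel for ordination-based sorting, here is a simplified sketch that orders samples by their angle in the plane of the first two principal components. It is only a PCA-based stand-in written for this thread, under my own assumptions, and not the published neatmap method; the matrix and names are made up.

```r
# Toy transformed abundance matrix (taxa x samples); names are made up
set.seed(7)
mat <- matrix(rnorm(5 * 10), nrow = 5,
              dimnames = list(paste0("Taxon", 1:5), paste0("S", 1:10)))

# PCA with samples as observations (rows)
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# Angle of each sample in the PC1/PC2 plane, in radians within (-pi, pi]
theta <- atan2(pca$x[, 2], pca$x[, 1])

# Reorder the columns so that samples with similar angles end up adjacent
ord <- order(theta)
mat_sorted <- mat[, ord]
```

The same idea can be applied to taxa by running the ordination on `mat` instead of `t(mat)`.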

An example with microbiome package:

library(microbiome)
data(atlas1006)

# Keep core taxa, then apply the CLR transformation and a per-sample Z transformation
x <- core(atlas1006, detection = 0.1, prevalence = 0.01)
x <- transform(x, "clr")
x <- transform(x, "Z", target = "sample")

plot_composition(x,
                 sample.sort = "Prevotella melaninogenica et rel.",
                 otu.sort = "abundance",
                 verbose = TRUE,
                 plot.type = "heatmap") +
  coord_flip()
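For readers unfamiliar with the two transformations named in the call, the base R sketch below shows what a CLR transformation followed by per-sample Z-scoring does to a toy matrix. It is written from the textbook definitions and is not the microbiome package's implementation, whose details (for example, how zeros are handled) may differ; the data are made up.

```r
# Toy counts (taxa x samples); adding 1 is an arbitrary shift to avoid log(0)
set.seed(3)
counts <- matrix(rpois(4 * 5, lambda = 30) + 1, nrow = 4,
                 dimnames = list(paste0("Taxon", 1:4), paste0("S", 1:5)))

# CLR per sample (column): log of each value minus the mean log of that sample
clr <- apply(counts, 2, function(x) log(x) - mean(log(x)))

# Z-score per sample: centre each column and scale it to unit standard deviation
z <- apply(clr, 2, function(x) (x - mean(x)) / sd(x))
```

After the CLR step each column sums to zero, and after the Z step each column has mean zero and standard deviation one.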

FelixErnst commented 3 years ago

Can you provide a working example? The code above results in an error/empty plot for me:

> plot_composition(transform(transform(core(atlas1006, detection = 0.1, prevalence = 0.01), "clr"), "Z", target = "sample"), sample.sort = "Prevotella melaninogenica et rel.", otu.sort = "abundance", verbose = TRUE, plot.type = "heatmap") + coord_flip()
Pick the abundance matrix taxa x samples
Average the samples by group
Sort samples
Construct the plots
Constructing the heatmap.
NULL
Warning messages:
1: In transform(x, "log10") : NaNs produced
2: In max(abs(df[[fill]])) :
  no non-missing arguments to max; returning -Inf
3: In heat(tmp, colnames(tmp)[[1]], colnames(tmp)[[2]], colnames(tmp)[[3]]) :
  Input data frame is empty.

General thoughts on plotting heat maps and getting the data:

Since a lot of packages are available for plotting heat maps, it might be good to define an internal way of preparing the necessary data. Maybe this is already available via meltAssay, @microsud?
The function could also be exported to allow the user to choose which heat map package to use.
It would also facilitate writing dedicated wrapper functions. However, I would investigate which packages are most usable and don't massively increase the dependency tree just for one plotting function.

microsud commented 3 years ago

meltAssay is there for any plotting with ggplot2. Most heatmap pkgs take a numeric matrix plus separately provided annotations. I agree we first need to choose which heatmap pkg to use.
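To make the difference between the two input shapes concrete, here is a small base R sketch that turns a long-format table into a numeric matrix plus a separate sample annotation table. The column names (FeatureID, SampleID, counts, group) and the helper `long_to_heatmap_input()` are hypothetical and chosen only for this illustration; they are not taken from meltAssay's actual output.

```r
# Hypothetical long-format table: one row per taxon/sample pair
long <- data.frame(
  FeatureID = rep(c("TaxonA", "TaxonB", "TaxonC"), times = 4),
  SampleID  = rep(paste0("S", 1:4), each = 3),
  counts    = c(10, 0, 5, 3, 8, 1, 7, 2, 9, 4, 6, 0),
  group     = rep(c("control", "control", "treated", "treated"), each = 3)
)

# Hypothetical helper: rebuild a numeric matrix and a sample annotation table
long_to_heatmap_input <- function(long, value = "counts") {
  # Taxa x samples matrix; sum() is a no-op here since each pair occurs once
  mat <- tapply(long[[value]], list(long$FeatureID, long$SampleID), sum)
  # One annotation row per sample, aligned with the matrix columns
  annot <- unique(long[, c("SampleID", "group")])
  rownames(annot) <- annot$SampleID
  annot <- annot[colnames(mat), "group", drop = FALSE]
  list(mat = mat, col_annot = annot)
}

input <- long_to_heatmap_input(long)
```

A list like `input` could then be handed to whichever matrix-based heatmap package is chosen, while the long table stays usable for ggplot2.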

antagomir commented 3 years ago

Did you test with the latest development version from GitHub before running the example? For me this has worked, but currently I have problems with the mia installation and cannot test again.

The meltAssay could be helpful for data preparation, and data export functions will be valuable. I agree to be careful with dependencies - if suitable functions are available already for microbiome data, we can move this issue to OMA. Examining this thoroughly will require some more effort.

FelixErnst commented 3 years ago

Nope, I tested with the release version

antagomir commented 3 years ago

I propose to start with ggplot2 for heatmap plotting. It is slow but we already have many functions and examples with phyloseq that can be readily translated to miaverse. Once we have heatmap visualizations in place in miaViz and OMA using this, it is always possible to explore better alternatives and replace material when we or someone else comes up with better stuff.

A main disadvantage of ggplot2 is that it is somewhat slow, especially for larger heatmaps.

I agree that it would be good to have wrappers that prepare the data so that it could be used by different heatmap packages.

FelixErnst commented 3 years ago

Heatmap plotting can be quite tricky, since it is rarely about the heatmap itself, but about how the heatmap is structured by additional covariates.

From a design point-of-view, I would approach it from that angle: What covariates do I need to plot? If there are no grouping variables (conditions, replicates, etc), ggplot2 will suffice. With more complicated setups other packages will come into play, but I would have a look at OSCA first.

The notion of wrappers makes a lot of sense!

antagomir commented 3 years ago

OSCA seems to use basic tools such as plotHeatmap and pheatmap; also check this pheatmap example.
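For reference, a minimal pheatmap call with one sample-level covariate could look like the sketch below. The matrix, the sample names and the `group` covariate are made up for illustration, and the clustering and scaling choices are arbitrary.

```r
library(pheatmap)

# Toy transformed abundance matrix (taxa x samples); names are made up
set.seed(42)
mat <- matrix(rnorm(6 * 8), nrow = 6,
              dimnames = list(paste0("Taxon", 1:6), paste0("S", 1:8)))

# Hypothetical sample covariate; row names must match the matrix column names
col_annot <- data.frame(group = rep(c("control", "treated"), each = 4),
                        row.names = colnames(mat))

# Row-scaled heatmap with clustered taxa and samples kept in their given order;
# silent = TRUE builds the plot object without drawing it
ph <- pheatmap(mat,
               scale = "row",
               annotation_col = col_annot,
               cluster_rows = TRUE,
               cluster_cols = FALSE,
               silent = TRUE)
```

Dropping `silent = TRUE` draws the heatmap directly; the annotation data frame is the part that a data-preparation wrapper would need to supply alongside the matrix.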

This still needs to be considered in a bit more detail.

antagomir commented 4 months ago

@TuomasBorman I guess we could close this one now?

TuomasBorman commented 4 months ago

Yes, we can close this.

microbiome / miaViz

Abundance heatmaps #14