joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Extract the reordering of taxon from heatmap/ordination object #971

Open anbjork opened 6 years ago

anbjork commented 6 years ago

Hi Joey (and everyone else)!

I am making heat maps with a lot of taxa and would like to color the y-axis labels by taxonomic group (for example, family), so that I can quickly see whether the clustering puts similar taxa close to each other. However, I cannot figure out how to do it. The problem is that the clustering/ordination shuffles the order of the taxa, and I cannot figure out how to track that reordering.

I have tried using ordinate() to see whether I can find the reordering somewhere in the ordination object, but I haven't found anything that looks right. I was hoping to find a simple vector of the reordering that was done (similar to what base R's order() function returns), but I haven't so far.

I have not found anything about it in the tutorials or the documentation for ordinate or plot_heatmap either, but I'm not too experienced, so I'm thinking that I'm missing something obvious.

Could someone perhaps point me in the right direction, or suggest another way to solve my original problem of coloring the y-axis tick labels according to taxa?

anbjork commented 6 years ago

Hi again!

FWIW, I found something that I think works. In the plot_heatmap() function, if taxa.order is not supplied as an argument, the order of the taxa seems to be determined in the following block:

    if (is.null(taxa.order)) {
      specDF = NULL
      trash2 = try({
        specDF <- scores(ps.ord, choices = c(1, 2), display = "species", physeq = physeq)
      }, silent = TRUE)
      if (inherits(trash2, "try-error")) {
        warning("Attempt to access ordination coordinates for feature/species/taxa/OTU ordering failed.\n",
                "Using default feature/species/taxa/OTU ordering.")
      }
      if (!is.null(specDF)) {
        taxa.order = taxa_names(physeq)[order(RadialTheta(specDF))]
      }
    }

specDF seems to contain the 2D projected coordinates of the different taxa, and taxa.order is the taxa names in the same order as in the heatmap.

In the ordination object from the ordinate() function, the "species" field seems to contain exactly the same 2D coordinates as the specDF variable internal to plot_heatmap(). So, running

taxa.order = taxa_names(physeq)[order(RadialTheta(ordination_object$species))]

where ordination_object is the output of ordinate(), would give you the taxa names in the same order as in the heatmap. To get just the ordering vector, do

my_heatmap_order = order(RadialTheta(ordination_object$species))

The RadialTheta() function is not one of the exported functions of phyloseq, but you can get it here: https://rdrr.io/bioc/microbiome/src/R/neatsort.R It's named slightly differently, so after copying the function definition, you need to rename it.
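To make the idea concrete without needing phyloseq installed, here is a minimal base R sketch. It assumes the helper computes the angle of each taxon around the centroid of the first two axes, which is my reading of the linked source rather than a verbatim copy. The names radial_theta_sketch and toy_species are made up for illustration; toy_species stands in for ordination_object$species. Since plot_heatmap() calls RadialTheta() internally, phyloseq:::RadialTheta should also reach the original function, but I have not relied on that here.

```r
# Sketch of a radial-angle helper (assumed behaviour, not copied from phyloseq):
# the angle of each point around the centroid of the first two axes.
radial_theta_sketch <- function(pos) {
  pos <- as.matrix(pos)
  xc <- mean(pos[, 1])
  yc <- mean(pos[, 2])
  theta <- atan2(pos[, 2] - yc, pos[, 1] - xc)
  names(theta) <- rownames(pos)
  theta
}

# Toy stand-in for ordination_object$species: one row per taxon, two axes.
toy_species <- matrix(
  c( 1,  1,
    -1,  1,
    -1, -1,
     1, -1),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("taxA", "taxB", "taxC", "taxD"), c("axis1", "axis2"))
)

# Ordering vector, analogous to what order() returns for a data frame column.
my_heatmap_order <- order(radial_theta_sketch(toy_species))

# Taxa names sorted by increasing angle.
taxa_order <- rownames(toy_species)[my_heatmap_order]
print(taxa_order)
```

With real data, the rows of the coordinate matrix are assumed to be in the same order as taxa_names(physeq), which is what the indexing in plot_heatmap() also relies on.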

The above has worked for my tests. However, a comment from someone that knows a bit more about phyloseq as to whether this is a good approach would be highly appreciated!
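For the coloring step itself, a minimal sketch of how the ordered taxa names could be turned into one color per y-axis label. Everything here is toy data: taxa_order stands in for the ordered taxa names, and family_of is a hypothetical taxon-to-family lookup that, with real data, would presumably be built from the "Family" column of tax_table(physeq), assuming the data set has a rank with that name.

```r
# Hypothetical inputs: taxa names in heatmap order, and a taxon -> family lookup.
taxa_order <- c("taxC", "taxD", "taxA", "taxB")
family_of <- c(taxA = "Bacteroidaceae",
               taxB = "Lachnospiraceae",
               taxC = "Bacteroidaceae",
               taxD = "Ruminococcaceae")

# Families arranged in the same order as the heatmap's y axis.
fam_ordered <- family_of[taxa_order]

# One color per distinct family.
families <- sort(unique(family_of))
fam_palette <- setNames(grDevices::rainbow(length(families)), families)

# One color per label, aligned with taxa_order.
label_cols <- unname(fam_palette[fam_ordered])
names(label_cols) <- taxa_order
print(label_cols)
```

The resulting vector could then be passed to the plot, for example via theme(axis.text.y = element_text(colour = label_cols)) in ggplot2. Recent ggplot2 versions warn that vectorized input to element_text() is not officially supported, so treat that last step as a workaround rather than a guaranteed feature.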

Also, maybe it would be nice to add the ordering vector to the ordination object, so that it is more easily accessible for users that want to do things similar to what I am doing?

Have a nice day!

slvrshot commented 5 years ago

Can you show your complete code? I am having trouble following what exactly you did in your work around.