joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Can't get taxa.order to work on custom ordering column #1006

Open jennformatics opened 6 years ago

jennformatics commented 6 years ago

Hi there, and first of all, thanks for phyloseq existing, and for the help you've already given to other people I know!

I'm trying to get my heatmap to sort in a custom order. My specific situation is that my taxon names are weird alphanumerics that don't sort "properly" by any automated means, so I've stuck an extra column into the tax_table that defines my desired sort order. Alas, taxa.order="desiredOrder" seems to do nothing. I'm aware that numbers are coerced to characters in the tax_table, so I'm not complaining that I'm getting an alpha sort instead of a numeric sort -- it's that I'm getting no change at all. (Oddly, sample.order works fine when I use a similar trick; perhaps it's significant that my custom ordering column in my sample_data table is allowed to be numeric.)

I've tried just dumping the data table from plot_heatmap, sorting it, and using the ggplot calls from the plot_heatmap code à la https://github.com/joey711/phyloseq/issues/971, but I'm too new to R to get that to work in a reasonable time frame.

Here's my MRE using GlobalPatterns, along with an image of the heatmap as it comes out: https://gist.github.com/jdrum00/6dfdf5e591111b33f720ce2d92990531

If taxa.order were working as I expect, the taxon ID numbers would be in numeric sort order along the y axis.

I enjoyed the discussion in https://github.com/joey711/phyloseq/issues/230, btw, and appreciate your decision to add the ordering features. Now I just need to get them to work! :)

joey711 commented 5 years ago

@jdrum00 Thanks for the MRE; it made this much faster to debug. If you read the function documentation for the taxa.order parameter, you can see that it asks for either a taxonomic rank that you want to use or a character vector:

... a character vector of taxa_names in the precise order that you want them displayed in the heatmap. This overrides any ordination ordering that might be done with the method/distance arguments...

So the answer in your MRE is to skip several of the data-manipulating steps, and change your plot_heatmap command to:

plot_heatmap(..., taxa.order = desiredOrder)

Cheers
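For readers following along, here is a minimal base-R sketch of how such a character vector could be built from a custom ordering column. It assumes the taxonomy table has been converted to a plain character matrix; the matrix, the taxa names, and the values below are made up for illustration and are not taken from the MRE.

```r
# Stand-in for a taxonomy table converted to a character matrix.
# Row names play the role of taxa names; all values are invented.
tax <- matrix(
  c("Bacteria", "Bacteria", "Archaea", "Bacteria",
    "10", "9", "100", "2"),
  ncol = 2,
  dimnames = list(c("otu_a", "otu_b", "otu_c", "otu_d"),
                  c("Kingdom", "desiredOrder"))
)

# Ordering the column as text compares the strings character by character.
alpha_order <- rownames(tax)[order(tax[, "desiredOrder"])]

# Converting to numeric first gives the order the numbers suggest.
numeric_order <- rownames(tax)[order(as.numeric(tax[, "desiredOrder"]))]

print(alpha_order)
print(numeric_order)
```

Under these assumptions, `numeric_order` is a character vector of taxa names of the kind the quoted documentation describes, so it is the sort of object one would pass as `taxa.order`.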

jennformatics commented 5 years ago

Actually, that's not what I see in the docs for plot_heatmap. Under phyloseq 1.26.1, at least, I get this:

taxa.order: (Optional). Default ‘NULL’. Either a single character string matching one of the ‘rank_names’ in your data, or a character vector of ‘taxa_names’ in the precise order that you want them displayed in the heatmap. This overrides any ordination ordering that might be done with the ‘method’/‘distance’ arguments.

This implies that I can do, e.g.,

plot_heatmap(..., taxa.order="Genus")

...but that doesn't seem to work. Nor does generating an explicit character vector of the taxa_names in the desired order as "TaxOrder" and doing:

plot_heatmap(..., taxa.order=TaxOrder)

The y-axis of the heatmap doesn't change in either case.

joey711 commented 5 years ago

Please post an MRE of this latter TaxOrder approach not working so that I can understand what the problem is. Thanks!
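When putting together such an MRE, it may help to first confirm that the ordering vector really is a complete, duplicate-free rearrangement of the current taxa names. The sketch below uses a hypothetical helper, `check_taxa_order`, which is not part of phyloseq, and made-up IDs; with a real object, `current` would presumably come from `taxa_names()`.

```r
# Hypothetical helper (not part of phyloseq): report whether `ord` is a
# complete, duplicate-free reordering of `current`.
check_taxa_order <- function(ord, current) {
  list(
    is_character  = is.character(ord),
    no_duplicates = anyDuplicated(ord) == 0L,
    same_members  = setequal(ord, current),
    differs       = !identical(ord, current)
  )
}

# Made-up taxa IDs standing in for the object's current taxa names.
current  <- c("549322", "522457", "951", "244423")

# Candidate ordering: the same IDs, arranged by numeric value.
TaxOrder <- current[order(as.numeric(current))]

res <- check_taxa_order(TaxOrder, current)
str(res)
```

If every element of `res` is `TRUE`, the vector itself is at least well formed, which narrows the problem down to how the plotting call uses it.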

slvrshot commented 5 years ago

Did you ever figure out how to do this @jdrum00 or @joey711?

slvrshot commented 5 years ago

I just removed the NMDS and bray entries from my code and followed what @joey711 did in #425

taxaOrder = unique(taxa_names(physeq1))
plot_heatmap(physeq1, sample.order = sampleOrder, taxa.order = taxaOrder)

That worked and sorted everything by class, but it changed my gene names back to their IDs. I'll probably just add the names manually in PowerPoint or Photoshop.

EDIT: Upon checking the IDs again... this DOES NOT WORK.
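One possible reason the snippet above has no effect, offered as an observation rather than a confirmed diagnosis: `unique()` drops repeated values but keeps the remaining ones in their original order, so `unique(taxa_names(physeq1))` would just return the names as they are already stored. The base-R sketch below illustrates this with made-up gene IDs and Class values, and shows one way to derive an order from a grouping column instead.

```r
# Made-up gene IDs in their stored order, with a made-up Class for each.
ids <- c("gene_30", "gene_4", "gene_12")
cls <- c("Gamma", "Alpha", "Beta")

# unique() removes repeats but does not sort, so nothing changes here.
unchanged <- identical(unique(ids), ids)

# Rearranging the IDs by their Class value gives a genuinely new order.
by_class <- ids[order(cls)]

print(unchanged)
print(by_class)
```

As for the axis showing IDs instead of names, `plot_heatmap` has a `taxa.label` argument that is documented as choosing the taxonomic rank used to label the taxa axis; whether it resolves this particular case is untested here.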