mwaskom / seaborn

Statistical data visualization in Python
https://seaborn.pydata.org
BSD 3-Clause "New" or "Revised" License

Refactor row_colors/col_colors from clustermap to heatmap #366

Closed FedericoV closed 3 years ago

FedericoV commented 9 years ago

Hi,

Not a huge deal, but row_colors/col_colors are generally useful when plotting a heatmap. They are also undocumented in the API of clustermap:

http://web.stanford.edu/~mwaskom/software/seaborn/generated/seaborn.clustermap.html

I can work on a PR if @olgabot is ok with it, since it was her (awesome) work originally.

olgabot commented 9 years ago

Hi Federico! Thanks for the kind words. I agree it would be awesome to have the row/column colors when plotting a heatmap. However, the way the code is structured right now, heatmap() works on a single Axes object, whereas the row/column colors add more Axes to the side, kind of like jointplot. So I'm not sure if heatmap is the right place for it, or if clustermap should be renamed or refactored so you can easily work without dendrograms.

FedericoV commented 9 years ago

Ah - I see the point. It's not really a big deal; it's easy enough to turn off the dendrogram and use clustermap as a 'heatmap+'.
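For anyone following along, the 'heatmap+' workaround could look roughly like the sketch below. It assumes a seaborn version whose clustermap accepts row_cluster, col_cluster and row_colors, and that scipy is installed (clustermap needs it even with clustering turned off). The data and the color grouping are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import numpy as np
import pandas as pd
import seaborn as sns

# Illustrative data only: 6 rows, 4 columns of random values
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.normal(size=(6, 4)),
    index=[f"r{i}" for i in range(6)],
    columns=list("ABCD"),
)

# One explicit color per row (hypothetical grouping: first three vs. last three)
row_colors = ["tab:blue"] * 3 + ["tab:orange"] * 3

# Turn off both clusterings so rows and columns keep their original order,
# while still getting the side color strip next to the heatmap.
g = sns.clustermap(data, row_cluster=False, col_cluster=False,
                   row_colors=row_colors)
```

The returned grid exposes the heatmap and the color strip as g.ax_heatmap and g.ax_row_colors, so both can be tweaked with ordinary matplotlib calls.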

mwaskom commented 9 years ago

Olga's reaction is correct, although because everything would be pcolormesh-based, this could in principle all be done within the same Axes. So it might work.

However, as the side colors currently stand, I'm a little dissatisfied with needing to provide explicit colors instead of providing data and having the colors chosen using a palette. It's been a bit of a pain when using it. So I think it would be best to sort that out before adding it to the heatmap API.

If something like this does end up happening I'd want clustermap to be reworked to use it rather than have duplicated functionality in multiple places.

FedericoV commented 9 years ago

I thought that too! In theory, it might be really nice to have the row/column colors reflect possible clusterings from cutting the dendrogram. The problem is that there are so many criteria for cutting a dendrogram and, as with many unsupervised approaches, it's so difficult to pick one a priori.

mwaskom commented 9 years ago

I don't mean infer side colors from the data in the heatmap, I mean to do it from another source of semantic information. Like in this example, instead of having to take the "networks" level in the dataframe and map it to colors, just pass that information (and maybe the name of a palette) and the color mapping would be done behind the scenes.
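To make the manual step concrete, here is a rough sketch of the mapping that currently has to be done by hand before calling clustermap. The helper name labels_to_colors, the label values and the palette are all hypothetical; this is not part of seaborn's API.

```python
def labels_to_colors(labels, palette):
    """Map each label to a color, assigning palette entries in order of
    first appearance and cycling if there are more labels than colors."""
    lookup = {}
    for label in labels:
        if label not in lookup:
            lookup[label] = palette[len(lookup) % len(palette)]
    return [lookup[label] for label in labels], lookup


# Hypothetical semantic labels, e.g. one "network" id per row
networks = ["1", "1", "5", "6", "5", "1"]
palette = ["tab:blue", "tab:orange", "tab:green"]

row_colors, lookup = labels_to_colors(networks, palette)
print(row_colors)
```

In practice the palette could come from seaborn.color_palette, and the resulting list would be passed as row_colors. The suggestion in the comment above is that this lookup happen behind the scenes, given only the labels and perhaps a palette name.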

FedericoV commented 9 years ago

A little feedback - the latest version (fresh from GitHub) breaks plotting of row/column colors when hierarchical clustering is not enabled.

This seems to be because the plotting function:

    if self.row_colors is not None:
        matrix, cmap = self.color_list_to_matrix_and_cmap(
            self.row_colors, self.dendrogram_row.reordered_ind, axis=0)
        heatmap(matrix, cmap=cmap, cbar=False, ax=self.ax_row_colors,
                xticklabels=False, yticklabels=False, **kws)

wants self.dendrogram_row.reordered_ind, which is not present if no dendrogram is supplied.
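One possible guard, sketched under the assumption that the dendrogram attribute is simply None when clustering is disabled, is to fall back to the original row order. The helper name reordered_indices is hypothetical and this is not necessarily the fix that landed in seaborn.

```python
from types import SimpleNamespace


def reordered_indices(dendrogram, n_items):
    """Return the plotting order: the dendrogram's leaf order if one was
    computed, otherwise the original order 0..n_items-1."""
    if dendrogram is None:
        return list(range(n_items))
    return list(dendrogram.reordered_ind)


# Stand-in for a computed dendrogram (only the attribute used here)
clustered = SimpleNamespace(reordered_ind=[2, 0, 1])

print(reordered_indices(clustered, 3))  # leaf order from the dendrogram
print(reordered_indices(None, 3))       # identity order when clustering is off
```

The result would then be passed in place of self.dendrogram_row.reordered_ind, so the side colors are drawn in the same order as the unclustered heatmap rows.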

olgabot commented 9 years ago

Crap, I'll get to this right away.

mwaskom commented 3 years ago

Doing some tidying and going to close this as I don't think it's going to get added to heatmap, but it can be accomplished through creative uses of clustermap or JointGrid + heatmap.