Closed olgabot closed 10 years ago
Hey, this looks really cool! I'm not going to get a chance to give it a closer look until tonight (busy day of actual obligations... :/), but definitely seems like something that would be great to have. More soon.
OK sorry for the delay. This looks like a cool function, and I agree it sounds like it would fit in nicely to seaborn. But I have to admit...I'm not sure I understand what it's showing? It's not a visualization I've run into before in my field. But those are exactly the kind of contributions I'd welcome!
I don't think `heatmap` is the best name for this kind of plot, though? It looks like it's doing something much more complicated than just mapping values to colors in a matrix -- originally when I saw the title of this PR I expected it to be something different. I gather that's the name of the function in R, although I'm not sure "follow what R does" is the best policy in general for naming things :)
PS, Re nogrid spines: is there a way to set that by default? I couldn't find anything about it in the rcParams. Or do you mean just call despine() within all of the plotting functions?
This `heatmap` is also called a clustergram or clustered heatmap. The way I use them is to look at gene expression or splicing across many genes (30-50k) and samples (100-200).
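To make the "clustered heatmap" idea concrete, here is a minimal sketch of the reordering step only, assuming SciPy's hierarchical clustering with average linkage and Euclidean distance. The helper name `cluster_order` and the toy matrix are hypothetical and are not taken from the PR.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage


def cluster_order(data, method="average", metric="euclidean"):
    """Hypothetical helper: row and column orderings from hierarchical clustering."""
    row_linkage = linkage(data, method=method, metric=metric)
    col_linkage = linkage(data.T, method=method, metric=metric)
    return leaves_list(row_linkage), leaves_list(col_linkage)


rng = np.random.default_rng(0)
# Toy "expression" matrix: 6 genes (rows) x 4 samples (columns).
expression = rng.normal(size=(6, 4))

row_order, col_order = cluster_order(expression)
# Rearrange rows and columns so that similar genes/samples sit next to each other.
clustered = expression[np.ix_(row_order, col_order)]
```

The reordered matrix is what would then be drawn as colors, with the two dendrograms placed along its margins.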
Currently, there's no way to control spines with rcParams so it would have to be calling `despine` after all the plotting functions. That is, until I figure out how to add an rcparam :)
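As a rough illustration of what "calling despine after the plotting functions" amounts to, here is a sketch using plain matplotlib. The helper `hide_top_right_spines` is a hypothetical stand-in, not seaborn's or prettyplotlib's actual implementation.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def hide_top_right_spines(ax):
    """Hypothetical helper: hide the top and right spines of one Axes."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    # Keep ticks only on the spines that remain visible.
    ax.xaxis.set_ticks_position("bottom")
    ax.yaxis.set_ticks_position("left")
    return ax


fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
hide_top_right_spines(ax)
```

Later matplotlib releases did add `axes.spines.top` and `axes.spines.right` rcParams, so on a recent version the same default can be set without a helper.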
Ah I figured it was some kind of gene thing. What are the dendrograms showing? I guess I could just look at the code :)
I think an upstream PR for "Allow axis spines to be configurable" in matplotlib would garner lots of support.
In terms of names, for seaborn I'd like to have every function that draws something be called `<something>plot` (note issue #34 keeping track of the fact that I am going to rename `violin` for consistency). `clustergramplot` seems too wordy though. Hmm. `clusterplot` maybe?
The dendrograms are showing the (pairwise) hierarchical clustering of associations between genes, which is really useful for finding sub-modules of expression of certain genes. I look at these things all day :)
A major TODO for this is to implement optimal leaf ordering which correctly orders the dendrogram leaf nodes after clustering. I wouldn't consider this finished until this is done.
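For reference, a sketch of what optimal leaf ordering does, assuming a SciPy version that ships `scipy.cluster.hierarchy.optimal_leaf_ordering` (it did not exist when this thread was written). The data and the `adjacent_distance_sum` helper are hypothetical and only serve to compare the two orderings.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage, optimal_leaf_ordering
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
data = rng.normal(size=(12, 5))

distances = pdist(data)  # condensed pairwise Euclidean distances
tree = linkage(distances, method="average")
# Same tree topology, but subtrees are flipped so neighboring leaves are close.
ordered_tree = optimal_leaf_ordering(tree, distances)


def adjacent_distance_sum(order, condensed):
    """Hypothetical helper: total distance between consecutive leaves."""
    square = squareform(condensed)
    return sum(square[a, b] for a, b in zip(order[:-1], order[1:]))


default_cost = adjacent_distance_sum(leaves_list(tree), distances)
optimal_cost = adjacent_distance_sum(leaves_list(ordered_tree), distances)
```

The clustering itself is unchanged; only the left-to-right order of the leaves differs, so `optimal_cost` is never larger than `default_cost`.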
Hm very cool. I wonder if this kind of thing would be useful for fMRI data.
In terms of the algorithm to support it, is that something you'd imagine being wrapped in with the plotting interface? Or is it otherwise useful? When `seaborn` was just a personal project the way I had kept things organized was

- `seaborn`
- `moss` (things like the bootstrapping, correlation permutation, etc.)

(these are actually part of a broader ecosystem of packages I use for my research)
I'm starting to think this makes less sense, though, because it's annoying to have a separate package with core functions that seaborn depends on, especially if people are going to be contributing things that rely on algorithms I don't fully understand. So maybe going forward something like `seaborn.algorithms` would be better. Then again, I use, e.g., bootstrapping often outside the context of plotting, and it would be strange to be importing them from a plotting library.
Of course if you had plans to submit the algorithm to, e.g., scipy, this discussion could be punted for a while.
Chiming in (I'm a pandas user with a few contributions and a day-to-day bioinformatician). I would decouple the algorithm from plotting, because you can visualize heatmaps using dendrograms generated from different algorithms (e.g. you may want to use "plain" algorithms, do some bootstrap resampling....).
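One way to read the decoupling suggestion is sketched below: the reordering code accepts a linkage computed elsewhere and only falls back to a default when none is given. The function `reorder_rows` and its `row_linkage` parameter are hypothetical names for illustration, not an existing seaborn or pandas API.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage


def reorder_rows(data, row_linkage=None):
    """Hypothetical helper: order rows by a linkage the caller may supply."""
    if row_linkage is None:
        # Fallback used only when the caller does not bring a clustering.
        row_linkage = linkage(data, method="average")
    return data[leaves_list(row_linkage)]


rng = np.random.default_rng(2)
data = rng.normal(size=(8, 3))

# Default clustering chosen by the helper.
default_view = reorder_rows(data)
# Clustering chosen by the caller, here Ward linkage computed up front.
ward_view = reorder_rows(data, row_linkage=linkage(data, method="ward"))
```

With this shape, a bootstrap-resampled or otherwise custom clustering can be visualized without touching the plotting code.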
@olgabot Somehow that publication slipped under my radar. Would be very nice to have in, indeed.
I'm going to close this as we have a WIP PR open on heatmaps (#73)
Hello there,
I'm working on a heatmap PR for `pandas` (https://github.com/pydata/pandas/pull/5646) but it's been suggested that all visualizations be worked on in separate packages. Since `seaborn` already supports pandas internally and does a lot of the "run some algorithm and then show me the result" kind of stuff (violinplot, kde fitting, linear fitting, etc), how do you think this fits with `seaborn`?

Olga

PS I made `prettyplotlib` which is a small matplotlib wrapper and I'm down to merge efforts but only if the `nogrid` also automatically `despine`s the top and right axes. :)

PPS I'm also working on a PR for `seaborn` to accept a `bw_method` kwarg for `violin` because I need narrower bandwidths for my research.

PPPS THANK YOU for making `paper`/`poster`/`notebook`/`talk` contexts. Seriously one of the best things ever.
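The effect of the `bw_method` request in the PPS can be sketched with `scipy.stats.gaussian_kde`, which takes a `bw_method` argument. Whether seaborn's `violin` estimates its density this way is an assumption here, and the sample data is made up.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
samples = rng.normal(size=200)

# Default bandwidth: Scott's rule.
default_kde = gaussian_kde(samples)
# A scalar bw_method is used directly as the bandwidth factor.
narrow_kde = gaussian_kde(samples, bw_method=0.1)

grid = np.linspace(-4, 4, 101)
default_density = default_kde(grid)  # smoother curve
narrow_density = narrow_kde(grid)    # follows the samples more closely
```

A smaller factor gives a narrower kernel, so the density shows finer structure at the cost of more noise.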