mwshinn / CanD

CanD does not handle seaborn figure level plots #11

Closed jankaWIS closed 3 years ago

jankaWIS commented 3 years ago

I'm not sure if there is a simple way for you to fix it, but I have noticed that CanD cannot handle seaborn's figure-level plots (typically FacetGrid plots) like displot, relplot, and catplot (see this part of seaborn docs). If I understand CanD correctly, you are effectively plotting on an axis, and those are not axis-level plots. For example:

from cand import Canvas, Vector, Point
import seaborn as sns

df = sns.load_dataset("planets")

# Create a canvas 
c = Canvas(6, 6, "inch")

# top left
c.add_axis("levels", Point(0.05, 0.05, "figure"), Point(0.45, 0.95, "figure"))
ax1 = c.ax("levels")
sns.displot(data=df, x="mass", hue="number", ax=ax1)

# Top right
c.add_axis("answers", Point(0.55, 0.05, "figure"), Point(0.95, 0.95, "figure"))
ax2 = c.ax("answers")
sns.displot(data=df, x="year", hue="method", legend=True, ax=ax2)

c.show()

will yield two seaborn warnings:

/Users/user/anaconda3/lib/python3.8/site-packages/seaborn/distributions.py:2163: UserWarning: `displot` is a figure-level function and does not accept the ax= paramter. You may wish to try histplot.
  warnings.warn(msg, UserWarning)

and the typical result is that the CanD axes stay empty, with the plots drawn below them: image

If one uses axes-level plots, like the recommended histplot, everything works well:

from cand import Canvas, Vector, Point
import seaborn as sns

df = sns.load_dataset("planets")

# Create a canvas
c = Canvas(12, 6, "inch")

# top left
c.add_axis("levels", Point(0.05, 0.05, "figure"), Point(0.45, 0.95, "figure"))
ax1 = c.ax("levels")
sns.histplot(data=df, x="mass", hue="number", ax=ax1)

# Top right
c.add_axis("answers", Point(0.55, 0.05, "figure"), Point(0.95, 0.95, "figure"))
ax2 = c.ax("answers")
sns.histplot(data=df, x="year", hue="method", ax=ax2)

c.show()

image

Would there be a way to fix this? Although some of the seaborn plots are replaceable (it's possible to work around this limitation), it is often quite a pain, and it would be great if one could use your package to overcome the pain of having two FacetGrids next to each other. Thanks

mwshinn commented 3 years ago

Thanks for the bug report. The problem here is more of a conceptual one and I don't know if there is a good solution. In the figure-level plots, seaborn is effectively providing its own miniature version of CanD, implemented separately. It is laying out the position of different axes, the legend, etc. As a result, CanD can't access or manage this information. Seaborn has a slightly different guiding philosophy than CanD: seaborn prefers simplicity over fine control, whereas CanD favors fine control. As a result, and as you have probably noticed while using seaborn, the defaults are gorgeous, but if you want to make small adjustments, it can sometimes be difficult.

I think the best solution here is to create a set of "CanD templates". This would include (among others) some of the defaults of seaborn. The documentation is gradually being written, and I think including templates in the documentation which reimplement the figure-level plots would be a great compromise. Do you think something like this would solve the problem?

jankaWIS commented 3 years ago

Hi @mwshinn, yes, I understand. I feared that (and secretly hoped that you would know a way to work around this and somehow extract the details from seaborn before plotting).

Hm, I guess yes and no. I was thinking about that for a while for myself (making my own gallery and then gluing together what I need). It is definitely possible and would be very nice to have, so it could work as a solution. I see probably three problems: a) there are many combinations and variations, so potentially there would either be a lot of templates or still a lot of tweaking and a bit of a mess, b) it would take a lot of time to gather and update everything, c) one would basically be rewriting seaborn, which is in many respects a bit stupid.

One thing I'm not sure about, then: are you proposing writing functions (e.g. like displot) which you could call the same way as in seaborn, just with CanD (that could work very nicely, but see the problems mentioned above)? Or are you thinking of a gallery on "how to plot those plots with axes-level rather than figure-level plots", where you could just copy the code and place it in CanD? I think both are possible, and it's more up to you which fits your philosophy better.

mwshinn commented 3 years ago

I can see either of your possible implementation strategies as working. In practice, it would probably be easiest to start with a gallery where you can copy and paste code, and it could move to actual functions like displot later. However, these plotting functions would probably go in a separate package, or possibly a sub-package. A few existing features of CanD (e.g. the reimplementation of matplotlib legends) could go there as well.

With respect to the problems you mentioned: regarding (a), seaborn figure-level plots are basically parameterizable templates, and so it should be straightforward to create functions to do this in CanD. Even if not creating a function, many features should be relatively easy for a user to change themselves based on an example template. Were there any particular things you had in mind? For (c), it would only be rewriting the layout in the figure-level pieces, which is relatively little of seaborn. Most of the effort in the figure-level functions goes to getting matplotlib to do the things that CanD already does, so implementing some of these in CanD could make them a lot simpler. Regarding (b), yes, it would be time consuming. Though it would be possible to do it in an incremental fashion as well, just adding stuff when you need it in your own work and letting it build over time. If you have any such templates or a gallery, I'd be happy to include them in the official documentation, giving you appropriate credit of course. That would be huge for CanD.
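As a rough illustration of the "parameterizable template" idea in (a), the layout part of a faceted plot can be reduced to computing one rectangle per facet in figure-fraction coordinates. The sketch below is plain Python and not part of CanD or seaborn; the helper name `facet_rects` and its `margin` and `gap` parameters are hypothetical, chosen only for this example.

```python
from typing import Dict, List, Tuple

# (x0, y0, x1, y1) as fractions of the figure, origin at the lower left.
Rect = Tuple[float, float, float, float]


def facet_rects(names: List[str], n_cols: int,
                margin: float = 0.08, gap: float = 0.06) -> Dict[str, Rect]:
    """Hypothetical helper: map each facet name to a panel rectangle.

    Panels fill the grid row by row, starting from the top-left corner.
    """
    if n_cols < 1:
        raise ValueError("n_cols must be at least 1")
    if not names:
        return {}
    n_rows = -(-len(names) // n_cols)  # ceiling division
    # Size of one panel once the outer margins and inner gaps are removed.
    width = (1 - 2 * margin - (n_cols - 1) * gap) / n_cols
    height = (1 - 2 * margin - (n_rows - 1) * gap) / n_rows
    rects: Dict[str, Rect] = {}
    for i, name in enumerate(names):
        row, col = divmod(i, n_cols)
        x0 = margin + col * (width + gap)
        # Row 0 is the top row, so measure downwards from the top margin.
        y1 = 1 - margin - row * (height + gap)
        rects[name] = (x0, y1 - height, x0 + width, y1)
    return rects


# Three facets laid out in a grid with two columns.
rects = facet_rects(["Transit", "Radial Velocity", "Imaging"], n_cols=2)
for name, (x0, y0, x1, y1) in rects.items():
    print(f"{name}: ({x0:.2f}, {y0:.2f}) to ({x1:.2f}, {y1:.2f})")
```

Each rectangle could then be handed to CanD in the same way as in the examples earlier in this thread, e.g. `c.add_axis(name, Point(x0, y0, "figure"), Point(x1, y1, "figure"))`, assuming the two points are the lower-left and upper-right corners as those calls suggest, followed by an axis-level call such as `sns.histplot(..., ax=c.ax(name))` on the matching subset of the data.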

jankaWIS commented 3 years ago

OK, that sounds like a plan. At the moment I don't have them anywhere except in my head: I haven't plotted anything for at least the past half a year, and a lot has changed in seaborn since then. So at least I'd have the motivation to do it properly and not just as a quick fix, as I usually do. I think that, in general, FacetGrid is a very powerful tool in seaborn, and the pure-matplotlib alternative, gridspec, isn't as nice or easy for what seaborn does. Besides the ones I already mentioned, having pairplot could be very useful, and possibly jointplot as well. That's another issue: it partially works now (see below), but since it is a composite plot, the parts of the jointplot get split up.

image

Regarding this:

it would only be rewriting the layout in the figure-level pieces, which is relatively little of seaborn. Most of the effort in the figure-level functions goes to getting matplotlib to do the things that CanD already does, so implementing some of these in CanD could make them a lot simpler.

What do you mean by this? How do you remove the top level and get just the underlying matplotlib without the figure-level pieces?

mwshinn commented 3 years ago

Sounds good. By the way, with respect to JointPlot, you might be interested in this, which implements a scatterplot with a histogram: https://cand.readthedocs.io/en/latest/gallery/fokkerplanck.html

it would only be rewriting the layout in the figure-level pieces, which is relatively little of seaborn. Most of the effort in the figure-level functions goes to getting matplotlib to do the things that CanD already does, so implementing some of these in CanD could make them a lot simpler.

What do you mean by this? How do you remove the top level and get just the underlying matplotlib without the figure-level pieces?

I mean that, for several parts of seaborn, the figure-level plots are tying together axis-level plots. The axis-level plots can be reused verbatim. The figure-level plots are in many cases formatting on top of the axis-level plots (e.g. in pairplot), a wider or more complete layout (e.g. displot), or changing the interface of matplotlib to be function arguments to a main plotting function instead of separate commands.

jankaWIS commented 3 years ago

Hi @mwshinn, I have done a first sketch of what we discussed: I rewrote the displot examples from the seaborn documentation as axes-level plots. Attached is the notebook exported to a PDF, seabron_fig2ax.pdf. For each example, there is the original figure and the axes-level figure. There are probably other and maybe better ways to do that; I tried to stay as close as possible to the original problem. If this is something you would appreciate, I can try to do the same for the other figure-level plots as well. There will probably be some problems, and it won't be bulletproof, in the sense that one will still need to adjust the code and play with it. It won't be like seaborn, where you call a single function or line and it does everything; rather, it should show how to work around the limitation using axes-level tools only.

Sounds good. By the way, with respect to JointPlot, you might be interested in this, which implements a scatterplot with a histogram: https://cand.readthedocs.io/en/latest/gallery/fokkerplanck.html

That looks really cool, thanks for the tip. Yes, we could use that as well.

mwshinn commented 3 years ago

Oh wow, that looks fantastic! Yes, this type of thing would be a huge contribution. I would be happy to add this and any others you make to the official documentation if you would like.

One minor thing that would have to change in the code is using CanD for axis management instead of subplots and pyplot. So basically, just create a canvas and create two axes in it (or use add_grid) instead of using "subplots". You will also have to explicitly pass the axis object to the seaborn axis-level functions, i.e. ax=ax, instead of letting pyplot do it automatically. Doing it this way would allow you to get rid of the calls to set_bbox_to_anchor.

Also, for hiding spines, you might be interested in the sns.despine() function.

mwshinn commented 3 years ago

Since there is a PR fixing this issue I will close it.