mwaskom / seaborn

Statistical data visualization in Python
https://seaborn.pydata.org
BSD 3-Clause "New" or "Revised" License

Standard seaborn.objects printouts are inaccessible in some ways on Macs #3373

Closed NickCH-K closed 1 year ago

NickCH-K commented 1 year ago

Alright, this one involves like four different pieces of software to isolate. But I think the issue here is in seaborn rather than one of those other places. Here's the issue:

  1. I have a Jupyter notebook containing two seaborn.objects graphs. The first one is printed using matplotlib (fig = plt.figure(), then so.Plot().on(fig)). The second one is printed using seaborn.objects directly (without .on(fig)). These graphs render properly in Jupyter.
  2. I render that notebook to a document (HTML, Word, PDF, Powerpoint, any of them), using Quarto.
  3. The first graph renders properly in the resulting document. The second one does not appear.

Note:

  1. This issue occurs only on Mac. I have tested this on two Mac machines and two Windows machines. On both Windows machines, both graphs render properly. On both Macs, only the first renders and the second does not appear. I haven't tested Linux.
  2. This does not produce an error or anything, the graph simply does not appear in the resulting document.
  3. This issue does not occur with non-objects seaborn graphs. sns.lineplot() works fine.
  4. The matplotlib mode (inline or notebook) doesn't seem to matter.
  5. Versions: matplotlib 3.7.1, seaborn 0.12.2

I suspect there is something different in the way that seaborn.objects prints things as opposed to how matplotlib prints things (specifically on Mac I guess?) that is causing this, which is what makes me think this is a seaborn.objects issue as opposed to, say, a Quarto issue.

Here is the code for a Jupyter notebook that exhibits the issue. Note that I am using p.plot() here, but the same issue occurs if you don't save the plot as p and instead just have so.Plot() on a line by itself.

(code chunk 1, this renders properly on both Windows and Mac)

import pandas as pd
import seaborn.objects as so 
import matplotlib.pyplot as plt

dat = pd.DataFrame({'a':[1,2],'b':[3,4]})

fig = plt.figure()

p = so.Plot(dat, x = 'a', y = 'b').on(fig).add(so.Dot())

p.plot()

(code chunk 2, this does not show up in the resulting document on Mac, but it works fine on Windows)


p = so.Plot(dat, x = 'a', y = 'b').add(so.Dot())

p.plot()

mwaskom commented 1 year ago

> I suspect there is something different in the way that seaborn.objects prints things as opposed to how matplotlib prints things

This is indeed the case; Plot is rendered through IPython's rich display hooks, specifically see the code here for PNG output (there is also SVG output on master but not in 0.12.2).

Most of your fact pattern makes sense; lineplot and friends are just fully riding on top of the pyplot hooks, and Plot doesn't care at all about the %matplotlib magic mode.

> specifically on Mac I guess

This part doesn't make any sense to me though. Seaborn is just returning a bytestream of the PNG data. It's up to IPython to display it / put it in the .ipynb file. That mechanism ends up being quite similar to the matplotlib inline hook, but not identical. E.g., you'll notice that you don't see the object identifier being printed in the REPL output with Plot. That's because the PNG is the object representation, whereas the inline backend inserts the PNG data through a separate process. But at the end of the day, they should both have base64-encoded PNG data in the JSON data structure. Not sure why that would vary on a Mac.
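
To make that mechanism concrete, here is a minimal sketch of an object whose PNG is its representation. The `OnePixel` class and the `_png_chunk` helper are hypothetical stand-ins, not seaborn code; the only real convention relied on is that IPython looks for a `_repr_png_` method on the value a cell evaluates to. The last lines imitate the base64 step that puts the bytes into the notebook JSON.

```python
import base64
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


class OnePixel:
    """Hypothetical stand-in for an object that is displayed as a PNG."""

    def _repr_png_(self) -> bytes:
        # 1x1 image, 8-bit grayscale, no interlacing.
        header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
        # One scanline: filter byte 0 followed by a single black pixel.
        pixels = zlib.compress(b"\x00\x00")
        return (
            PNG_SIGNATURE
            + _png_chunk(b"IHDR", header)
            + _png_chunk(b"IDAT", pixels)
            + _png_chunk(b"IEND", b"")
        )


png_bytes = OnePixel()._repr_png_()

# Notebook files are JSON, so binary image data is stored as base64 text
# under the "image/png" key of the cell output.
encoded = base64.b64encode(png_bytes).decode("ascii")
output_data = {"image/png": encoded}
```

In a Jupyter cell, leaving `OnePixel()` as the last expression should show the image instead of a text repr, which matches the "no object identifier" behaviour described above.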

This is all fairly experimental so I'm open to it being an issue in the way seaborn is using the system, but if you can see the plot in the Jupyter interface but not in the Quarto output that makes me suspect it's an issue where Quarto is not properly grabbing rich outputs from all the places they could be. (Doesn't really explain why it works on Windows, but are you certain you have the same versions of Quarto and the relevant Jupyter libraries in both places?)
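
One way to test that suspicion is to look at what the executed notebook actually stores for each cell. The sketch below uses only the standard library and assumes the nbformat 4 layout: a top-level `cells` list whose code cells carry `outputs`, each with an `output_type` and, for rich outputs, a `data` dict keyed by MIME type. The `summarize_outputs` name and the `example` notebook are made up for illustration; the example is hand-written and is not output captured from the notebook in this issue.

```python
def summarize_outputs(notebook: dict) -> list:
    """Return, for each code cell, a list of (output_type, MIME types)."""
    summary = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        entries = []
        for output in cell.get("outputs", []):
            # Stream and error outputs have no "data" dict, hence the default.
            mimetypes = sorted(output.get("data", {}))
            entries.append((output.get("output_type"), mimetypes))
        summary.append(entries)
    return summary


# Assumed shape only: a figure pushed by the inline backend is typically a
# "display_data" output, while a rich repr of the cell's final expression
# is typically an "execute_result" output.
example = {
    "cells": [
        {"cell_type": "markdown", "source": "notes"},
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "display_data",
                    "data": {"image/png": "...", "text/plain": "..."},
                }
            ],
        },
        {
            "cell_type": "code",
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"image/png": "...", "text/plain": "..."},
                }
            ],
        },
    ]
}

summary = summarize_outputs(example)
for index, entries in enumerate(summary):
    print(index, entries)
```

If both cells list `image/png` but under different `output_type` values, the PNG data is in the file and the difference lies in how the exporter collects it. To run this on a real file, load it with `json.load` and pass the resulting dict to `summarize_outputs`.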

FWIW the seaborn docs are based on notebooks (using a custom nbconvert -> sphinx bridge) so I know at least some static export pathways handle them fine. Which again implicates Quarto in my mind.

FYI you can work around this if necessary by calling Plot.show(), which does hook into pyplot. Although then you won't get the nice HiDPI figures by default (you can still enable this through the matplotlib inline display config).

NickCH-K commented 1 year ago

Good to know! Thanks for taking a look at this. Given how you've described it, it does sound like a Quarto issue (and the Mac/Windows distinction is very curious). Thank you!