juba / pyobsplot

Observable Plot in Jupyter notebooks and Quarto documents
https://juba.github.io/pyobsplot/
MIT License

Added typst renderer to support png/pdf/jpg/svg exports #32

Closed wirhabenzeit closed 1 month ago

wirhabenzeit commented 2 months ago

This is a pull request addressing https://github.com/juba/pyobsplot/issues/23

Supported output formats are

Example

from pyobsplot import Obsplot, Plot
import polars as pl

penguins = pl.read_csv("https://github.com/juba/pyobsplot/raw/main/doc/data/penguins.csv")
op = Obsplot(renderer="typst", dpi=300, font="SF Pro Display", font_size=14, margin=4)

op({
    "color": {"legend": True},
    "marginLeft": 80,
    "marginRight": 80,
    "title": "Penguin body mass by island",
    "x": {"inset": 20},
    "grid": True,
    "marks": [
        Plot.boxX(penguins, {
            "x": "body_mass_g", "fill": "island", "y": "island", "fy": "species"
        })
    ]
})

results in typst

Writing to file

op(spec, path="file.png")

saves the output to a file

Issues

op = Obsplot(renderer="typst", dpi=300, font="SF Pro Display", font_size=14, margin=4)

over something like

op(spec, path="file.png", dpi=300, font="SF Pro Display", font_size=14, margin=4)

harrylojames commented 1 month ago

Really appreciate this pull request being put together. Worked almost flawlessly for me.

I've pulled it and have encountered a few issues - please see suggested changes.

wirhabenzeit commented 1 month ago

@harrylojames Regarding the tooltip: I think the issue is that the jsdom renderer does not display tooltips. As far as I can see, this is due to the visibility="hidden" attribute in

<g aria-label="tip">
<g fill="var(--plot-background)" stroke="currentColor" text-anchor="start" visibility="hidden" transform="translate(0,131)">
...
</g>
</g>

harrylojames commented 1 month ago

Currently this throws an error "RuntimeError: unknown format: html"

To avoid switching between renderers for the html file type, would it make sense to include a line that writes HTML files, @juba?

I assume it would just be duplicating this from the jsdom renderer?

with open(path, "w", encoding="utf-8") as f:
    f.write(str(res.data))

wirhabenzeit commented 1 month ago

@harrylojames How can I reproduce this RuntimeError?

harrylojames commented 1 month ago

@wirhabenzeit

Apologies, I get this error when the file extension is set to html. Are you able to reproduce it with the following?

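from pyobsplot import Obsplot, Plot
import polars as pl

penguins = pl.read_csv("https://github.com/juba/pyobsplot/raw/main/doc/data/penguins.csv")
# Assumes the typst renderer from the PR description
op = Obsplot(renderer="typst")

op({
    "color": {"legend": True},
    "marginLeft": 80,
    "marginRight": 80,
    "title": "Penguin body mass by island",
    "x": {"inset": 20},
    "grid": True,
    "marks": [
        Plot.boxX(penguins, {
            "x": "body_mass_g", "fill": "island", "y": "island", "fy": "species"
        }),
    ]
}, path="test.html")  # the .html extension triggers the RuntimeError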

I'm new to typst but it looks like compile doesn't recognise html as a format.

wirhabenzeit commented 1 month ago

@harrylojames The only allowed output formats are svg, pdf, png according to this:

$ typst compile -h
Compiles an input file into a supported output format

Usage: typst compile [OPTIONS] <INPUT> [SOURCE_DATE_EPOCH] [OUTPUT]

Arguments:
  <INPUT>              Path to input Typst file, use `-` to read input from stdin
  [SOURCE_DATE_EPOCH]  The document's creation date formatted as a UNIX timestamp [env: SOURCE_DATE_EPOCH=]
  [OUTPUT]             Path to output file (PDF, PNG, or SVG), use `-` to write output to stdout

I changed the pull request to raise a ValueError for other extensions.
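
A check of this kind can be sketched as follows. This is only an illustration, assuming the output format is derived from the file extension; `check_output_extension` and `TYPST_OUTPUT_FORMATS` are hypothetical names, not the code from the pull request.

```python
from pathlib import Path

# Formats listed for the OUTPUT argument in the typst help text above.
TYPST_OUTPUT_FORMATS = {"pdf", "png", "svg"}


def check_output_extension(path: str) -> str:
    """Return the lowercase extension of `path`, or raise ValueError.

    Hypothetical helper for illustration; not the function used in the PR.
    """
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in TYPST_OUTPUT_FORMATS:
        allowed = ", ".join(sorted(TYPST_OUTPUT_FORMATS))
        raise ValueError(
            f"Unsupported output format {ext!r}; expected one of: {allowed}"
        )
    return ext
```

With such a check, a path like "test.html" fails early with a readable ValueError instead of surfacing typst's "unknown format" RuntimeError.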

juba commented 1 month ago

Many thanks for this PR, I'm starting (with quite a delay, sorry) to take a look at it.

There are some questions I'm only beginning to explore, such as whether it is better to create another renderer or to integrate this as a sort of output option for jsdom. I also wonder whether what is done by manipulating the HTML with beautifulsoup could be done with a typst template.

wirhabenzeit commented 1 month ago

@juba Maybe having output options for jsdom is a better idea since the typst renderer is really just translating the jsdom output.

Regarding the typst template: You mean something like https://typst.app/docs/tutorial/making-a-template/? If so, then I am not sure it can be done easily. The resulting typst code could surely be simplified a bit this way, e.g. by defining a command for creating the legends. But for extracting the info from the HTML, it seems more convenient to do the parsing in Python.

juba commented 1 month ago

@wirhabenzeit Yes, I was thinking about templates, and I've also seen that typst is able to load data from XML documents:

https://typst.app/docs/reference/data-loading/xml/

This data import mechanism may be too basic to be suitable, but I would like to avoid adding dependencies if it is not necessary.

wirhabenzeit commented 1 month ago

@juba Indeed, this is possible. I changed the pull request accordingly. The typst logic is now in the src/pyobsplot/static/template.typ file

juba commented 1 month ago

@wirhabenzeit This is great! I'm currently working on this, and I'll try to release a test version very soon.

Thanks a lot. As I am not familiar with typst, it would have taken me a lot of time to figure this out.

juba commented 1 month ago

I just merged a modified version of this PR. I integrated the TypstRenderer into the JsdomRenderer. A new format argument can be used either when creating the renderer object, or when generating a chart:

ot = Obsplot(renderer="jsdom", format="png")
# or
ot({...}, format="svg")

There is also a format_options argument, which is a dict with the following entries:

It is not possible to use "pdf" as format, but it can be used as file extension when using path:

ot({...}, path="/tmp/out.pdf")

I also modified the typst template so that the result is as close as possible to the default Observable Plot output.

This is still very experimental, so any feedback is welcome.

Many thanks for your contributions to this very useful feature!

wirhabenzeit commented 1 month ago

@juba great, thanks for improving the code and merging!

Some quick testing:

juba commented 1 month ago

Thanks for your feedback!

  1. Your first point should now be fixed, thanks for pointing it out.
  2. I agree with your second point. When the jsdom renderer produces an HTML figure and the requested format is SVG, pyobsplot should now use typst to convert it. I just added a warning for the user to be aware of this conversion.
  3. Regarding the fonts, I took the values from here, and so I thought that San Francisco was installed on macOS and had the necessary glyphs. I'm a bit reluctant to remove Noto Sans and Roboto because they are standard fonts on KDE Plasma and Android (I think). If I add Arial as the last font in the list, do you think that would be ok?

wirhabenzeit commented 1 month ago

@juba great!

Regarding the fonts:

The ideal solution would be to convince typst to use a fallback font just for the → symbol and not the whole label. Not sure how this can be done though

juba commented 1 month ago

You're right about Roboto and Noto Sans. The right arrow seems to be available in Noto Sans Symbols and Noto Sans Math, which are installed by default on my system, but it is not available if only Noto Sans is installed from Google Fonts.

I added Lucida Grande and Arial at the end of the default font list. Let me know if you think the result is good enough.

Many thanks for your useful feedback.

juba commented 1 month ago

If you planned to test the new output formats in the days to come, I wanted to warn you that I just introduced two breaking changes in the development version.

First, the syntax with which you could define plots with kwargs is now deprecated, so it is no longer possible to do something like:

Plot.plot(marks=[Plot.dot()], title="foo")

Second, and more importantly, I changed the plot generator API: generator objects are no longer created with a renderer="widget" or renderer="jsdom" argument, which didn't really make sense from the user's point of view. Instead, I replaced them with a format argument that can take the values "widget", "html", "svg" or "png".

In summary:

# The former
op = Obsplot(renderer="widget")
# Is replaced by
op = Obsplot(format="widget")
# And
op = Obsplot(renderer="jsdom")
# Is replaced by
op = Obsplot(format="html")
# And it is also possible to use
op = Obsplot(format="png")
op = Obsplot(format="svg")

Sorry for the breaking changes, but I believe they will make the API a bit clearer and more usable. I've updated the documentation accordingly. Any feedback is welcome!
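
For code that still stores the old renderer names, the mapping in the summary above could be captured in a small lookup. This is a hedged sketch only: `renderer_to_format` and `RENDERER_TO_FORMAT` are hypothetical names and are not part of pyobsplot.

```python
# Old `renderer` values and the `format` values that replace them,
# as listed in the summary above.
RENDERER_TO_FORMAT = {"widget": "widget", "jsdom": "html"}


def renderer_to_format(renderer: str) -> str:
    """Translate a pre-change `renderer` value to the new `format` value.

    Hypothetical migration helper; not part of pyobsplot.
    """
    try:
        return RENDERER_TO_FORMAT[renderer]
    except KeyError:
        raise ValueError(f"Unknown renderer: {renderer!r}") from None
```

A call site would then pass the result on, for example `Obsplot(format=renderer_to_format("jsdom"))`.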

wirhabenzeit commented 1 month ago

@juba I think not exposing this jsdom renderer so prominently is a good idea. So the logic is that the format="" option specifies the inline display format, and path="" specifies the export format? This does not quite hold, since with format="widget" I cannot export e.g. a pdf file.

Actually, thinking about it, wouldn't it make more sense to specify the format on a per call basis, just like the path? What I mean is that the two main methods of operation could be

Plot.plot({}, format="widget")
Plot.plot({}, path="test.pdf")

or maybe even

Plot.plot({}, format="widget", path="test.pdf") 

for displaying and/or exporting? This would be conceptually simpler than exposing these custom renderers? A small drawback is that one has to respecify the format at every call (for non-default format) but this does not seem so problematic.

juba commented 1 month ago

In fact you can pass a format argument to Obsplot(), to an instance of Obsplot and to Plot.plot():

# Default format
op = Obsplot(format="png")
# Override default format
op({}, format="svg")
# Plot.plot
Plot.plot({}, format="html")

This way, you can both specify a default format, and override it for a specific plot if needed.

You can also add a path argument to an instance of Obsplot, or to Plot.plot():

op({}, path="out.png")
Plot.plot({}, path="out.png")

The path extension takes precedence over the format:

op = Obsplot(format="svg")
op({}, path="out.svg") # => export to SVG (with a warning)
op({}, path="out.png") # => export to PNG (with a warning)

Plot.plot({}, format="png", path="out.svg") # => export to SVG (with a warning)

If the format is "widget", only a path with an "html" extension is allowed, otherwise an error is raised.

Do you think it makes sense?
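
The precedence rules described above can be summarised in a short sketch. It is an illustration under assumptions, not pyobsplot's actual implementation: `resolve_export_format` is a hypothetical helper, the set of accepted extensions is inferred from this thread, and the warning is emitted here only when the extension differs from the requested format.

```python
import warnings
from pathlib import Path
from typing import Optional

# Values accepted by `format`, as listed earlier in the thread.
DISPLAY_FORMATS = {"widget", "html", "svg", "png"}
# Extensions assumed to be valid for `path` ("pdf" is allowed only here).
EXPORT_EXTENSIONS = {"html", "svg", "png", "pdf"}


def resolve_export_format(fmt: str, path: Optional[str] = None) -> str:
    """Return the format that would be produced for `fmt` and `path`.

    Hypothetical helper; pyobsplot's real logic may differ.
    """
    if fmt not in DISPLAY_FORMATS:
        raise ValueError(f"Unknown format: {fmt!r}")
    if path is None:
        # No export requested: the display format is used as-is.
        return fmt
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in EXPORT_EXTENSIONS:
        raise ValueError(f"Unsupported file extension: {ext!r}")
    if fmt == "widget":
        # A widget can only be saved as HTML.
        if ext != "html":
            raise ValueError("With format='widget', only an .html path is allowed")
        return ext
    if ext != fmt:
        # The path extension takes precedence over the format.
        warnings.warn(f"Path extension {ext!r} overrides format {fmt!r}")
    return ext
```

For example, `resolve_export_format("png", "out.svg")` returns "svg" after a warning, while `resolve_export_format("widget", "out.svg")` raises a ValueError.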

wirhabenzeit commented 1 month ago

@juba Hmm, I tried but for me

Plot.plot({}, format="svg")

gives Plot.plot() got an unexpected keyword argument 'format'.

As for the general question, it seems a bit problematic that there is no one-to-one correspondence between format and path-extension. What I mean is that:

Without knowing the internals it seems weird that

Plot.plot({}, format="png", path="out.svg")     # => ok
Plot.plot({}, format="widget", path="out.svg")  # => error

I first thought that it would be best to completely decouple format and path: format dictates the display format, path dictates the export format. But this does not account for the html case...

juba commented 2 weeks ago

I referenced your answer to continue the discussion in #37