vega / altair

Declarative statistical visualization library for Python
https://altair-viz.github.io/
BSD 3-Clause "New" or "Revised" License
9.35k stars 794 forks source link

Make it easier for downstream libraries to *safely* contribute themes #3586

Open dangotbanned opened 2 months ago

dangotbanned commented 2 months ago

What is your suggestion?

Originally posted by @dangotbanned in discussion w/ @MarcoGorelli

As I understand, the `alt.Chart.configure_` calls are being used to avoid registering + enabling a theme - which could override a user's custom theme. These work fine in isolation, but AFAIK would have issues if a user were to layer/concat/facet the result - since `config` is only valid at the top-level. You might want to add tests to see if [these ops](https://github.com/vega/altair/blob/df14929075b45233126f4cfe579c139e0b7f0559/tests/vegalite/v5/test_api.py#L332-L361) would still be possible - https://altair-viz.github.io/user_guide/customization.html#global-config - https://altair-viz.github.io/user_guide/configuration.html#top-level-chart-configuration Using a theme would have the benefit of deferring these config settings until the `Chart` is rendered - placing them in the [top-level only](https://github.com/vega/altair/blob/df14929075b45233126f4cfe579c139e0b7f0559/altair/vegalite/v5/api.py#L1837). --- It might be worth seeing if we can come to a good solution to this as part of https://github.com/vega/altair/issues/3519 since we have already [discussed](https://github.com/pola-rs/polars/pull/17995#issuecomment-2282795126) [issues](https://github.com/pola-rs/polars/pull/17995#issuecomment-2282802508) with the theme route

Problem

A library like polars may wish to provide a default theme, but not override a user-defined or user-enabled theme.

AFAIK, the "best" solution for this right now would be to override our "default" theme. However, this would be a destructive action and wouldn't scale well to multiple 3rd-parties each doing so:

Code block https://github.com/vega/altair/blob/df14929075b45233126f4cfe579c139e0b7f0559/altair/vegalite/v5/theme.py#L56-L74

Solution(s)

We could extend ThemeRegistry to support priority levels.

Either when registering/enabling a theme a level will be set corresponding to the party.

from enum import IntEnum

class ThemePriority(IntEnum):
    USER = 1
    THIRD_PARTY = 2
    DEFAULT = 3 # alternatives: `ALTAIR`, `STANDARD`, `BUILTIN`

For backwards-compatibility, this must default to ThemePriority.USER in any signatures the argument can be passed in from. All themes defined/registered in https://github.com/vega/altair/blob/df14929075b45233126f4cfe579c139e0b7f0559/altair/vegalite/v5/theme.py will be assigned ThemePriority.DEFAULT.

The semantics of which theme should be enabled for ThemePriority.(USER|DEFAULT) are quite simple.

The highest priority (lowest-valued) enabled theme is selected:

The basic resolution implementation for ThemePriority.THIRD_PARTY would be identical to the above. Simply a way for 3rd-parties to opt-in for a way to safely be used instead of the defaults - but not over user themes.

However, I think this behavior itself should be pluggable - to support alternative resolution semantics like:

Related

Note

Originally posted by @dangotbanned in https://github.com/vega/altair/issues/3519#issuecomment-2337790765

Splitting this into a separate issue for visibility

Have you considered any alternative solutions?

binste commented 1 month ago

The priority system is an interesting idea! I think it gets tricky once, as you mentioned, there are multiple third-party libraries. Also, if for example I'd import polars, I don't want as a side-effect to have all my other charts change their theme.

How about a combination of:

dangotbanned commented 1 month ago

unrelated to this, but it would help with making Altair thread-safe. Right now, you can get into race conditions (in multi-threading) if you're in a with alt.themes.enable(..) block as it modifies global state. I'll write down some thoughts on this in a separate issue soon

@binste thanks for the thoughts!

Will follow up with a proper (on topic) response, but this made me realise the link I gave didn't mention thread safety 🤦‍♂️ but it is there at the top of the page https://docs.python.org/3/library/queue.html#module-queue

I was thinking about this in relation to https://github.com/vega/altair/issues/3416#issue-2283326592

dangotbanned commented 1 month ago

The priority system is an interesting idea! I think it gets tricky once, as you mentioned, there are multiple third-party libraries. Also, if for example I'd import polars, I don't want as a side-effect to have all my other charts change their theme.

Definitely agree @binste, this is the exact situation I'm hoping we can avoid, when I said

However, I think this behavior itself should be pluggable - to support alternative resolution semantics like ...

  • User wants to opt-out of third-party contributions?

So maybe whatever the solution is, we make sure it is opt-in? Third-parties using this feature can always mention this step in their docs; which is simple enough


The ability to attach theme information directly to a chart.

I'm interested!

This configuration would need to be pushed to the top in layered and multi-view charts (see this comment for why)

This was my final link in the description! 😉

usermeta seems like the obvious place to me since all charts have it and it is ignored.

usermeta: Optional metadata that will be passed to Vega. This object is completely ignored by Vega and Vega-Lite and can be used for custom metadata.

I imagine storing the registered name/callback here - rather than the config (dict) itself - would make this a simpler operation. We could just defer a lot of the work/decisions until to_dict?

binste commented 1 month ago

I think we're on a good track here with adding the information to a chart, implementing a prioritization system, and making the final decision on which theme is used in to_dict. Proposal:

We could use usermeta instead of a new attribute and then remove again this information in to_dict but having a new attribute for this seems more explicit to me. Thoughts?

dangotbanned commented 1 month ago

Appreciate you taking the time to put this all together @binste

@binste

I think we're on a good track here with adding the information to a chart, implementing a prioritization system, and making the final decision on which theme is used in to_dict. Proposal:

  • A new theme attribute on a Chart which takes either:
    • registered name
    • callback function
    • a dictionary looking something like this:

@dangotbanned

I imagine storing the registered name/callback here - rather than the config (dict) itself - would make this a simpler operation.

I feel I've muddied things here with my spitballing.

This all becomes much simpler if we only used registered name to get the callable from the registry when needed; without adding a new place that a user can declare a theme.


So if we just set aside the Chart.(theme|metadata) for a moment ...

I was thinking more along the lines of setting this via with alt.theme.enable(): ...

I did link comment, but IMO the current behavior of this context manager is surprising.

Both uses here are effectively no-ops at the moment:

No theme enabled For the reasons described in https://github.com/pola-rs/polars/pull/17995#issuecomment-2282802508 ```py with alt.themes.enable('ggplot2'): return ( self.chart.mark_point() .encode(*args, **{**encodings, **kwargs}) .interactive() ) ``` ```py with alt.themes.enable('ggplot2'): chart = self.chart.mark_point().encode(*args, **{**encodings, **kwargs}).interactive() return chart ```

To me, the context manager seems like the more natural place for this - since we're setting global config.

ThemePriority def

Just adding here to save the scroll back to description ```py from enum import IntEnum class ThemePriority(IntEnum): USER = 1 THIRD_PARTY = 2 DEFAULT = 3 # alternatives: `ALTAIR`, `STANDARD`, `BUILTIN` ```

How polars could use

```py class DataFramePlot: """DataFrame.plot namespace.""" def __init__(self, df: DataFrame) -> None: import altair as alt with alt.themes.enable("polars", ThemePriority.THIRD_PARTY): self._chart = alt.Chart(df) def point( self, x: X | None = None, y: Y | None = None, color: Color | None = None, size: Size | None = None, /, **kwargs: Unpack[EncodeKwds], ) -> alt.Chart: ... return ( self.chart.mark_point() .encode(*args, **{**encodings, **kwargs}) .interactive() ) ```

The behavior when a theme is enabled, would be the same as what you described for setting via the constructor; but this would apply for all charts defined within the with:... block.


class ThemeInfo(TypedDict):
   theme: str | Callback[[], dict]
   priority: Literal[1, 2, 3]
  • If Chart.theme is a string or callback, it's always used.

  • If it's a ThemeInfo, the prioritization happens as you described it further above. Themes enabled with alt.themes.enable() always have prio 1, so a library such as Polars would use prio 2. default theme has prio 3 except if it was enabled specifically via alt.themes.enable. -> theme="my_theme" and theme={"theme": "my_theme", "priority": 1} should be equivalent

I'm not sure if we'd need all of this if we just set during alt.themes.enable() - maybe we'd handle some of the logic here first?

Screenshot

![image](https://github.com/user-attachments/assets/7c7fa122-669b-4a28-81fb-2157e8fc1f0a)


  • This attribute needs to be consolidated and lifted to the top level whenever a layered or multi-view chart is created. The existing implementations for other attributes of LayerChart, HConcatChart, etc. can serve as templates.

  • Once to_dict is called, it assigns the theme based on the prioritization.

Sounds good to me 👍

We could use usermeta instead of a new attribute and then remove again this information in to_dict but having a new attribute for this seems more explicit to me. Thoughts?

A new attribute would definitely be more explicit. The downside though (that I think you've hinted at?) is that we'd need to remove that attribute from the spec before validation.

If I've understood usermeta correctly - we could use a TypedDict here including the theme name and that could be rendered into the spec?

Not sure how many people are inspecting the resulting vega-lite, but having the name of the theme that was used seems helpful to persist to me


I'm happy with where this is going, seems like there are a few different ideas for us to test out!

Once #3536 is merged (to avoid conflicts) I'll open a draft PR so we can experiment further

dangotbanned commented 1 month ago

So I've just discovered sourcegraph as my usage of PEP 728 has shown up in https://discuss.python.org/t/pep-728-typeddict-with-typed-extra-items/45443/106

Very new to this but thought the usage of alt.themes here would be interesting to share: