Open mattijn opened 1 month ago
Thanks for opening this @mattijn
The reason I wanted an issue was to investigate why the definitions marked as # BUG
were not working.
I'm now seeing that this is the distinction between Expr
(str
) and ExprRef
({"expr": str}
).
I think handling this implicitly could be error prone.
A simple alternative could be adding Expression.to_expr
:
altair/expr/core.py
```diff diff --git a/altair/expr/core.py b/altair/expr/core.py index e7872ada..095385b4 100644 --- a/altair/expr/core.py +++ b/altair/expr/core.py @@ -168,21 +168,24 @@ class OperatorMixin: class Expression(OperatorMixin, SchemaBase): """ Expression. Base object for enabling build-up of Javascript expressions using a Python syntax. Calling ``repr(obj)`` will return a Javascript representation of the object and the operations it encodes. """ _schema = {"type": "string"} def to_dict(self, *args, **kwargs): return repr(self) + def to_expr(self): + return {"expr": repr(self)} + def __setattr__(self, attr, val) -> None: # We don't need the setattr magic defined in SchemaBase return object.__setattr__(self, attr, val) # item access def __getitem__(self, val): return GetItemExpression(self, val) ```
This would then allow the following syntax:
import altair as alt
from vega_datasets import data
source = data.barley()
base = alt.Chart(source).encode(
x=alt.X("sum(yield):Q").stack("zero"),
y=alt.Y("site:O").sort("-x"),
text=alt.Text("sum(yield):Q", format=".0f"),
)
# NOTE: Original `str` expressions
# tooltip = alt.expr("luminance(scale('color', datum.sum_yield))")
# color = alt.expr("luminance(scale('color', datum.sum_yield)) > 0.5 ? 'black' : 'white'")
luminance = alt.expr.luminance(alt.expr.scale("color", alt.datum.sum_yield))
# NOTE: More ergonomic? # <-------------------------------------------------
tooltip = luminance.to_expr()
color = alt.expr.if_(luminance > 0.5, "black", "white").to_expr()
bars = base.mark_bar(tooltip=tooltip).encode(color="sum(yield):Q")
text = base.mark_text(align="right", dx=-3, color=color)
bars + text
It would only be needed as the last operation, so it makes sense to be appearing as the final method call.
We could instead use (expr|ExprRef)
, this was just the quick and easy solution I saw
We have had .ref()
, but that was not necessary anymore at one moment, probably because we solved it implicitly. I think there is also .expr
. I used it once in this example. I wish we are clever enough to know when it should be an expression as string or an expression as reference. It's near impossible to know what should be used as an Altair user without using debug tools such as the Vega editor.
We have had
.ref()
, but that was not necessary anymore at one moment, probably because we solved it implicitly. I think there is also.expr
. I used it once in this example. I wish we are clever enough to know when it should be an expression as string or an expression as reference. It's near impossible to know what should be used as an Altair user without using debug tools such as the Vega editor.
Do you mean Parameter.ref()
@mattijn?
My suggestion was for Expression
(and subclasses), but not OperatorMixin
(which Parameter
derives)
Ah yeah, that was for parameter
. To speak for myself, I was happy that we could deprecate that one, since I was not able to explain when and when not to use it. The same feeling I have when we introduce an expression.to_expr()
..🤔
Ah yeah, that was for
parameter
. To speak for myself, I was happy that we could deprecate that one, since I was not able to explain when and when not to use it. The same feeling I have when we introduce anexpression.to_expr()
..🤔
@mattijn Well that part would be solved by communicating this in annotations 😃
It would also be quite similar to the recent changes with conditions in #3567
api.py
https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/api.py#L628-L706
A lot of this is about identifying either a dict
with a key "condition"
or an object with an attribute object.condition
.
The same idea would apply for this case, but switching to "expr"
or object.expr
.
Since we know precisely what every schema will permit; I could look into how we can represent this scenario as part of SchemaInfo.to_type_repr()
?
It would be quite an achievement if you manage, but you might be right that it is actually possible.
@mattijn I'll keep this comment updated with all the context I can dig up.
tools.schemapi.utils.title_to_type_reprs
Had another look and I see this **TODO** ([line 555](https://github.com/vega/altair/blame/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/tools/schemapi/utils.py#L555)) I added in (#3536) has some relation https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/tools/schemapi/utils.py#L552-L570
vega-lite-schema.json
https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/schema/vega-lite-schema.json#L8808-L8823 ```json { "definitions": { "Expr": { "type": "string" }, "ExprRef": { "additionalProperties": false, "properties": { "expr": { "description": "Vega expression (which can refer to Vega-Lite parameters).", "type": "string" } }, "required": [ "expr" ], "type": "object" } } } ```
altair.expr.core.py
https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/expr/core.py#L31-L240
v5.api.param
Highlighting this since it is the last step for the *preferred* functional wrappers https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/api.py#L1316-L1321 `core.VariableParameter.expr` **exists** https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/api.py#L1381-L1387 `core.TopLevelSelectionParameter.expr` **doesn't** exist https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/api.py#L1390-L1392 `core.SelectionParameter.expr` **doesn't** exist https://github.com/vega/altair/blob/cabf1e6428bab107a26adc6c51993d3b2c5bf6f0/altair/vegalite/v5/api.py#L1395-L1397 # Question Does this mean a defining trait of a `VariableParameter` is that it wraps an expression?
Parameter
as the annotation for ExprRef
.Expr
in generated code
str
Expression
(and subclasses) represent Expr
expr
return types (as of https://github.com/vega/altair/pull/3600#discussion_r1779468973)
alt.expr(...)
-> ExprRef
alt.expr.method(...)
-> Expression
alt.expr.property
-> Expression
@mattijn would you be okay if we renamed this issue?
Modifying the code used in #3614 example would be a benefit. However, I'm thinking more along the lines of https://github.com/vega/altair/issues/3563#issuecomment-2333887397 to provide a more ergonomic alternative.
My proposal of Expression.to_expr()
is an alternative, but based on https://github.com/vega/altair/issues/3616#issuecomment-2380614092 I'd like to explore if we could benefit from a larger change
Sure, issue can be renamed!
I've started exploring some of https://github.com/vega/altair/issues/3616#issuecomment-2380614092
What I found really surprised me.
I made this small change, which simply adds Expr
as an annotation when used in a schema.
This is to try to answer @mattijn's point in https://github.com/vega/altair/issues/3616#issuecomment-2379947174 about knowing when to use a string
tools/schemapi/utils.py
Running schema generation resulted in only 3 changes?
altair/vegalite/v5/schema/core.py
```diff diff --git a/altair/vegalite/v5/schema/core.py b/altair/vegalite/v5/schema/core.py index 623b27ee..1ff7c660 100644 --- a/altair/vegalite/v5/schema/core.py +++ b/altair/vegalite/v5/schema/core.py @@ -21921,7 +21921,9 @@ class DerivedStream(Stream): between: Optional[Sequence[SchemaBase | Map]] = Undefined, consume: Optional[bool] = Undefined, debounce: Optional[float] = Undefined, - filter: Optional[str | SchemaBase | Sequence[str | SchemaBase]] = Undefined, + filter: Optional[ + str | Expr | SchemaBase | Sequence[str | Expr | SchemaBase] + ] = Undefined, markname: Optional[str] = Undefined, marktype: Optional[SchemaBase | MarkType_T] = Undefined, throttle: Optional[float] = Undefined, @@ -21981,7 +21983,9 @@ class MergedStream(Stream): between: Optional[Sequence[SchemaBase | Map]] = Undefined, consume: Optional[bool] = Undefined, debounce: Optional[float] = Undefined, - filter: Optional[str | SchemaBase | Sequence[str | SchemaBase]] = Undefined, + filter: Optional[ + str | Expr | SchemaBase | Sequence[str | Expr | SchemaBase] + ] = Undefined, markname: Optional[str] = Undefined, marktype: Optional[SchemaBase | MarkType_T] = Undefined, throttle: Optional[float] = Undefined, @@ -26715,7 +26719,7 @@ class VariableParameter(TopLevelParameter): self, name: Optional[str | SchemaBase] = Undefined, bind: Optional[SchemaBase | Map] = Undefined, - expr: Optional[str | SchemaBase] = Undefined, + expr: Optional[str | Expr | SchemaBase] = Undefined, react: Optional[bool] = Undefined, value: Optional[Any] = Undefined, **kwds, ```
Making the same change, but for ExprRef
results in 5335 changes
I can't help but think that Expression
should be abstracting ExprRef
instead of Expr
.
This would mean these methods no longer both produce str
import altair as alt
from altair.expr.core import FunctionExpression
example: FunctionExpression = alt.expr.random()
>>> example.to_dict() == example._to_expr()
True
Here OperatorMixin.to_dict()
returns a str
, but could return {"expr": "..."}
if it subclassed ExprRef
instead.
It would still make sense for OperatorMixin._to_expr()
to return a str
- and that maps more closely to Expr
.
We'd then have the same type returned from all of:
expr(...)
expr.method(...)
expr.property
I think this is conceptually quite clean and easy to explain.
expr
is a factory to produce expr
objects (expressions).
Since that would account for 5335 annotations, we could use expr
in them directly.
That would also solve minor typing issue in expr.__new__
What is your suggestion?
As requested by @dangotbanned here: https://github.com/vega/altair/pull/3614#pullrequestreview-2333267989. https://github.com/vega/altair/pull/3614 uses an expression as a
str
with plain Vega Expression syntax. It would be nice to use the equivalent python syntax instead.Have you considered any alternative solutions?
No response