vega / altair

Declarative statistical visualization library for Python
https://altair-viz.github.io/
BSD 3-Clause "New" or "Revised" License

`selection_point` fails schema validation #3581

Closed. jpn-- closed this issue 1 week ago

jpn-- commented 1 week ago

What happened?

I am trying to create an interactive figure that filters on data by iteration, but I am getting a SchemaValidationError:

SchemaValidationError: '2000' is an invalid value for `iteration`. Valid values are of type 'array'.

If I copy the published example from here and run it, it works fine. But if I simply rename the data column "year" to "iteration", I can reproduce the failure.

import altair as alt
from vega_datasets import data
import pandas as pd

# read to pandas DataFrame 
df = pd.read_json(data.population.url)

# change column name "year" to "iteration"
df = df.rename(columns={"year": "iteration"})

# code below otherwise identical to example except changing "year" to "iteration"
select_year = alt.selection_point(
    name="Year",
    fields=["iteration"],
    bind=alt.binding_range(min=1900, max=2000, step=10, name="Year"),
    value={"iteration": 2000},
)

alt.Chart(df).mark_bar().encode(
    alt.X("sex:N").title('').axis(labels=False, ticks=False),
    alt.Y("people:Q").scale(domain=(0, 12000000)).title("Population"),
    alt.Color("sex:N")
        .scale(domain=("Male", "Female"), range=["steelblue", "salmon"])
        .title("Sex"),
    alt.Column("age:O").title("Age")
).properties(
    width=20,
    title="U.S. Population by Age and Sex"
).add_params(
    select_year
).transform_calculate(
    "sex", alt.expr.if_(alt.datum.sex == 1, "Male", "Female")
).transform_filter(
    select_year
).configure_facet(
    spacing=8
)

What would you like to happen instead?

The selection should work and not depend on the variable name literally being "year".

Which version of Altair are you using?

5.4.1

dangotbanned commented 1 week ago

# change column name "year" to "iteration"
# df.rename(columns={"year": "iteration"})
source = df.rename(columns={"year": "iteration"})

@jpn-- in your code block, alt.Chart references source - but that isn't the same as your renamed df.

Does this fix the validation error?

jpn-- commented 1 week ago

@dangotbanned Apologies, I had a bad copy-and-paste job into the issue. I've corrected the code above, but the error still persists. In fact, the error occurs even before the line with my (now fixed) bug. I can reproduce the SchemaValidationError with just this:

import altair as alt

select_year = alt.selection_point(
    name="Year",
    fields=["iteration"],
    bind=alt.binding_range(min=1900, max=2000, step=10, name="Year"),
    value={"iteration": 2000},
)

mattijn commented 1 week ago

Thanks for raising the issue! I can reproduce the error you are facing. The correct syntax that you should use in this situation is the following:

select_year = alt.selection_point(
    name="Year",
    fields=["iteration"],
    bind=alt.binding_range(min=1900, max=2000, step=10, name="Year"),
    value=2000
)

So instead of value={"iteration": 2000}, you should use just value=2000. This is a change compared to the previous init argument. I'm not sure whether this change was properly documented in the release notes when it was introduced.

mattijn commented 1 week ago

Ah, I see that you actually refer to this example in the documentation: https://altair-viz.github.io/gallery/us_population_over_time.html. That should be updated!

jpn-- commented 1 week ago

Thanks @mattijn. Now knowing the correct syntax, I've opened a PR to fix the docs in the places where the outdated syntax was used.