vega / altair

Declarative statistical visualization library for Python
https://altair-viz.github.io/
BSD 3-Clause "New" or "Revised" License

docs: Adds example Calculate Residuals #3625

Closed by dangotbanned 1 month ago

dangotbanned commented 1 month ago

Recreation of vega-lite example https://vega.github.io/vega-lite/examples/joinaggregate_residual_graph.html

I noticed the last example on Vega Theme Test didn't have an altair equivalent. This PR will also bring us one step closer to https://github.com/vega/altair/issues/3519#issuecomment-2359179155

Really like how concise the methods syntax version ended up

[Image: calculate_residuals]

import altair as alt
from vega_datasets import data

imdb_rating = alt.datum["IMDB_Rating"]
source = data.movies.url

chart = (
    alt.Chart(source)
    .mark_point()
    # "!= None" is deliberate: it builds a Vega expression comparing against null
    .transform_filter(imdb_rating != None)  # noqa: E711
    .transform_filter(
        alt.FieldRangePredicate("Release_Date", [None, 2019], timeUnit="year")
    )
    .transform_joinaggregate(Average_Rating="mean(IMDB_Rating)")
    .transform_calculate(Rating_Delta=imdb_rating - alt.datum.Average_Rating)
    .encode(
        x=alt.X("Release_Date:T").title("Release Date"),
        y=alt.Y("Rating_Delta:Q").title("Rating Delta"),
        color=alt.Color("Rating_Delta:Q").title("Rating Delta").scale(domainMid=0),
    )
)
chart
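
For anyone less familiar with the transforms, here is a rough plain-Python sketch of what the filter, joinaggregate, and calculate steps above compute. The rows are made-up values, and `calculate_residuals`, `Release_Year`, and `max_year` are hypothetical names used only for illustration; Altair itself hands this work to Vega-Lite rather than running Python like this.

```python
from statistics import mean

# Hypothetical rows standing in for the movies dataset (made-up values).
rows = [
    {"Title": "A", "IMDB_Rating": 8.0, "Release_Year": 1999},
    {"Title": "B", "IMDB_Rating": 6.0, "Release_Year": 2005},
    {"Title": "C", "IMDB_Rating": None, "Release_Year": 2010},
    {"Title": "D", "IMDB_Rating": 7.0, "Release_Year": 2030},
]


def calculate_residuals(rows, max_year=2019):
    # Like the two transform_filter steps: drop rows with a missing
    # rating and rows released after max_year.
    kept = [
        row
        for row in rows
        if row["IMDB_Rating"] is not None and row["Release_Year"] <= max_year
    ]
    # Like transform_joinaggregate: one mean over the kept rows,
    # attached to every row instead of collapsing them.
    average = mean(row["IMDB_Rating"] for row in kept)
    # Like transform_calculate: per-row difference from that mean.
    return [
        {
            **row,
            "Average_Rating": average,
            "Rating_Delta": row["IMDB_Rating"] - average,
        }
        for row in kept
    ]


residuals = calculate_residuals(rows)
```

The point of the joinaggregate step is that every surviving row keeps its own fields and gains the shared average, which is what lets the calculate step work row by row.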

Note

I wasn't able to use the URL from vega_datasets.data.movies.url for this, and I'm not sure exactly why.

Edit

Fixed in docs: Use vega_datasets instead of url

mattijn commented 1 month ago

Thanks! This will become even prettier once https://github.com/vega/altair/pull/3505 is in.

mattijn commented 1 month ago

oh btw, you have to use underscores instead of spaces in the field names when using vega_datasets:

import altair as alt
from vega_datasets import data

imdb_rating = alt.datum["IMDB_Rating"]

chart = (
    alt.Chart(data.movies.url)
    .mark_point()
    .transform_filter(imdb_rating != None)
    .transform_filter(
        alt.FieldRangePredicate("Release_Date", [None, 2019], timeUnit="year")
    )
    .transform_joinaggregate(AverageRating="mean(IMDB_Rating)")
    .transform_calculate(RatingDelta=imdb_rating - alt.datum.AverageRating)
    .encode(
        x="Release_Date:T",
        y=alt.Y("RatingDelta:Q").title("Rating Delta"),
        color=alt.Color("RatingDelta:Q").title("Rating Delta").scale(domainMid=0),
    )
)
chart

See https://github.com/vega/altair/issues/2213 and https://github.com/vega/altair/pull/2310
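
To illustrate the naming mismatch, here is a minimal sketch that rewrites field names containing spaces into the underscore form used in the code above. `underscore_fields` is a hypothetical helper, not part of altair or vega_datasets, and the sample record is made up.

```python
def underscore_fields(record):
    """Return a copy of record with spaces in its keys replaced by underscores."""
    return {key.replace(" ", "_"): value for key, value in record.items()}


# Made-up record using space-separated field names.
spaced = {"Title": "Example", "IMDB Rating": 7.5, "Release Date": "Jun 12 1998"}
normalized = underscore_fields(spaced)
```

After this, `normalized["IMDB_Rating"]` resolves, while the same lookup on `spaced` would raise a `KeyError`. A field reference that matches no column is the kind of mismatch that can leave a chart empty without an obvious error.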

dangotbanned commented 1 month ago

> oh btw, you have to use underscores instead of spaces when using vega_datasets:

🤦 Thanks @mattijn, I'll try that out tomorrow. Feel free to edit this if you can confirm that works before I get a chance to