vega / altair

Declarative statistical visualization library for Python
https://altair-viz.github.io/
BSD 3-Clause "New" or "Revised" License

Reformatting ordinal date value #1951

Closed chankrista closed 4 years ago

chankrista commented 4 years ago

I am trying to reformat the axis for an ordinal date value to show month year as follows:

alt.Chart(by_district).mark_rect().encode(
    alt.X('month_year:O', title='Month', axis=alt.Axis(format='%b %Y')),
    alt.Y('DISTRICT', title=['Police District'], axis=alt.Axis(grid=False),
          sort=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
                '13', '14', '15', '16', '17', '18', '19', '20', '21', '22',
                '23', '24', '25', '31', '41', '51']),
    alt.Color('isr_count', title=['Total ISRs per Month'])
).properties(
    title={
        "text": ["Monthly ISRs by Police District 2016-2018"],
        "subtitle": ["Total ISRs per month vary across police districts and over time.",
                     "Source: Chicago Police Department Investigatory Stop Reports*"]
    }
)

The above code does not output anything. When I remove the axis argument in alt.X as follows, I do get output as shown in the image, but my dates are not formatted as I want them to be (Jan 2016, Feb 2016, etc.). Why might this be happening?

alt.Chart(by_district).mark_rect().encode(
    alt.X('month_year:O', title='Month'),
    alt.Y('DISTRICT', title=['Police District'], axis=alt.Axis(grid=False),
          sort=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
                '13', '14', '15', '16', '17', '18', '19', '20', '21', '22',
                '23', '24', '25', '31', '41', '51']),
    alt.Color('isr_count', title=['Total ISRs per Month'])
).properties(
    title={
        "text": ["Monthly ISRs by Police District 2016-2018"],
        "subtitle": ["Total ISRs per month vary across police districts and over time.",
                     "Source: Chicago Police Department Investigatory Stop Reports*"]
    }
)

image

jakevdp commented 4 years ago

Try using a time unit instead of an axis format string; e.g.

alt.X('yearmonth(month_year):O', title='Month')

I think the format string didn't work because you're specifying that your data is of ordinal type, not temporal type.

chankrista commented 4 years ago

Right, using temporal does make it work, but how can I prevent Altair from treating the variable as continuous rather than categorical in this case? When I use temporal, I get the following output. Thanks!

image

jakevdp commented 4 years ago

If you use a timeUnit with an ordinal type as I suggested in https://github.com/altair-viz/altair/issues/1951#issuecomment-583104870, it should give you an ordinal axis with proper time formatting.

chankrista commented 4 years ago

I believe this is what I am doing in the code below (unless I am misunderstanding), but for some reason the output I get is completely empty.

by_district['month_year'] = by_district.month_year.astype('datetime64')
alt.Chart(by_district).mark_rect().encode(
    alt.X('month_year:O', title='Month', axis=alt.Axis(format="%b %Y")),
    alt.Y('DISTRICT', title='Police District'),
    alt.Color('isr_count', title='Total ISRs per Month')
).properties(
    title={
        "text": ["Monthly ISRs by Police District 2016-2018"],
        "subtitle": ["Total ISRs per month vary across police districts and over time.",
                     "Source: Chicago Police Department Investigatory Stop Reports*"]
    }
)

image

jakevdp commented 4 years ago

Temporal axis formats do not work with non-temporal values. You are specifying an ordinal encoding with a temporal axis format, which is why it does not work.

What you should do is ensure that your ordinal data is treated as temporal by specifying a time unit with your ordinal encoding.

More specifically, change your x encoding from

alt.X('month_year:O', title='Month', axis=alt.Axis(format="%b %Y"))

to

alt.X('yearmonth(month_year):O', title='Month')

I've left out the format in the latter because yearmonth time units have a default format similar to what you've specified.

If this does not work, please provide a complete example that reproduces the error, including an example dataset (otherwise I'm unable to run your code, and am left guessing).

chankrista commented 4 years ago

Got it, thanks for the speedy assistance!