chihacknight / chn-ghost-buses

"Ghost buses" analysis project through Chi Hack Night
https://github.com/chihacknight/breakout-groups/issues/217
MIT License

[Data] Automatically check for CTA-observed holidays #38

Open lauriemerrell opened 1 year ago

lauriemerrell commented 1 year ago

Spinout from: https://github.com/chihacknight/chn-ghost-buses/pull/37#discussion_r1003846071

In compare_scheduled_and_rt.py we hard-code a list of holidays in a few places (e.g., https://github.com/chihacknight/chn-ghost-buses/blob/main/data_analysis/compare_scheduled_and_rt.py#L99); ideally, that would be handled automatically.

There is a holidays library in Python that could help us: https://github.com/dr-prodigy/python-holidays. However, we do not want to check for generic US (or even Chicago / Cook County) holidays; we only want to check for the specific holidays on which the CTA runs Sunday service.

At time of posting, that is:

Our services operate on a Sunday schedule on New Year’s Day, Memorial Day, July 4th (Independence Day), Labor Day, Thanksgiving Day and Christmas Day.
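
Because the list above is short and each holiday follows a fixed rule, one option is to compute the six dates for any year directly. The following is a minimal, dependency-free sketch that uses only the standard library rather than python-holidays. The function names (`nth_weekday`, `last_weekday`, `cta_sunday_schedule_holidays`) are hypothetical and not part of the repo. It also assumes the Sunday schedule applies on the calendar date itself; the CTA statement does not say what happens when a holiday falls on a weekend, so observed dates are not handled.

```python
import datetime as dt


def nth_weekday(year, month, weekday, n):
    """Return the n-th occurrence of `weekday` (Monday=0) in the given month."""
    first = dt.date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + dt.timedelta(days=offset + 7 * (n - 1))


def last_weekday(year, month, weekday):
    """Return the last occurrence of `weekday` (Monday=0) in the given month."""
    if month == 12:
        next_first = dt.date(year + 1, 1, 1)
    else:
        next_first = dt.date(year, month + 1, 1)
    last = next_first - dt.timedelta(days=1)
    offset = (last.weekday() - weekday) % 7
    return last - dt.timedelta(days=offset)


def cta_sunday_schedule_holidays(year):
    """Map date -> name for the six holidays listed in the CTA statement.

    Assumption: only the actual calendar date is returned, with no
    "observed" shift when a holiday lands on a Saturday or Sunday.
    """
    return {
        dt.date(year, 1, 1): "New Year's Day",
        last_weekday(year, 5, 0): "Memorial Day",          # last Monday of May
        dt.date(year, 7, 4): "Independence Day",
        nth_weekday(year, 9, 0, 1): "Labor Day",           # first Monday of September
        nth_weekday(year, 11, 3, 4): "Thanksgiving Day",   # fourth Thursday of November
        dt.date(year, 12, 25): "Christmas Day",
    }


if __name__ == "__main__":
    for day, name in sorted(cta_sunday_schedule_holidays(2022).items()):
        print(day.isoformat(), name)
```

A lookup such as `some_date in cta_sunday_schedule_holidays(some_date.year)` could then stand in for a hard-coded list, though the CTA's list would still need to be rechecked if it ever changes.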

adrianleh commented 1 year ago

Unfortunately, I was unable to set up your project to run locally due to the AWS usage, but I wrote up a little script that parses the holidays out of the CTA site and looks them up with the holidays package you suggested. It currently assumes that the holiday names on the CTA site match the names the package uses, but it seems to do the job right now. One could also add fuzzy name matching, but that doesn't seem necessary as is. Let me know if this is of help!

https://gist.github.com/adrianleh/a5c532fd693cdf19791f5a0382405e6f
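
For readers who cannot open the gist, here is an independent sketch of the general idea: pulling holiday names out of the CTA sentence and matching them approximately against a list of known names. It is not the gist's code. The helper names and the `KNOWN_NAMES` list are hypothetical, the sentence is hard-coded rather than fetched from the CTA site, and the names a calendar library actually uses may differ from those assumed here.

```python
import difflib
import re

# The sentence quoted earlier in this thread (hard-coded, not fetched).
CTA_SENTENCE = (
    "Our services operate on a Sunday schedule on New Year\u2019s Day, "
    "Memorial Day, July 4th (Independence Day), Labor Day, "
    "Thanksgiving Day and Christmas Day."
)

# Assumed canonical names; a real calendar library may spell these differently.
KNOWN_NAMES = [
    "New Year's Day",
    "Memorial Day",
    "Independence Day",
    "Labor Day",
    "Thanksgiving",
    "Christmas Day",
]


def extract_holiday_names(sentence):
    """Split the list that follows 'schedule on' into individual names."""
    _, _, tail = sentence.partition("schedule on ")
    tail = tail.rstrip(". ")
    names = []
    for part in re.split(r",\s*|\s+and\s+", tail):
        # Prefer a parenthesised alias, e.g. "July 4th (Independence Day)".
        alias = re.search(r"\(([^)]+)\)", part)
        name = alias.group(1) if alias else part
        # Normalise the curly apostrophe so names compare cleanly.
        names.append(name.replace("\u2019", "'").strip())
    return names


def match_known_name(name, known=KNOWN_NAMES):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


if __name__ == "__main__":
    for raw in extract_holiday_names(CTA_SENTENCE):
        print(raw, "->", match_known_name(raw))
```

The approximate match only matters when the two sources spell a name differently (here, "Thanksgiving Day" versus the assumed "Thanksgiving"); exact names pass through unchanged.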

lauriemerrell commented 1 year ago

Thank you @adrianleh! We are on a bit of a hiatus this week for the holiday, but I will take a look soon!