rogers1000 / cyclingchaos

Cycling Data Package

Paralympics Data Calendar Addition #19

Closed rogers1000 closed 6 months ago

rogers1000 commented 8 months ago

Paralympics site looks really easy to ingest:

e.g. https://www.paralympic.org/london-2012/results/cycling

The URL is built as: base website (https://www.paralympic.org/) + Paralympics name (e.g. london-2012) + /results/cycling/ + race_event
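A minimal sketch of that URL pattern in Python. The function name `build_results_url` and the `some-event` slug are hypothetical; only the base URL and the `london-2012` results path come from the example above, and the exact form of the `race_event` part is an assumption.

```python
BASE_URL = "https://www.paralympic.org/"


def build_results_url(games, race_event=""):
    """Join the base site, the Games slug, the fixed path and an optional event slug."""
    # The example URL uses a lowercase slug, so normalise the Games name.
    url = f"{BASE_URL}{games.lower()}/results/cycling"
    if race_event:
        # Assumption: the event slug is appended as one more path segment.
        url += f"/{race_event}"
    return url


print(build_results_url("London-2012"))
```

Looping this over a list of Games slugs would give one results page per Paralympics to ingest.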

rogers1000 commented 8 months ago

Did more scoping about what data is available.

Need to think about how it is going to link with first_cycling data.

The Paralympics website doesn't really have unique IDs in the way first_cycling does. The Rider ID would be the rider name, and the Event ID would be event + event_name.

Category = Track
race_nationality = 'Country ID'
UCI_Race_Classification = Paralympics Classification
Stage_Number = Qualification Round
Stage Category = Event Type
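The ID workaround and field mapping from this comment could be sketched roughly as below. This assumes each scraped result is a plain dict; the source keys `rider_name`, `event` and `event_name`, the output keys `rider_id` and `event_id`, and the helper names are all hypothetical, and the mapping is read as calendar column = scraped column.

```python
# Calendar column -> scraped column (scraped column names are assumptions).
COLUMN_MAP = {
    "race_nationality": "Country ID",
    "UCI_Race_Classification": "Paralympics Classification",
    "Stage_Number": "Qualification Round",
    "Stage Category": "Event Type",
}


def make_key(*parts):
    """Build a surrogate ID by lowercasing each part and joining the pieces."""
    return "_".join("-".join(part.lower().split()) for part in parts)


def to_calendar_row(scraped):
    """Rename scraped fields to calendar fields and attach surrogate IDs."""
    row = {target: scraped.get(source) for target, source in COLUMN_MAP.items()}
    row["Category"] = "Track"  # constant taken from the mapping above
    row["rider_id"] = make_key(scraped["rider_name"])
    row["event_id"] = make_key(scraped["event"], scraped["event_name"])
    return row
```

Name-based keys like these will collide if two riders share a name, so they would likely need checking before joining to first_cycling data.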

rogers1000 commented 7 months ago

Transformation a bit more fun than expected.

No dates prior to 2012. Dates since 2012 are not in the same column each time.
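One possible way to cope with the date moving between columns is to scan every cell in a row and keep the first value that parses as a date. This is only a sketch: the date formats listed are assumptions (the actual formats on the site are not stated here), and `parse_date` and `find_date` are hypothetical helper names.

```python
from datetime import datetime

# Assumed candidate formats; adjust to whatever the scraped pages really use.
DATE_FORMATS = ("%d %B %Y", "%d %b %Y", "%Y-%m-%d")


def parse_date(text):
    """Return a date if the text matches one of the assumed formats, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def find_date(row):
    """Return the first cell in the row that parses as a date, else None."""
    for value in row.values():
        if isinstance(value, str):
            parsed = parse_date(value)
            if parsed is not None:
                return parsed
    return None
```

Rows from Games before 2012 would simply come back as `None`, which matches the missing dates noted above.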

rogers1000 commented 7 months ago

Need to do final transformation of some columns and then it will be ready to be unioned into the calendar master.

Won't have full metadata for Paralympic road races, as I still need to locate distance and route. No idea what to do about Stage Profile either.

For Track, I need to confirm how UCI_race_classification works. At the moment, I have the Paralympic classification in that field, but I'm not sure whether I should also include the category. Might be overthinking? Also not sure how to handle distance. It might just be a manual df with event name in one column and distance in the other. Thinking the route would be the number of laps around the velodrome.
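The manual lookup idea could look something like this sketch. The two event names are illustrative entries whose distances are stated in the name itself, not the real list of Paralympic events; the 250 m track length is an assumption (the standard length for championship velodromes), and `route_for` is a hypothetical helper.

```python
TRACK_LENGTH_M = 250  # assumption: standard 250 m velodrome

# Manual lookup: event name -> distance in metres (illustrative entries only).
EVENT_DISTANCE_M = {
    "1 km Time Trial": 1000,
    "500 m Time Trial": 500,
}


def route_for(event_name):
    """Return distance and a laps-based route description, or None if unknown."""
    distance = EVENT_DISTANCE_M.get(event_name)
    if distance is None:
        return None
    laps = distance / TRACK_LENGTH_M
    return {"distance_m": distance, "route": f"{laps:g} laps of the velodrome"}


print(route_for("1 km Time Trial"))
```

Events missing from the lookup return `None`, so gaps in the manual table stay visible rather than being silently filled.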

rogers1000 commented 7 months ago

Uploaded a good start to the output.

Still need to do:

Season and Stage Race Boolean are essentials that need sorting soon. The others require more thought and won't be immediately required.

rogers1000 commented 6 months ago

Season and Stage Race Boolean are now transformed. Need to union with First Cycling data and add to main code.
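A rough sketch of these two steps, using lists of dicts in place of the actual df. It assumes the season can be read from a Games slug such as `london-2012` and that Paralympic events are never flagged as stage races; the `games`, `season` and `stage_race` keys and both function names are hypothetical.

```python
def add_season_and_stage_flag(row):
    """Derive the season from a Games slug such as 'london-2012'."""
    out = dict(row)
    out["season"] = int(row["games"].rsplit("-", 1)[-1])
    out["stage_race"] = False  # assumption: no Paralympic event is a stage race
    return out


def union_rows(first, second):
    """Stack two row lists, filling columns missing on either side with None."""
    columns = []
    for row in first + second:
        for col in row:
            if col not in columns:
                columns.append(col)
    return [{col: row.get(col) for col in columns} for row in first + second]
```

Filling missing columns with `None` keeps the two sources aligned even though the Paralympics rows lack some of the first_cycling metadata.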

rogers1000 commented 6 months ago

Added to the main calendar df. Closing this ticket, which covered getting basic Paralympics data into the main calendar df.