jimmyday12 / fitzRoy

A set of functions to easily access AFL data
https://jimmyday12.github.io/fitzRoy

Incorrect hitout data from download #214

Closed liam-crow closed 2 weeks ago

liam-crow commented 2 months ago

When comparing hitout data from the cached dataset against the source at afltables.com, I noticed quite a few inconsistencies.

Please see the following tweet for examples of the incorrect data:

https://twitter.com/UselessStatsAFL/status/1785916078838956375

peteowen1 commented 4 weeks ago

Checked this and yeah, it seems some of the data in this file is wrong for whatever reason: https://github.com/jimmyday12/fitzRoy_data/raw/main/data-raw/afl_tables_playerstats/afldata.rda

@jimmyday12 if you do a single run of the weekly-scrape-playerstats-afltables.R script but change it to scrape everything, that fixes this issue. Scraping everything wouldn't be ideal every week, but it's obviously fine for a one-off fix.

The only bit of the weekly script you would need to change for this one-off is:

afldata_new <- fetch_player_stats_afltables(1897:end_year, rescrape = TRUE, rescrape_start_season = 1897)

After you run that script, the afldata file has hitout numbers matching the right screenshot as required.
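For anyone wanting to check this kind of thing themselves, here is a rough sketch of how the cached data could be compared against a fresh rescrape to list the rows where hitouts disagree. It uses base R only. The two data frames are tiny made-up stand-ins (the values are invented, not real AFL data), and the column names `Season`, `Round`, `ID` and `Hit.Outs`, as well as the helper `find_hitout_mismatches()`, are assumptions for illustration rather than anything confirmed in this thread.

```r
# Toy stand-ins for the cached file and a fresh rescrape.
# All values are invented for illustration; the column names are assumptions.
cached <- data.frame(
  Season   = c(2023, 2023, 2023),
  Round    = c("1", "1", "2"),
  ID       = c(101, 102, 101),
  Hit.Outs = c(30, 0, 25)
)
rescraped <- data.frame(
  Season   = c(2023, 2023, 2023),
  Round    = c("1", "1", "2"),
  ID       = c(101, 102, 101),
  Hit.Outs = c(30, 12, 25)
)

# Hypothetical helper: join the two data sets on the columns assumed to
# identify one player in one match, then keep rows where hitouts differ.
find_hitout_mismatches <- function(cached, rescraped,
                                   keys = c("Season", "Round", "ID")) {
  both <- merge(cached, rescraped, by = keys,
                suffixes = c(".cached", ".rescraped"))
  differs <- both$Hit.Outs.cached != both$Hit.Outs.rescraped
  # which() drops NA comparisons, so rows with a missing value on either
  # side are not reported as mismatches.
  both[which(differs), , drop = FALSE]
}

mismatches <- find_hitout_mismatches(cached, rescraped)
print(mismatches)
```

With real data you would swap the toy frames for the loaded afldata object and the result of the rescrape, and adjust the key columns to whatever uniquely identifies a player in a match.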

jimmyday12 commented 2 weeks ago

https://github.com/jimmyday12/fitzRoy/pull/223