jimmyday12 / fitzroy_data


rescrape 2020 season - seems to fix issues related to 2020 data #4

Closed peteowen1 closed 2 months ago

peteowen1 commented 2 months ago

Ran the weekly-scrape script, but rescraped from 2020.

fixes: https://github.com/jimmyday12/fitzRoy/issues/214

did this with the updated player ID functions
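
A rough sketch of what "rescraping from 2020" could look like, assuming it means dropping every stored row from 2020 onwards and appending freshly scraped rows in their place. The data frames `existing` and `rescraped`, the `from_season` variable and all values are invented for illustration. This is not the actual weekly-scrape script.

```r
library(dplyr)

# Hypothetical stored data; the 2020 and 2021 rows hold bad values.
existing <- tibble(
  season    = c(2019, 2020, 2021),
  id        = c(1, 1, 1),
  handballs = c(10, 99, 98)
)

# Hypothetical freshly scraped rows for the affected seasons.
rescraped <- tibble(
  season    = c(2020, 2021),
  id        = c(1, 1),
  handballs = c(12, 14)
)

from_season <- 2020

# Keep everything before the cut-off, then append the rescraped rows.
updated <- existing %>%
  filter(season < from_season) %>%
  bind_rows(rescraped) %>%
  arrange(season, id)
```

Replacing whole seasons, rather than patching individual rows, means stale rows from 2020 onwards cannot survive the update.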

The numbers now seem to align with afltables, e.g. https://afltables.com/afl/stats/players.html#t3

library(dplyr)

# Career handball totals per player, highest first
afldata %>%
  janitor::clean_names() %>%
  group_by(id, first_name, surname) %>%
  summarise(handballs = sum(handballs)) %>%
  arrange(desc(handballs))

`summarise()` has grouped output by 'id', 'first_name'. You can override using the `.groups` argument.
# A tibble: 13,153 × 4
# Groups:   id, first_name [13,153]
      id first_name surname    handballs
   <dbl> <chr>      <chr>          <dbl>
 1  4182 Scott      Pendlebury      5053
 2 11538 Joel       Selwood         4399
 3 11583 Travis     Boak            4284
 4 11672 Josh       Kennedy         4203
 5  1105 Gary       Ablett          4201
 6  1135 Sam        Mitchell        4155
 7   418 Scott      West            4093
 8   930 Robert     Harvey          4008
 9 12055 Lachie     Neale           3948
10  4088 David      Mundy           3851
# ℹ 13,143 more rows
# ℹ Use `print(n = ...)` to see more rows
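
For anyone wanting to repeat this kind of check, here is a minimal sketch of the same aggregation on toy data, compared against a small table of reference totals. The `toy` and `reference` tibbles and every value in them are made up. In practice the reference figures would have to be copied from afltables by hand or scraped separately. Passing `.groups = "drop"` to `summarise()` returns an ungrouped tibble and suppresses the message shown above.

```r
library(dplyr)

# Toy stand-in for `afldata` after janitor::clean_names(); values are invented.
toy <- tibble(
  season     = c(2019, 2020, 2020, 2020, 2020),
  id         = c(1, 1, 2, 2, 3),
  first_name = c("Alex", "Alex", "Sam", "Sam", "Sam"),
  surname    = c("Smith", "Smith", "Jones", "Jones", "Jones"),
  handballs  = c(10, 12, 7, 9, 4)
)

# Same aggregation as above, returning an ungrouped result.
totals <- toy %>%
  group_by(id, first_name, surname) %>%
  summarise(handballs = sum(handballs), .groups = "drop") %>%
  arrange(desc(handballs))

# Hypothetical reference totals keyed by player ID.
reference <- tibble(id = c(1, 2, 3), expected = c(22, 16, 4))

# Rows where the scraped total disagrees with the reference.
mismatches <- totals %>%
  inner_join(reference, by = "id") %>%
  filter(handballs != expected)

nrow(mismatches)
```

The toy data has two different players with the same name (IDs 2 and 3). Grouping on `id` keeps their totals separate, which is why correct player IDs matter for this comparison.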
peteowen1 commented 2 months ago

Working on fixing other things in the afltables data, so there should hopefully be more pull requests here over the next week. Might as well merge the 2020 fixes first and go from there.

peteowen1 commented 2 months ago

Going to close this one in favour of a total rescrape. Managed to get that done quicker than expected.