Closed by ajreinhard 4 years ago
The "LAR" thing and the date format are probably dumb mistakes on my part that I should be able to fix.
Will check height as well.
The set of values in status depends on the data source, which changed in 2020, so this is unlikely to be fixed as we can't make the new source backwards compatible.
Thanks for noting this @ajreinhard!
I have transferred this issue to the roster repo to be able to track it better
I hope this won't break anybody's code but we will see if somebody complains.
The set of values in status won't get unified as the pre- and post-2020 status data are too different
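Since status won't be unified upstream, anyone who needs one vocabulary across seasons would have to recode it themselves. A minimal sketch of that idea, assuming a toy tibble that only mirrors the values shown in this thread: the names `status_map` and `status_std` are hypothetical, and the only pair taken from this thread is "Active" (2020) versus "ACT" (earlier seasons). Any other codes would need to be added after inspecting the real data.

```r
library(dplyr)

# Toy rows mirroring the values shown in this thread; not real scraper output.
roster <- tibble::tibble(
  season = c(2018, 2019, 2020),
  full_name = "Jared Goff",
  status = c("ACT", "ACT", "Active")
)

# Hypothetical lookup from 2020-style labels to pre-2020 codes.
# Only "Active" -> "ACT" comes from this thread; extend as needed.
status_map <- c("Active" = "ACT")

roster_clean <- roster %>%
  mutate(
    # Look up each status; values without a mapping come back NA
    # and fall through to the original value via coalesce().
    status_std = coalesce(unname(status_map[status]), status)
  )

print(roster_clean)
```

Writing to a new column keeps the original status around, so unmapped values stay visible instead of being silently changed.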
Whoops, looks like I broke the birthdate completely. Will fix.
Fixed birth_date type problem with 2e0eaa5572e2a8384b0206034069b80fc1b1cdf4
library(dplyr)
library(nflfastR)
fast_scraper_roster(2018:2020) %>%
filter(last_name == 'Goff') %>%
select(season, full_name, team, status, birth_date, height)
#> # A tibble: 3 x 6
#>   season full_name  team  status birth_date height
#>    <dbl> <chr>      <chr> <chr>  <date>     <chr>
#> 1   2018 Jared Goff LA    ACT    1994-10-14 6-4
#> 2   2019 Jared Goff LA    ACT    1994-10-14 6-4
#> 3   2020 Jared Goff LA    Active 1994-10-14 6-4
Created on 2020-11-05 by the reprex package (v0.3.0)
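For anyone stuck on a version that still returns the mixed formats described in the original report, a user-side workaround could look roughly like the sketch below. It is not what the commit above does. It assumes birth_date arrives as character in either MM/DD/YYYY or YYYY-MM-DD and that the Rams show up as "LAR" in 2020. The helper `parse_mixed_date()` and the toy input are hypothetical.

```r
library(dplyr)

# Toy input built from the formats described in this issue; not real scraper output.
raw <- tibble::tibble(
  season = c(2019, 2020),
  team = c("LA", "LAR"),
  birth_date = c("10/14/1994", "1994-10-14")
)

# Hypothetical helper: try both formats and keep whichever one parses.
parse_mixed_date <- function(x) {
  iso <- as.Date(x, format = "%Y-%m-%d")  # NA where x is not YYYY-MM-DD
  us <- as.Date(x, format = "%m/%d/%Y")   # NA where x is not MM/DD/YYYY
  coalesce(iso, us)
}

clean <- raw %>%
  mutate(
    birth_date = parse_mixed_date(birth_date),
    # Use the abbreviation the rest of nflfastR uses for the Rams.
    team = if_else(team == "LAR", "LA", team)
  )

print(clean)
```

Trying each format explicitly avoids guessing: a string that matches neither pattern ends up as NA rather than as a wrong date.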
Maybe there is no easy solution for this without making the roster scraper function significantly more robust, but I've been having issues with the 2020 format for some fields being different than prior years. The biggest issue is that the Rams abbreviation in the scraper comes through as "LAR" rather than "LA" as it is treated across nflfastR. The other one that matters to me is birth_date, which is treated as MM/DD/YYYY pre-2020 and YYYY-MM-DD in 2020. height is also in a different format for 2020 and status has a different set of values than prior years, but those two aren't as useful. An example of all four below: