maksimhorowitz / nflscrapR

R Package for Scraping and Aggregating NFL Data

Trying to load league rosters and keep getting the same error #153

Open samhoppen opened 4 years ago

samhoppen commented 4 years ago

I've been trying to load the rosters but keep running into the same error. Below is the code I've been using:

library(tidyverse)
library(dplyr)
library(nflscrapR)
team_pbp_data <- purrr::map_dfr(c(2009:2019),
                                function(x) {
                                  readr::read_csv(paste0("https://raw.githubusercontent.com/ryurko/nflscrapR-data/master/play_by_play_data/regular_season/reg_pbp_",
                                                         x, ".csv")) %>%
                                    dplyr::mutate(pbp_season = x) %>%
                                    dplyr::select(posteam, pbp_season) %>%
                                    dplyr::filter(!is.na(posteam)) %>%
                                    dplyr::mutate(posteam = ifelse(pbp_season == 2016 & posteam == "JAC",
                                                                   "JAX", posteam)) %>%
                                    dplyr::distinct()
                                })

teams_2009 <- team_pbp_data %>%
  dplyr::filter(pbp_season == 2009) %>%
  dplyr::pull(posteam)

reg_season_09_rosters <- get_season_rosters(2009,
                                            teams = teams_2009,
                                            type = "reg")

I skipped the preseason as I'm only looking at the regular season data, but I've run this before without experiencing any issues. Here's the error I keep getting:

> reg_season_09_rosters <- get_season_rosters(2009,
+                                             teams = teams_2009,
+                                             type = "reg")
Extracting QUARTERBACK
Error in UseMethod("xml_find_all") : 
  no applicable method for 'xml_find_all' applied to an object of class "character"

Any thoughts on how to resolve? Not sure if there's a package I'm missing or other code that's blocking this from working, but I've started new projects with just this code and still run into the issue. Let me know. Thanks!

CarlosML27 commented 4 years ago

Hi there!

I had the same issue and started to investigate what could be the problem.

I'm a noob in R, but I believe that error is raised while getting the birthdates of players. I think there could be a problem while retrieving the nodes with rvest.
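For context, the message itself says that `xml_find_all()` was handed a plain character string instead of a parsed document or node. The sketch below reproduces that with the `xml2` package (which rvest builds on). The HTML snippet, the `birthdate` class name, and the date are made up for illustration; this is not nflscrapR's actual scraping code.

```r
library(xml2)

# Made-up HTML standing in for a player page (assumption, not the real NFL markup).
page <- read_html("<html><body><p class='birthdate'>1990-01-01</p></body></html>")

# On a parsed document, xml_find_all() works as expected.
ok <- xml_text(xml_find_all(page, "//p[@class='birthdate']"))

# On a plain character string there is no matching S3 method, so it errors.
err <- tryCatch(
  xml_find_all("not a parsed document", "//p"),
  error = function(e) conditionMessage(e)
)
print(err)
```

So somewhere upstream, a value that should be a parsed page is probably still a character string when it reaches `xml_find_all()`.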

If you look at the player_ids, you'll see the exact same problem there, but since that data isn't used again, no error is raised:

[two screenshots]

@ryurko @dutta hope you can get something there!

CarlosML27 commented 4 years ago

Update:

The problem isn't the retrieval; it's on the NFL's side: some players are missing GSIS IDs. Here's an example with Jalan McClendon (ID missing) and Cam Newton (how it should look).

[screenshot]

I guess the solution is skipping those players or giving them a default value...
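A rough sketch of that idea in base R, in case it helps. Everything here is hypothetical: `safe_player_lookup()` is not an nflscrapR function, `fake_fetch()` is a toy stand-in for whatever scrapes a single player's page, and the ID is made up.

```r
# Hypothetical helper (not part of nflscrapR): look up one player, falling back
# to a default value when the GSIS ID is missing or the lookup fails.
safe_player_lookup <- function(gsis_id, fetch_fun, default = NA_character_) {
  # Players without a GSIS ID cannot be looked up, so return the default.
  if (is.na(gsis_id) || !nzchar(gsis_id)) {
    return(default)
  }
  # Any scraping error also yields the default instead of aborting the whole run.
  tryCatch(fetch_fun(gsis_id), error = function(e) default)
}

# Toy stand-in for the real scraper, used only to illustrate the behaviour.
fake_fetch <- function(id) paste0("birthdate-for-", id)

# One made-up ID, one missing ID, one empty ID.
ids <- c("00-0000001", NA, "")
birthdates <- vapply(ids, safe_player_lookup, character(1),
                     fetch_fun = fake_fetch, USE.NAMES = FALSE)
print(birthdates)
```

To skip those players entirely rather than default them, the rows with `NA` could be filtered out afterwards, e.g. `birthdates[!is.na(birthdates)]`.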