ryurko / nflscrapR-data

Data files (.csv) accessed with nflscrapR and summarized at the player-level
https://ryurko.github.io/nflscrapR-data/

Additional Roster Information #34

Closed mrcaseb closed 4 years ago

mrcaseb commented 4 years ago

Hi Ron,

I would like to suggest adding some roster information to your roster_data. I ran into this need when I noticed that players have a profile id and an esb id in addition to the gsis id. The esb id is needed to build the URL for headshots.

I have developed a solution, which I show below. Since scraping the esb ids on my laptop took 0.7s on average per profile, this is not something that should be run every time the esb id is needed. That's why I'm asking you to add it to your csv data.

library(tidyverse)
library(jsonlite)
library(rvest)

roster_ron <-
  read_csv(
    "https://raw.githubusercontent.com/ryurko/nflscrapR-data/master/roster_data/regular_season/reg_roster_2019.csv"
  )

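# Player metadata from nflgame: one list entry per player, bound into one data frame
player_info_json <-
  bind_rows(lapply(
    fromJSON(
      "https://raw.githubusercontent.com/derek-adair/nflgame/master/nflgame/players.json"
    ),
    as.data.frame
  ))

# Attach the profile fields; inner_join keeps only roster rows with a matching gsis_id
roster_ron_new <-
  roster_ron %>%
  inner_join(
    player_info_json %>%
      select(birthdate, college, gsis_id, profile_id, profile_url),
    by = "gsis_id"
  )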

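# Scrape each profile page for its esb id. html_node() yields NA when the
# meta tag is missing, so every lookup returns exactly one string.
roster_ron_new$esb_id <-
  vapply(as.character(roster_ron_new$profile_url),
         function(url) {
           url %>%
             read_html() %>%
             html_node(xpath = '//meta[@id="playerId"]') %>%
             html_attr("content")
         },
         character(1),
         USE.NAMES = FALSE)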

roster_ron_new$headshot_url <-
  glue::glue(
    "http://static.nfl.com/static/content/public/static/img/fantasy/transparent/200x200/{roster_ron_new$esb_id}.png"
  )
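
The URL construction above can also be wrapped in a small helper so that players without a scraped esb id get NA rather than a broken link. This is only a sketch in base R: the function name build_headshot_url and the placeholder id in the usage line are hypothetical, and the base URL is simply the one used above.

```r
# Hypothetical helper (not part of nflscrapR): build a headshot URL per esb id.
# Missing ids map to NA instead of a URL ending in "NA.png".
build_headshot_url <- function(esb_id) {
  base_url <- "http://static.nfl.com/static/content/public/static/img/fantasy/transparent/200x200"
  ifelse(is.na(esb_id),
         NA_character_,
         paste0(base_url, "/", esb_id, ".png"))
}

# "EXAMPLE_ID" is a placeholder, not a real esb id.
example_urls <- build_headshot_url(c("EXAMPLE_ID", NA))
```

With the roster from above this would be used as roster_ron_new$headshot_url <- build_headshot_url(roster_ron_new$esb_id).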

Another possibility would be to add esb_id and headshot_url to all the data in player_info_json (that would probably take more than an hour and a half) and save it as a new csv.
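
A rough sketch of that "scrape once, save as csv" idea, using base R only. The file path, the data frame name roster_enriched, and the ids in it are all made up for illustration; the real values would come from the scraping step above.

```r
# Hypothetical cache location; tempdir() only keeps this sketch self-contained.
cache_file <- file.path(tempdir(), "roster_with_esb_id.csv")

# Stand-in for the scraped result (placeholder ids, not real player data).
roster_enriched <- data.frame(
  gsis_id = c("00-0000001", "00-0000002"),
  esb_id = c("EXAMPLE_ID_1", NA),
  stringsAsFactors = FALSE
)

# Pay the scraping cost once and store the result ...
write.csv(roster_enriched, cache_file, row.names = FALSE)

# ... then later sessions read the csv instead of hitting the profile pages again.
roster_cached <- read.csv(cache_file, stringsAsFactors = FALSE)
```

Anyone who needs the esb id afterwards only has to read the csv, which is the reason for asking to have it in the repository.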

mrcaseb commented 4 years ago

I found a much more efficient way to get roster information, so it may be better not to use the code above. I'll contact you in the near future.