ffverse / ffpros

Retrieves data from FantasyPros.com
http://ffpros.ffverse.com

Add function to scrape historical stats #4

Open scottfrechette opened 2 years ago

scottfrechette commented 2 years ago

Have you considered adding a function to scrape historical stats for a given sport and position? Getting historical fantasy points by player is helpful, but the stats page also provides the underlying stats driving those points, which could be useful for deeper insights such as the % of points from TDs.

Here's a very crude example to show URL and output:

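library(dplyr)
library(rvest)

# Read the first table on the page without treating any row as a header
df_stats <- read_html('https://www.fantasypros.com/nfl/stats/qb.php?year=2021&week=1&scoring=Standard&roster=consensus&range=week') %>%
  html_table(header = FALSE) %>%
  .[[1]]

# The header spans two rows (stat group, then stat); combine them into one name per column
cols <- paste(as.character(df_stats[1,]),
              as.character(df_stats[2,]),
              sep = "_") %>%
  gsub("^_|MISC_", "", .)

# Drop the two header rows, apply the combined names, and convert column types
df_stats %>%
  slice(-1, -2) %>%
  rename_with(~tolower(cols)) %>%
  type.convert(as.is = TRUE) %>%
  glimpse()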
#> Rows: 120
#> Columns: 18
#> $ rank          <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1~
#> $ player        <chr> "Kyler Murray (ARI)", "Patrick Mahomes II (KC)", "Jared ~
#> $ passing_cmp   <int> 21, 27, 38, 14, 32, 27, 42, 18, 34, 20, 21, 28, 36, 22, ~
#> $ passing_att   <int> 32, 36, 57, 20, 50, 35, 58, 23, 56, 26, 33, 51, 49, 37, ~
#> $ passing_pct   <dbl> 65.6, 75.0, 66.7, 70.0, 64.0, 77.1, 72.4, 78.3, 60.7, 76~
#> $ passing_yds   <int> 289, 337, 338, 148, 379, 264, 403, 254, 435, 321, 291, 3~
#> $ `passing_y/a` <dbl> 9.0, 9.4, 5.9, 7.4, 7.6, 7.5, 6.9, 11.0, 7.8, 12.3, 8.8,~
#> $ passing_td    <int> 4, 3, 3, 5, 4, 3, 3, 4, 2, 3, 2, 3, 2, 1, 2, 2, 1, 2, 2,~
#> $ passing_int   <int> 1, 0, 1, 0, 2, 0, 1, 0, 1, 0, 0, 3, 0, 0, 0, 1, 0, 0, 0,~
#> $ passing_sacks <int> 2, 2, 3, 0, 0, 1, 1, 3, 3, 1, 1, 1, 3, 2, 2, 6, 1, 5, 3,~
#> $ rushing_att   <int> 5, 5, 3, 6, 0, 7, 4, 5, 4, 5, 4, 1, 0, 6, 3, 0, 5, 1, 4,~
#> $ rushing_yds   <int> 20, 18, 14, 37, 0, 62, 13, 9, 6, -5, 40, -2, 0, 27, 19, ~
#> $ rushing_td    <int> 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0,~
#> $ fl            <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,~
#> $ g             <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
#> $ fpts          <dbl> 34.6, 33.3, 29.9, 29.6, 29.2, 28.8, 28.4, 27.1, 25.0, 24~
#> $ `fpts/g`      <dbl> 34.6, 33.3, 29.9, 29.6, 29.2, 28.8, 28.4, 27.1, 25.0, 24~
#> $ rost          <chr> "97.8%", "99.9%", "13.9%", "30.7%", "96.8%", "97.0%", "9~

Created on 2022-08-28 with reprex v2.0.2
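
As a rough sketch of the "% of points from TDs" idea, the columns in the output above could be combined as follows. The two rows are copied from the glimpse() output. The point values (4 per passing TD, 6 per rushing TD) are an assumption about what scoring=Standard means and are not confirmed by the page, so treat them as placeholders.

```r
# First two rows of the scraped week 1 QB table, copied from the output above
qb <- data.frame(
  player     = c("Kyler Murray (ARI)", "Patrick Mahomes II (KC)"),
  passing_td = c(4L, 3L),
  rushing_td = c(1L, 1L),
  fpts       = c(34.6, 33.3)
)

# Assumed point values; adjust to match the actual scoring settings
pts_pass_td <- 4
pts_rush_td <- 6

# Points attributable to touchdowns, and their share of total fantasy points
td_share <- qb
td_share$td_pts     <- td_share$passing_td * pts_pass_td +
  td_share$rushing_td * pts_rush_td
td_share$td_pts_pct <- round(100 * td_share$td_pts / td_share$fpts, 1)

print(td_share)
```

The same two lines would apply unchanged to the full scraped table, since they only rely on the passing_td, rushing_td, and fpts columns.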

tanho63 commented 2 years ago

I haven’t, mostly because I like ffscrapr’s ff_scoringhistory methodology better in most cases. I can put this on the backlog! (No word on when that’ll happen)

scottfrechette commented 2 years ago

The one drawback of ffscrapr is that it lacks Yahoo league data, because Yahoo doesn’t like sharing. I manually scrape my league's data and join it to general data like this as needed.
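
A minimal sketch of that kind of join, assuming the scraped player column has the "Name (TEAM)" shape shown in the example output earlier in this thread. The yahoo_rosters table and its columns are hypothetical stand-ins for manually scraped league data, and matching on the raw name is naive (suffixes such as "II" may differ between sources).

```r
library(dplyr)

# Scraped FantasyPros rows (values taken from the example output in this thread)
fp_stats <- data.frame(
  player = c("Kyler Murray (ARI)", "Patrick Mahomes II (KC)"),
  fpts   = c(34.6, 33.3)
)

# Hypothetical, manually scraped Yahoo league data
yahoo_rosters <- data.frame(
  player_name  = c("Kyler Murray", "Patrick Mahomes II"),
  fantasy_team = c("Team A", "Team B")
)

joined <- fp_stats %>%
  mutate(
    # Pull the team code out of the trailing parentheses
    team = sub("^.*\\(([A-Z]+)\\)$", "\\1", player),
    # Drop the trailing " (TEAM)" to leave just the player name
    player_name = sub("\\s*\\([A-Z]+\\)$", "", player)
  ) %>%
  # Keep every FantasyPros row, attaching league info where a name matches
  left_join(yahoo_rosters, by = "player_name")

print(joined)
```

A left join keeps players who are not rostered in the league, with NA in the league columns, which makes unmatched names easy to spot.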

tanho63 commented 2 years ago

Ah! Yahoo. Okay, yeah, that would explain it. I owe a PR review for ffscrapr first but can probably knock this out decently quickly. In the meantime, can you use nflreadr::load_player_stats and adjust the fantasy points column as necessary?
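
A rough sketch of the "adjust the fantasy points column" idea: recompute points from raw stat columns with custom weights. The toy data frame stands in for the output of nflreadr::load_player_stats(). Its column names (passing_yards, passing_tds, interceptions, rushing_yards, rushing_tds) and the weights are assumptions for illustration, so check them against the actual data before relying on this.

```r
# Toy stand-in for nflreadr::load_player_stats() output; column names are assumed
stats <- data.frame(
  player_name   = c("QB A", "QB B"),
  passing_yards = c(300, 250),
  passing_tds   = c(2, 1),
  interceptions = c(1, 0),
  rushing_yards = c(10, 40),
  rushing_tds   = c(0, 1)
)

# Example league weights (assumed); swap in the league's real settings
weights <- c(
  passing_yards = 0.04,
  passing_tds   = 6,
  interceptions = -2,
  rushing_yards = 0.1,
  rushing_tds   = 6
)

# Multiply each stat column by its weight and sum across columns per player
stat_matrix <- as.matrix(stats[names(weights)])
stats$fantasy_points_custom <- as.numeric(stat_matrix %*% weights)

print(stats[c("player_name", "fantasy_points_custom")])
```

Keeping the weights in a named vector means a different league's settings only require editing that vector, not the calculation.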