maksimhorowitz / nflscrapR

R Package for Scraping and Aggregating NFL Data

Rush Yds not adding up for season total (NFL.com) #63

Open bziegler-stat6250 opened 6 years ago

bziegler-stat6250 commented 6 years ago

Hi All,

I'm trying to replicate player stats for a fantasy analysis. I'm using the SQL function in R to query the data and split it into smaller data sets by season and by player. When I query rushers, for example, and take sum(Yards_Gained) and sum(RushAttempt), my totals for 2016 don't match up. I've compared them to NFL.com and Pro Football Reference.

Am I missing something? Any suggestions on how to replicate the data found elsewhere? My query is below:

SELECT Rusher,
       SUM(CASE WHEN PlayType = 'Run' THEN 1 ELSE 0 END) AS PlayTy,
       SUM(RushAttempt) AS Carries,
       SUM(Yards_Gained)
FROM NFL_DATA_2016
GROUP BY Rusher
ORDER BY Carries DESC

The goal of this research and analysis is a Fantasy Football rebuild. I was hoping to replicate yearly stats for QB, RB, WR & TE for 2009-2017. I tried the SQL code above in R, and the totals do not match the NFL.com results for the year.

The project will run various regressions, classification and/or clustering algorithms against the data. Any insight would be awesome, even if it only explains the discrepancies.

Thanks in advance,

chrisjkeeney commented 6 years ago

You may have to coerce RushAttempt to numeric. It loads as a factor by default, which will not sum correctly. The same goes for any other binary factors you want to group and summarize by.
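To illustrate the factor issue, here is a minimal sketch in base R. The data frame is a made-up stand-in for the play-by-play data (the player names and values are invented); only the column names Rusher and RushAttempt come from the query above. It shows why calling as.numeric() directly on a factor gives the wrong sum, and one common way to convert it.

```r
# Toy stand-in for the play-by-play data; names and values are made up.
plays <- data.frame(
  Rusher = c("A.Back", "A.Back", "B.Runner", "A.Back"),
  RushAttempt = factor(c("1", "0", "1", "1")),
  stringsAsFactors = FALSE
)

# as.numeric() on a factor returns the internal level codes (1, 2, ...),
# not the labels, so "0"/"1" becomes 1/2 and the sum is inflated.
wrong_total <- sum(as.numeric(plays$RushAttempt))

# Converting to character first recovers the original 0/1 values.
plays$RushAttempt <- as.numeric(as.character(plays$RushAttempt))
right_total <- sum(plays$RushAttempt)

# Carries per rusher, now summed over real 0/1 values.
carries <- aggregate(RushAttempt ~ Rusher, data = plays, FUN = sum)
```

The same conversion would presumably need to be applied to each binary column before summing it, whether in R or through a SQL query on the data frame.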

Also, you may have to filter out plays that were challenged (which I haven't found a workaround for yet).
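As a rough idea of what such a filter could look like, here is a sketch on toy data. It assumes the play-by-play data has a column like ChalReplayResult that is NA for unchallenged plays and "Reversed" when a ruling was overturned; that column name and its values are an assumption, so check them against your own data. Dropping reversed plays is only one possible rule and may not fully reconcile the totals.

```r
# Toy data; ChalReplayResult and its "Reversed" value are assumptions.
plays <- data.frame(
  Rusher = c("A.Back", "A.Back", "B.Runner"),
  RushAttempt = c(1, 1, 1),
  Yards_Gained = c(5, 12, 3),
  ChalReplayResult = c(NA, "Reversed", NA),
  stringsAsFactors = FALSE
)

# Keep plays that were never challenged (NA) or whose ruling was not reversed.
keep <- is.na(plays$ChalReplayResult) | plays$ChalReplayResult != "Reversed"
kept <- plays[keep, ]

# Season totals per rusher on the filtered plays.
totals <- aggregate(cbind(RushAttempt, Yards_Gained) ~ Rusher,
                    data = kept, FUN = sum)
```

Comparing the totals before and after the filter for a single player against NFL.com would show whether challenged plays account for the gap.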