nflverse / nfl_data_py

Python code for working with NFL play by play data.
MIT License

Missing EPA data #41

Closed pphili closed 1 year ago

pphili commented 1 year ago

In week 6 of 2021, the Chicago Bears lost 24-14 to the Green Bay Packers: https://www.nfl.com/games/packers-at-bears-2021-reg-6. When trying to load the EPA data for this game, I get an empty dataframe. Could someone let me know if I am doing something wrong?

import nfl_data_py as nfl

seasons = [2021]
cols = ['epa', 'week', 'possession_team']

nfl.cache_pbp(seasons, downcast=True, alt_path=None)
data = nfl.import_pbp_data(seasons, cols, cache=True)

print(data.loc[(data['possession_team'] == 'CHI') & (data['week'] == 5)])
print(data.loc[(data['possession_team'] == 'CHI') & (data['week'] == 6)])

pphili commented 1 year ago

The first print statement gives me a bunch of data, while the second gives me the empty dataframe.

rplain1 commented 1 year ago

It appears that 2021_06_GB_CHI is missing from the participation data.

If you run the same script but filter on posteam, which comes from the play-by-play data, instead of possession_team, which comes from the participation data, you can see that the rows for this game are there.

import nfl_data_py as nfl

seasons = [2021]

nfl.cache_pbp(seasons, downcast=True, alt_path=None)
data = nfl.import_pbp_data(seasons, cache=True)

print(data.loc[(data['posteam'] == 'CHI') & (data['week'] == 5)].shape)
print(data.loc[(data['posteam'] == 'CHI') & (data['week'] == 6)].shape)
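
The behaviour described above can be reproduced with toy data. This is only a sketch: it assumes the participation columns are attached to the play-by-play rows with a left join on game_id and play_id, and every value in the two toy frames is made up for illustration rather than taken from nflverse.

```python
import pandas as pd

# Toy play-by-play rows (hypothetical values, not real nflverse data).
pbp = pd.DataFrame({
    "game_id": ["2021_05_CHI_LV", "2021_05_CHI_LV",
                "2021_06_GB_CHI", "2021_06_GB_CHI"],
    "play_id": [1, 2, 1, 2],
    "week": [5, 5, 6, 6],
    "posteam": ["CHI", "LV", "CHI", "GB"],
    "epa": [0.4, -0.2, 0.1, 0.7],
})

# Toy participation rows: the week 6 game is absent on purpose.
participation = pd.DataFrame({
    "game_id": ["2021_05_CHI_LV", "2021_05_CHI_LV"],
    "play_id": [1, 2],
    "possession_team": ["CHI", "LV"],
})

# Assumption: a left join keeps every play-by-play row and leaves the
# participation columns null where no matching participation row exists.
merged = pbp.merge(participation, on=["game_id", "play_id"], how="left")

# Null never equals "CHI", so this filter drops every week 6 row.
by_possession_team = merged[
    (merged["possession_team"] == "CHI") & (merged["week"] == 6)
]

# posteam is filled for every play, so the week 6 row survives.
by_posteam = merged[(merged["posteam"] == "CHI") & (merged["week"] == 6)]

print(by_possession_team.shape)  # (0, 6)
print(by_posteam.shape)          # (1, 6)
```

The epa value is still present in the surviving row. Only the filter column was empty, which is why the original query came back with nothing.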

alecglen commented 1 year ago

Confirming what @rplain1 said - the NGS data for this game is just missing on the nflverse side. I'd advise closing the issue.
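
To see which games are affected, one option is to list the games where a given column is null on every play. The helper below, games_missing_column, is a hypothetical name and not part of nfl_data_py. It assumes the dataframe has a game_id column, and it is shown here on a made-up frame rather than on real data.

```python
import pandas as pd


def games_missing_column(df, column, game_col="game_id"):
    """Return the game ids for which every value of `column` is null.

    Hypothetical helper for illustration; not part of nfl_data_py.
    """
    # True for a game only if the column is null on all of its plays.
    all_null = df[column].isna().groupby(df[game_col]).all()
    return sorted(all_null[all_null].index)


# Toy frame standing in for the loaded play-by-play data (made-up values).
toy = pd.DataFrame({
    "game_id": ["2021_05_CHI_LV", "2021_05_CHI_LV",
                "2021_06_GB_CHI", "2021_06_GB_CHI"],
    "possession_team": ["CHI", "LV", None, None],
})

missing = games_missing_column(toy, "possession_team")
print(missing)  # ['2021_06_GB_CHI']
```

On real data, the same call could be made on the dataframe loaded by the scripts in this thread, as long as game_id and possession_team are among the loaded columns.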


pphili commented 1 year ago

Thanks for the info! This solves my problem!