nflverse / nflfastR

A Set of Functions to Efficiently Scrape NFL Play by Play Data
https://www.nflfastr.com/

[ISSUE] <target_share not calculated properly> #412

Closed nicholasmendoza22 closed 1 year ago

nicholasmendoza22 commented 1 year ago

Is there an existing issue for this?

Have you installed the latest development version of the package(s) in question?

What version of the package do you have?

4.5.1

Describe the bug

Hello, I noticed that calculate_player_stats(pbp, weekly = FALSE) returns a target_share that differs from my manual calculation. For example, the function gives Justin Jefferson a target_share of 0.28268074, but I get 0.274 when I calculate it manually as 184 targets / 672 Vikings pass attempts. The function appears to include only regular-season stats, so I'm not sure what's causing the discrepancy. Does anyone know what causes it?

Someone in the Discord suspected that the weekly target shares are averaged instead of being recalculated from the period-level team attempts. Is there a way to fix this?

Reprex

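library(nflfastR)

# pbp is play-by-play data loaded beforehand, e.g. with load_pbp() (the season is not stated in this report)
test <- calculate_player_stats(pbp, weekly = FALSE)

# test is a new data frame whose target_share column is inaccurate (see the bug description)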

Expected Behavior

I expected a player's target_share to equal targets / total team passing attempts. For Justin Jefferson that would be 184 / 672 = 0.274, not the 0.28268074 the function returns.


nflverse_sitrep

n/a


mrcaseb commented 1 year ago

There is indeed a problem when we summarise the data for the option weekly = FALSE. It computes the mean of weekly target shares instead of sum(targets) / sum(team_targets).
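To illustrate why the two aggregations disagree, here is a minimal base R sketch. The weekly numbers are made up, and the data frame and column names are hypothetical; this is not the package's internal code.

```r
# Toy weekly data for one hypothetical player (all numbers are invented)
weekly <- data.frame(
  week         = 1:4,
  targets      = c(12, 5, 10, 3),
  team_targets = c(30, 40, 25, 45)
)

# Per-week share, then averaged: every week gets equal weight,
# regardless of how many passes the team threw that week
weekly$target_share <- weekly$targets / weekly$team_targets
mean_of_weekly <- mean(weekly$target_share)

# Season-level share: total targets divided by total team targets
ratio_of_sums <- sum(weekly$targets) / sum(weekly$team_targets)

print(c(mean_of_weekly = mean_of_weekly, ratio_of_sums = ratio_of_sums))
```

The two values coincide only when every week has the same team_targets (or the same share); otherwise weeks with few team targets are over-weighted by the mean.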

However, your expected output is not correct anyway, because you calculated sum(targets) / sum(team_attempts), which is not how target share is defined.
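A small base R sketch of how the choice of denominator changes the result. It assumes that team targets count only passes with a targeted receiver, while team attempts also include passes with no target (throwaways, for example). That distinction is an assumption here and is not spelled out in this thread. The data and names are hypothetical.

```r
# Toy team pass plays (invented); receiver is NA when nobody was targeted
plays <- data.frame(
  receiver = c("A", "B", "A", NA, "C", "A", NA, "B"),
  stringsAsFactors = FALSE
)

team_attempts <- nrow(plays)                  # every pass attempt
team_targets  <- sum(!is.na(plays$receiver))  # only passes with a targeted receiver
targets_a     <- sum(plays$receiver == "A", na.rm = TRUE)

# Same numerator, two different denominators
share_vs_targets  <- targets_a / team_targets
share_vs_attempts <- targets_a / team_attempts

print(c(share_vs_targets = share_vs_targets, share_vs_attempts = share_vs_attempts))
```

Under that assumption, dividing by attempts always gives a share at or below the one based on team targets, which points the same way as the 0.274 versus 0.2827 gap reported above.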

nicholasmendoza22 commented 1 year ago

Got it, thanks so much for clarifying this!