sportsdataverse / hoopR

An R package to quickly obtain clean and tidy men's basketball play by play data.
http://hoopr.sportsdataverse.org/

Incorrect game_date when using hoopR::load_mbb_player_box #62

Closed by aqsmith08 1 year ago

aqsmith08 commented 2 years ago

Describe the bug

The game_date seems to be one day off in some cases when I use the hoopR::load_mbb_player_box function.

To Reproduce

> library(dplyr)
> library(hoopR)
> 
> player.df <- hoopR::load_mbb_player_box(seasons = 2021:2022)
> 
> player.df %>%
+   filter(team_abbreviation == "FIU", game_id == 401373765) %>%
+   select(season, game_date, athlete_display_name, min, game_id) %>%
+   arrange(season, game_date, athlete_display_name)
   season  game_date athlete_display_name min   game_id
1    2022 2022-03-04          Aquan Smart  14 401373765
2    2022 2022-03-04         Clevon Brown  24 401373765
3    2022 2022-03-04       Daniel Parrish   2 401373765
4    2022 2022-03-04         Dante Wilcox   5 401373765
5    2022 2022-03-04         Denver Jones  28 401373765
6    2022 2022-03-04          Eric Lovett  28 401373765
7    2022 2022-03-04         Isaiah Banks  25 401373765
8    2022 2022-03-04     Javaunte Hawkins  23 401373765
9    2022 2022-03-04       Mohamed Sanogo  12 401373765
10   2022 2022-03-04     Petar Krivokapic  14 401373765
11   2022 2022-03-04         Seth Pinkney  17 401373765
12   2022 2022-03-04          Victor Hart   8 401373765

Expected behavior

Based on what I see in the KenPom boxscore, I expect the game_date to be 2022-03-03.

Screenshots

Screenshot from 2022-03-04 14-20-03

Additional context

Another example is game_id == 401372788 where I'd expect the game date to be 2022-02-26. Could it be that I'm loading the data and it defaults to a timezone that happens to move the game into a new day? I briefly played with lubridate but couldn't tell if this was the case. Thanks!

saiemgilani commented 2 years ago

Good note, much appreciated. That's come up elsewhere recently. It is related to me not converting the game datetime to Eastern time before parsing the date. Need to correct this uniformly across the packages.
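A minimal sketch of the effect described above, using base R only. The timestamp is a hypothetical tip-off time chosen for illustration, not a value taken from the hoopR data, and this is not the package's actual fix. It only shows how a date taken from a UTC timestamp can land one day after the Eastern calendar date.

```r
# Hypothetical tip-off stored in UTC (illustrative value, not from the data):
# 7:00 PM Eastern on 2022-03-03 is midnight UTC on 2022-03-04.
tipoff_utc <- as.POSIXct("2022-03-04 00:00:00", tz = "UTC")

# Taking the date in UTC lands on the following calendar day.
date_from_utc <- as.Date(tipoff_utc, tz = "UTC")

# Taking the date in Eastern time gives the local game date.
date_from_eastern <- as.Date(tipoff_utc, tz = "America/New_York")

print(date_from_utc)      # 2022-03-04
print(date_from_eastern)  # 2022-03-03
```

Any evening game on the East Coast that tips off at or after midnight UTC would shift the same way, which would be consistent with only some games being one day off.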

aqsmith08 commented 1 year ago

Hey - With the NCAAB season here, I wanted to loop back on this issue. Is there anything we could do to fix some of the dates even if a uniform fix across the package may need to come later? I'd be willing to help but would want to coordinate with you all. Thanks.

saiemgilani commented 1 year ago

I just need to figure out the python equivalent of lubridate.

This one has been on my mind, I'll fix it this week.

aqsmith08 commented 1 year ago

Yo. Looping back here. At this point, there's no urgency given the season is already in full swing but I'll be keen to get this fixed come April.

aqsmith08 commented 1 year ago

Hey @saiemgilani - Alrighty, the season is over. Is it time to tackle this? I know that you mentioned figuring out the Python equivalent of lubridate. I'm decent in R. Is there a way I can help here? Don't mean to just throw this over to you. Appreciate everything you've created with this package and all the maintenance work that you do too!

saiemgilani commented 1 year ago

I did fix this just recently! See the game_date and game_date_time columns for the load_mbb_schedules() functions. I think I also applied the same fix to the scoreboard/schedule functions in the development version.

saiemgilani commented 1 year ago

Oh, shoot, that's only for those schedule functions, I'll probably need to implement the same logic elsewhere. Update: I did in fact apply that logic to all the loader functions in the development version:

remotes::install_github("sportsdataverse/hoopR")

I can work on getting that in wherever else it comes up as an issue. Feel free to open another issue for those.

aqsmith08 commented 1 year ago

Ack. Amazing, thank you! I'll poke around on this and will open a new issue if I find anything else.