jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License

statcast call bringing back empty dataFrame for games >= 8-23-2021 #233

Closed: justin-wilke closed this issue 3 years ago

justin-wilke commented 3 years ago

I'm getting an empty data frame using:

statcast(start_dt="2021-08-23", end_dt="2021-08-25")

The result is also empty when I run statcast() with no arguments (tried on 2021-08-25).

I DO get data when I run: statcast(start_dt="2021-08-22", end_dt="2021-08-22")

According to this page, the data should be updated daily at 3 a.m. ET:

https://baseballsavant.mlb.com/about#:~:text=It%20is%20updated%20daily%20at%203%20a.m.%20ET.&text=The%20Statcast%20Leaderboard%20is%20a,Hit%20Distance%20and%20Launch%20Angle.&text=Stacast%20Search%20is%20an%20application,search%20MLB.com's%20Statcast%20database.

I'm not sure whether the problem is in this library or in the underlying data source. I was able to run a Statcast search on the Savant web page for games on or after 2021-08-24.

I am running pybaseball version 2.2.1.

tjburch commented 3 years ago

Just ran this on a fresh install and was personally unable to replicate.

>>> from pybaseball import statcast
>>> len(statcast("2021-08-23","2021-08-25"))
6327
>>> statcast("2021-08-23","2021-08-25").game_date.unique()
array(['2021-08-24T00:00:00.000000000', '2021-08-23T00:00:00.000000000'],
      dtype='datetime64[ns]')

Is it possible you ran the query with the cache enabled before the data was available? That's the only explanation that comes to mind right away (granted, I'm less familiar with the caching behavior than the other devs here). You could try a query you haven't run before and see whether 2021-08-23 and 2021-08-24 show up in game_date, e.g. statcast("2021-08-21","2021-08-25").game_date.unique().
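To illustrate the stale-cache hypothesis, here is a toy sketch in plain Python. It is not pybaseball's actual caching code; it assumes a cache keyed on the call's arguments, and `with_cache`, `fake_fetch`, and `upstream` are hypothetical names made up for the example.

```python
from typing import Callable, Dict, List, Tuple

Rows = List[str]
Key = Tuple[str, str]


def with_cache(
    fetch: Callable[[str, str], Rows],
) -> Tuple[Callable[[str, str], Rows], Dict[Key, Rows]]:
    """Wrap fetch so results are stored per (start_dt, end_dt) pair."""
    store: Dict[Key, Rows] = {}

    def cached(start_dt: str, end_dt: str) -> Rows:
        key = (start_dt, end_dt)
        if key not in store:
            # Only the first call for a given key reaches the data source.
            store[key] = fetch(start_dt, end_dt)
        return store[key]

    return cached, store


# Stand-in for the remote data source; empty until the daily update lands.
upstream: Rows = []


def fake_fetch(start_dt: str, end_dt: str) -> Rows:
    """Hypothetical fetch that ignores the dates and returns what is upstream."""
    return list(upstream)


cached_fetch, store = with_cache(fake_fetch)

before = cached_fetch("2021-08-23", "2021-08-25")  # stored while upstream is empty
upstream.append("pitch from 2021-08-23")           # the data becomes available
stale = cached_fetch("2021-08-23", "2021-08-25")   # same key: empty result is reused
fresh = cached_fetch("2021-08-21", "2021-08-25")   # new key: goes to upstream

store.clear()                                      # clearing the cache fixes it
after_clear = cached_fetch("2021-08-23", "2021-08-25")
```

Under that assumption, a date range that has never been queried bypasses the stored empty result, which is why trying a different range is a quick way to test the theory.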

justin-wilke commented 3 years ago

You are right. Deleting records at ~/.pybaseball/cache solved my issue. Appreciate the help!
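For anyone who wants to script the same fix, here is a minimal standard-library sketch that empties a cache directory. The default path is the one mentioned in this comment; `clear_cache_dir` is a hypothetical helper, not part of pybaseball, and the library's own cache documentation may describe a built-in way to do this.

```python
import shutil
from pathlib import Path

# Location taken from the comment above; adjust if your cache lives elsewhere.
DEFAULT_CACHE_DIR = Path.home() / ".pybaseball" / "cache"


def clear_cache_dir(cache_dir: Path = DEFAULT_CACHE_DIR) -> int:
    """Delete every entry directly inside cache_dir and return how many were removed."""
    if not cache_dir.is_dir():
        # Nothing to clear if the directory does not exist.
        return 0
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a subdirectory and its contents
        else:
            entry.unlink()        # remove a regular file or a symlink
        removed += 1
    return removed


# Usage (hypothetical): uncomment to clear the default location.
# print(clear_cache_dir(), "cache entries removed")
```

The next statcast() call after clearing should go back to the data source instead of reusing an empty result stored earlier.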