toddrob99 / MLB-StatsAPI

Python wrapper for MLB Stats API
GNU General Public License v3.0

schedule function KeyError #96

Closed thesouthpawkc closed 1 year ago

thesouthpawkc commented 1 year ago

I am trying to pull the scores for every game in the live-ball era using statsapi.schedule. I had no issues for years 1920-1999. But I am getting a KeyError: 'score' when I try the year 2000. I saw another post that said this happens in 2011 also and changing the start date should help, but it is not working for 2000. (I can get the data for a small number of games in September and October, but the rest of the games cannot be accessed). Could you please help me figure this out? Thanks.

thesouthpawkc commented 1 year ago

So, it seems this issue happens for many years between 2000 and 2022. I am using the following code. It works when the limits were 1920-1999. Perhaps you could nudge me in the right direction to fix it for 2000-2022. Thanks.

image

toddrob99 commented 1 year ago

I looked into this for 2000 and the error is happening because there are games with no scores in the data. For example, there's a game on Sept 10, 2000, between the Chicago White Sox and the Cleveland Indians that also shows no data on the MLB website.

There are many games with no scores in 2000 and subsequent years. In 2000 they are all listed as Regular Season games, and in subsequent years they all appear to be Exhibition games.

2000 games for reference:

No score x['game_id']=5026 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-09'
No score x['game_id']=5184 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-11'
No score x['game_id']=4120 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-15'
No score x['game_id']=4207 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-15'
No score x['game_id']=3956 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-16'
No score x['game_id']=3306 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-17'
No score x['game_id']=5035 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-17'
No score x['game_id']=3308 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-17'
No score x['game_id']=3455 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-20'
No score x['game_id']=5045 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-21'
No score x['game_id']=4073 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-21'
No score x['game_id']=5048 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-22'
No score x['game_id']=3311 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-23'
No score x['game_id']=5003 x['status']='Final' x['game_type']='R' x['game_date']='2000-04-24'
No score x['game_id']=4096 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-01'
No score x['game_id']=4106 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-07'
No score x['game_id']=5158 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-17'
No score x['game_id']=4059 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-18'
No score x['game_id']=4123 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-18'
No score x['game_id']=3521 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-18'
No score x['game_id']=4127 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-18'
No score x['game_id']=5202 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-19'
No score x['game_id']=4238 x['status']='Final' x['game_type']='R' x['game_date']='2000-05-31'
No score x['game_id']=4635 x['status']='Final' x['game_type']='R' x['game_date']='2000-06-06'
No score x['game_id']=4735 x['status']='Final' x['game_type']='R' x['game_date']='2000-06-11'
No score x['game_id']=3541 x['status']='Final' x['game_type']='R' x['game_date']='2000-06-12'
No score x['game_id']=4362 x['status']='Final' x['game_type']='R' x['game_date']='2000-06-16'
No score x['game_id']=5583 x['status']='Final' x['game_type']='R' x['game_date']='2000-07-15'
No score x['game_id']=5597 x['status']='Final' x['game_type']='R' x['game_date']='2000-07-19'
No score x['game_id']=3675 x['status']='Final' x['game_type']='R' x['game_date']='2000-07-26'
No score x['game_id']=5546 x['status']='Final' x['game_type']='R' x['game_date']='2000-07-26'
No score x['game_id']=3741 x['status']='Final' x['game_type']='R' x['game_date']='2000-07-28'
No score x['game_id']=3727 x['status']='Final' x['game_type']='R' x['game_date']='2000-09-10'
No score x['game_id']=3803 x['status']='Final' x['game_type']='R' x['game_date']='2000-09-17'
No score x['game_id']=4543 x['status']='Final' x['game_type']='R' x['game_date']='2000-09-22'
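
A listing like the one above could be produced by scanning a raw schedule response for games where either side has no `score` key. This is only a minimal sketch, not the code used in this thread. It assumes the response is shaped as `dates[].games[].teams.{away,home}` with the fields `gamePk`, `gameType`, `officialDate`, and `status.detailedState`; the function name `games_missing_score` and the sample payload are made up for illustration.

```python
from typing import Any, Dict, Iterator


def games_missing_score(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield a short summary for each game where either team lacks a 'score' key."""
    for date in payload.get("dates", []):
        for game in date.get("games", []):
            teams = game.get("teams", {})
            if all("score" in teams.get(side, {}) for side in ("away", "home")):
                continue  # both scores present, nothing to report
            yield {
                "game_id": game.get("gamePk"),
                "status": game.get("status", {}).get("detailedState"),
                "game_type": game.get("gameType"),
                # Fall back to the enclosing date entry if the game has no date.
                "game_date": game.get("officialDate", date.get("date")),
            }


# Hand-written sample payload in the assumed shape, not real API output.
sample = {
    "dates": [
        {
            "date": "2000-09-10",
            "games": [
                {   # mirrors the score-less game listed above
                    "gamePk": 3727,
                    "gameType": "R",
                    "officialDate": "2000-09-10",
                    "status": {"detailedState": "Final"},
                    "teams": {"away": {}, "home": {}},
                },
                {   # invented game with scores, for contrast
                    "gamePk": 1,
                    "gameType": "R",
                    "officialDate": "2000-09-10",
                    "status": {"detailedState": "Final"},
                    "teams": {"away": {"score": 3}, "home": {"score": 5}},
                },
            ],
        }
    ]
}

missing = list(games_missing_score(sample))
for x in missing:
    print("No score", x)
```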

I will add protection against missing score in the schedule method.

I also noticed there are some games without team names, so I'm protecting against that too. For example:

No team name x['game_id']=238429 x['status']='Final' x['game_type']='E' x['game_date']='2008-03-02' x['away_name']='???' @ x['home_name']='???'

Neither of these issues appears in the data from 2019 to the present.
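
The kind of protection described above can be sketched with `dict.get` and fallback values. This is not the actual change shipped in the library. The helper name `summarize_game`, the choice of `None` for a missing score, and the assumed `teams.{away,home}.team.name` layout are illustrative only.

```python
from typing import Any, Dict


def summarize_game(game: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one raw game record, tolerating missing scores and team names."""
    teams = game.get("teams", {})
    away = teams.get("away", {})
    home = teams.get("home", {})
    return {
        "game_id": game.get("gamePk"),
        # None marks "no score recorded", which stays distinct from a real 0.
        "away_score": away.get("score"),
        "home_score": home.get("score"),
        # "???" mirrors the placeholder shown in the example above.
        "away_name": away.get("team", {}).get("name", "???"),
        "home_name": home.get("team", {}).get("name", "???"),
    }


# Hand-written record with no scores and no team names (assumed shape).
incomplete = {"gamePk": 238429, "gameType": "E", "teams": {"away": {}, "home": {}}}
row = summarize_game(incomplete)
print(row)
```

With this approach, any caller that sums or compares scores still has to skip rows where a score is `None`.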

toddrob99 commented 1 year ago

fixed in v1.6

python3 -m pip install --upgrade mlb-statsapi

thesouthpawkc commented 1 year ago

Thanks Todd! I appreciate the quick turnaround on the fix. It seems to be working fine now.

When I was pulling the data in a for loop, it errored out at the year '2020'. But I tried to pull 2020 separately and it worked. Here is the error message in case you are interested.

image

toddrob99 commented 1 year ago

I saw the same behavior on random years when I was testing. I would suggest adding retry logic, sleeping for a few seconds between years, or both.
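
The retry-and-sleep suggestion could look something like the sketch below. The helper names `fetch_with_retry` and `fetch_years`, the default delays, and the fake fetcher are all hypothetical. In real use, `fetch_year` would wrap whatever call pulls one year of data, for example a call to `statsapi.schedule` for that year's date range.

```python
import time
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    attempts: int = 3,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on any exception and sleeping between attempts."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            sleep(delay * attempt)  # wait a little longer after each failure
    raise ValueError("attempts must be at least 1")


def fetch_years(
    years: Iterable[int],
    fetch_year: Callable[[int], T],
    pause: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, T]:
    """Fetch each year in turn with retries, pausing between years."""
    results: Dict[int, T] = {}
    for year in years:
        results[year] = fetch_with_retry(lambda: fetch_year(year), sleep=sleep)
        sleep(pause)  # be gentle with the API between years
    return results


# Demo with a fake fetcher that fails once for 2020, then succeeds.
calls: List[int] = []


def flaky_fetch_year(year: int) -> str:
    calls.append(year)
    if year == 2020 and calls.count(2020) == 1:
        raise ConnectionError("simulated transient failure")
    return f"schedule for {year}"


# A no-op sleep keeps the demo instant; drop it to use real delays.
result = fetch_years([2019, 2020], flaky_fetch_year, sleep=lambda seconds: None)
print(result)
```

Catching every `Exception` keeps the sketch short. Narrowing it to the specific error seen in the traceback would avoid retrying on genuine bugs.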