jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License
1.25k stars, 333 forks

schedule_and_record doesn't work for any year/team combo #408

Closed jmjfisher closed 5 months ago

jmjfisher commented 6 months ago

http://www.baseball-reference.com/teams/SEA/2023-schedule-scores.shtml is a valid URL, but the error states:

Data cannot be retrieved for this team/year combo. Please verify that your team abbreviation is accurate and that the team existed during the season you are searching for.

I believe Baseball Reference has added more tables to the page, since the code just grabs the first one it finds: table = soup.find_all('table')[0]

mdstepha commented 5 months ago

2024 works great for me. Is it only for past years you are seeing this error?

jmjfisher commented 5 months ago

Just tried it again. Worked once for one team (SEA in 2024), but then when I tried a different team or looping through all teams, I got the same error. Maybe Baseball Reference is blocking me beyond the initial hit?

Highspeedhomer commented 5 months ago

Are you looping through all teams at once? I have found I have to go division by division with a 30-second break in between, and then it works just fine.

import time
import pybaseball as pyball  # assumed alias; the original snippet did not show its imports

year = 2024

# AL WEST
mariners = pyball.schedule_and_record(year, 'SEA')
athletics = pyball.schedule_and_record(year, 'OAK')
rangers = pyball.schedule_and_record(year, 'TEX')
angels = pyball.schedule_and_record(year, 'LAA')
astros = pyball.schedule_and_record(year, 'HOU')

time.sleep(30)  # pause before requesting the next division

# AL CENTRAL
indians = pyball.schedule_and_record(year, 'CLE')
white_sox = pyball.schedule_and_record(year, 'CHW')
tigers = pyball.schedule_and_record(year, 'DET')
twins = pyball.schedule_and_record(year, 'MIN')
royals = pyball.schedule_and_record(year, 'KCR')

jmjfisher commented 5 months ago

Yep, the time.sleep(x) seemed to do the trick, thanks!
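
For anyone who wants to loop instead of writing out each team, the pacing workaround above could be sketched roughly as follows. This is only a sketch: `fetch_in_batches` is a hypothetical helper, not part of pybaseball, and the batch size of 5 and the 30-second pause simply mirror the values used in this thread rather than any documented Baseball Reference limit.

```python
import time
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def fetch_in_batches(
    keys: Iterable[str],
    fetch: Callable[[str], T],
    batch_size: int = 5,           # assumption: one division (5 teams) per batch
    pause_seconds: float = 30.0,   # assumption: the 30 s pause reported in this thread
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, T]:
    """Call fetch(key) for every key, pausing between batches (hypothetical helper)."""
    key_list = list(keys)
    results: Dict[str, T] = {}
    for start in range(0, len(key_list), batch_size):
        if start > 0:
            # Pause before every batch except the first one.
            sleep(pause_seconds)
        for key in key_list[start:start + batch_size]:
            results[key] = fetch(key)
    return results


# Possible usage with pybaseball (not run here, since it needs network access):
# from pybaseball import schedule_and_record
# teams = ["SEA", "OAK", "TEX", "LAA", "HOU", "CLE", "CHW", "DET", "MIN", "KCR"]
# records = fetch_in_batches(teams, lambda team: schedule_and_record(2024, team))
```

Passing `sleep` as a parameter is just a convenience so the pacing can be checked without actually waiting; in normal use the default `time.sleep` applies.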