Closed tjburch closed 1 year ago
Hi @tjburch, if you can add more details on how to replicate this, maybe I can take a look.
Also, I think we can improve contributing.md and create an issue template to make things clearer and more concise for those who want to contribute to the project.
Thanks @BrayanMnz. @TheCleric is also starting to look into this when he has time, so keep an eye on this page.
Clone the repo, run `pip install -e .` from the top level, and then run `pytest`, and it should light up like a Christmas tree. You should be able to see which tests fail and what the error message is.
I figured out at least some of the cases: those that touch bbref. Basically we're throwing a lot of requests their way and we're getting rate limited. Looking directly at the `get_soup` output in `standings.py`:
<h2 class="text-gray-600 leading-1.3 text-3xl lg:text-2xl font-light">You are being rate limited</h2>
</header>
<section class="w-240 lg:w-full mx-auto mb-8 lg:px-8">
<div class="w-1/2 md:w-full" id="what-happened-section">
<h2 class="text-3xl leading-tight font-normal mb-4 text-black-dark antialiased" data-translate="what_happened">What happened?</h2>
<p>The owner of this website (www.baseball-reference.com) has banned you temporarily from accessing this website.</p>
</div>
Not sure the best solution here. @TheCleric, any suggestions?
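One way to make this failure easier to spot in tests is to check the fetched HTML for the rate-limit phrases quoted above before trying to parse it. The following is only a minimal sketch: `looks_rate_limited` and `RATE_LIMIT_MARKERS` are hypothetical names that do not exist in pybaseball, and the marker strings are taken solely from the response shown in this comment, so the real page may vary.

```python
# Hypothetical helper, not part of pybaseball.
# The marker phrases are copied from the rate-limit page quoted in this thread.
RATE_LIMIT_MARKERS = (
    "You are being rate limited",
    "has banned you temporarily",
)


def looks_rate_limited(html: str) -> bool:
    """Return True if the HTML contains one of the known rate-limit phrases."""
    lowered = html.lower()
    return any(marker.lower() in lowered for marker in RATE_LIMIT_MARKERS)
```

A test or scraper could call this on the raw response text and skip or retry with a clearer message, instead of failing later on a missing table.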
Hey!
So, I just ran into this error, and you'll need to add a wait condition. Something like the code below should work (it's from a Jupyter notebook) as an example using time.sleep(10):
import numpy as np
import pandas as pd
import time
import seaborn as sns
import pybaseball as pyball
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from pybaseball import *
from pybaseball import statcast, utils
from pybaseball.plotting import plot_bb_profile

pd.set_option('display.max_columns', None)
%matplotlib inline

def get_team_names(year):
    NYY_df = schedule_and_record(year, 'NYY')
    STL_df = schedule_and_record(year, 'STL')
    BOS_df = schedule_and_record(year, 'BOS')
    NYM_df = schedule_and_record(year, 'NYM')

    NYY_df_teams = NYY_df.Opp.unique()
    NYY_df_teams_list = list(NYY_df_teams)
    STL_df_teams = STL_df.Opp.unique()
    STL_df_teams_list = list(STL_df_teams)
    BOS_df_teams = BOS_df.Opp.unique()
    BOS_df_teams_list = list(BOS_df_teams)
    NYM_df_teams = NYM_df.Opp.unique()
    NYM_df_teams_list = list(NYM_df_teams)

    AL_team = NYY_df_teams_list + BOS_df_teams_list
    NL_team = STL_df_teams_list + NYM_df_teams_list
    # Since not every team plays every other team, we get opponents from 2 separate teams
    # and weed out duplicates
    all_team = AL_team + NL_team
    mlb_teams = set(all_team)
    return mlb_teams


def get_schedule_record_all_teams(year, team_names):
    empt_team_schedule_list = []
    for team in team_names:
        print(team)
        team_schedule = schedule_and_record(year, team)
        # Wait between requests so we don't get rate limited
        time.sleep(10)
        empt_team_schedule_list.append(team_schedule)
    schedule_df = pd.concat(empt_team_schedule_list)
    return schedule_df


def main(year):
    team_names = get_team_names(year)
    schedule_df = get_schedule_record_all_teams(year, team_names)
    return team_names, schedule_df


team_names, schedule_df = main(2010)

Honestly, I just ran into this so my account's been temp banned as well, but I'm gonna grab my other computer and attempt with the wait condition.
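The same wait condition can be pulled out of the loop and into a reusable decorator, so every call to a wrapped function is spaced out automatically. This is a rough sketch only: `throttled` is a hypothetical helper, not a pybaseball API and not necessarily how any fix in the repo works, and the 10-second interval in the usage comment is just the value from the example above, not a documented limit.

```python
import time
from functools import wraps


def throttled(min_interval, clock=time.monotonic, sleep=time.sleep):
    """Hypothetical decorator: keep at least `min_interval` seconds between calls.

    `clock` and `sleep` are injectable so the behavior can be tested
    without actually waiting.
    """
    def decorator(func):
        last_call = None  # time of the previous call, or None before the first

        @wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal last_call
            if last_call is not None:
                remaining = min_interval - (clock() - last_call)
                if remaining > 0:
                    sleep(remaining)
            last_call = clock()
            return func(*args, **kwargs)

        return wrapper
    return decorator


# Usage idea (assumes pybaseball is installed; interval is a guess):
#
#   from pybaseball import schedule_and_record
#
#   @throttled(10)
#   def fetch_schedule(year, team):
#       return schedule_and_record(year, team)
```

Unlike a fixed `time.sleep(10)` after every request, this only sleeps for whatever part of the interval has not already elapsed, so slow requests do not add extra delay.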
The good news is #296 took care of the rate limits (thanks @TheCleric). The bad news is now the FG error in #315 is causing its own failing tests (see: https://github.com/jldbc/pybaseball/actions/runs/4116782091/jobs/7107420647)
Closing per #318 and Bryan Peabody's great detective work.
We've got a bunch of tests that we need to address. Running locally I get:
test_amateur_draft.py
test_league_batting_stats.py
test_league_pitching_stats.py
test_playerid_lookup.py
test_standings.py
test_statcast_running.py
test_team_game_logs.py
test_team_results.py
Guessing it's just fixtures breaking or something like that. Better to get this resolved sooner rather than later.