Open daniperaleda opened 11 months ago
Yes, it outputs an error for me too:
(TorchEnv) PS C:\Users\Анатолий\Documents\GitHub> & C:/Users/Анатолий/source/repos/PyTorchtest/PyTorchtest/TorchEnv/Scripts/python.exe c:/Users/Анатолий/Documents/GitHub/scrapeOP/FinalScraper.py
C:\Users\Анатолий\Documents\GitHub
Data will be saved in the following directory: C:\Users\Анатолий\Documents\GitHub
Please indicate the format of tournament (3 sets or 5 sets) :
Please indicate the surface :
We start to scrape the following tournament : charleston-challenger-men
Traceback (most recent call last):
File "c:\Users\Анатолий\Documents\GitHub\scrapeOP\FinalScraper.py", line 14, in
I edited this part of the code, and then the error stopped.
def scrape_current_tournament_typeC(sport, tournament, country, SEASON, max_page = 25):
    global driver
    ############### NOW WE SEEK TO SCRAPE THE ODDS AND MATCH INFO ################################
    DATA_ALL = []
    for page in range(1, max_page + 1):
        print('We start to scrape the page n°{}'.format(page))
        try:
            driver.quit() # close all windows
        except:
            pass
        driver = webdriver.Chrome()
        data = scrape_page_typeC(page, sport, country, tournament, SEASON)
        DATA_ALL = DATA_ALL + [y for y in data if y is not None]
        driver.close()

    data_df = pd.DataFrame(DATA_ALL)
    try:
        data_df.columns = ['TeamsRaw', 'Bookmaker', 'OddHome','OddDraw', 'OddAway', 'DateRaw' ,'ScoreRaw']
    except:
        print('Function crashed, probable reason : no games scraped (empty season)')
        return(1)

    ##################### FINALLY WE CLEAN THE DATA AND SAVE IT ##########################
    '''Now we simply need to split team names, transform date, split score'''
    data_df = data_df[~data_df['Bookmaker'].isnull()].dropna().reset_index()
    data_df["TO_KEEP"] = 1
    for i in range(len(data_df["TO_KEEP"])):
        if len(re.split(':',data_df["ScoreRaw"][i]))<2 :
            # .loc avoids chained assignment, which may silently write to a copy
            data_df.loc[i, "TO_KEEP"] = 0
    data_df = data_df[data_df["TO_KEEP"] == 1]
    # (a) Split team names
    data_df["Home_id"] = [re.split(' - ',y)[0] for y in data_df["TeamsRaw"]]
    data_df["Away_id"] = [re.split(' - ',y)[1] for y in data_df["TeamsRaw"]]
    # (b) Transform date
    data_df["Date"] = [re.split(', ',y)[1] for y in data_df["DateRaw"]]
    # (c) Split score
    data_df["Score_home"] = [re.split(':',y)[0][-2:] for y in data_df["ScoreRaw"]]
    data_df["Score_away"] = [re.split(':',y)[1][:2] for y in data_df["ScoreRaw"]]
    # (e) Set season column
    data_df["Season"] = SEASON
    # Finally we save results
    if not os.path.exists('./{}_FULL'.format(tournament)):
        os.makedirs('./{}_FULL'.format(tournament))
    if not os.path.exists('./{}'.format(tournament)):
        os.makedirs('./{}'.format(tournament))
    data_df.to_csv('./{}_FULL/{}_{}_FULL.csv'.format(tournament,tournament, SEASON), sep=';', encoding='utf-8', index=False)
    data_df[['Home_id', 'Away_id', 'Bookmaker', 'OddHome','OddDraw', 'OddAway', 'Date', 'Score_home', 'Score_away','Season']].to_csv('./{}/{}_{}.csv'.\
        format(tournament,tournament, SEASON), sep=';', encoding='utf-8', index=False)
    return(data_df)
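In case it helps with debugging the cleaning step, here is a rough sketch of the same splitting rules applied to a single raw row, using only the standard library so it can be tried without running the scraper. The function name `parse_row` and the sample strings are made up for illustration; I am only assuming the raw fields look roughly like what the regex splits expect.

```python
import re

def parse_row(teams_raw, date_raw, score_raw):
    """Apply the cleaning step's splitting rules to one raw row.

    Returns None when the score contains no ':' (the rows the loop
    marks with TO_KEEP = 0), otherwise a dict of the cleaned fields.
    """
    score_parts = re.split(':', score_raw)
    if len(score_parts) < 2:
        return None
    teams = re.split(' - ', teams_raw)
    return {
        "Home_id": teams[0],
        "Away_id": teams[1],
        "Date": re.split(', ', date_raw)[1],
        "Score_home": score_parts[0][-2:],
        "Score_away": score_parts[1][:2],
    }

# Hypothetical sample strings, not real scraped output
print(parse_row("Player A - Player B", "Yesterday, 02 Apr 2023", "2:0"))
print(parse_row("Player A - Player B", "Yesterday, 02 Apr 2023", "canc."))
```

Running a few real `TeamsRaw`, `DateRaw` and `ScoreRaw` values through a helper like this should show quickly whether a row would be dropped or would raise an IndexError in the list comprehensions.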
I also got an error in this part, so I just commented it out.
# Reject ads
# ffi2('//*[@id="onetrust-reject-all-handler"]')
# if switch_to_decimal:
#     # Change odds to decimal format
#     driver.find_element("xpath", '//*[@id="user-header-oddsformat-expander"]').click()
#     driver.find_element("xpath", '//*[@id="user-header-oddsformat"]/li[1]/a/span').click()
Some of the function definitions also call reject_ads, which I commented out as well.
Let me know if it helps you :)
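A possible alternative to commenting those clicks out is to click the element only when it is actually on the page. This is just a sketch: `click_if_present` is a made-up helper name, and I am assuming the errors came from the element not being found. It relies on `find_elements`, which returns an empty list instead of raising when nothing matches.

```python
def click_if_present(driver, xpath):
    """Click the first element matching xpath, or do nothing if none exists.

    Returns True when an element was clicked, False when nothing matched.
    """
    # find_elements (plural) returns an empty list rather than raising
    matches = driver.find_elements("xpath", xpath)
    if not matches:
        return False
    matches[0].click()
    return True
```

It could then be called as `click_if_present(driver, '//*[@id="onetrust-reject-all-handler"]')`. The click itself can still fail for other reasons (for example, if the element is hidden), so this only covers the case where the banner is missing.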
Hi:
I have tried to use the code, but I don't know whether it is still working for you.
I am having trouble getting started with it.
The main error I get is "TypeError: WebDriver.__init__() got an unexpected keyword argument 'executable_path'"
It seems to be a problem in the code, but after searching Google I have not been able to fix it.
Thanks in advance
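For what it's worth, this TypeError is what Selenium 4.10 and later raise when `executable_path` is passed straight to `webdriver.Chrome(...)`, because that keyword was removed from the constructor. Below is a minimal sketch of the two usual replacements, assuming Chrome is the browser in use. `make_chrome_driver` and its `driver_path` parameter are hypothetical names, not part of scrapeOP.

```python
def make_chrome_driver(driver_path=None):
    """Create a Chrome WebDriver in the Selenium 4 style.

    driver_path is an optional path to a chromedriver binary
    (hypothetical parameter, used here only for illustration).
    """
    # Imported here so the sketch can be pasted without selenium loaded yet
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if driver_path is None:
        # Selenium 4.6+ bundles Selenium Manager, which locates a driver itself
        return webdriver.Chrome()
    # The driver path now goes through a Service object, not the constructor
    return webdriver.Chrome(service=Service(executable_path=driver_path))
```

In the scraper, the idea would be to replace each `webdriver.Chrome(executable_path = ...)` call with one of these two forms. Which one fits depends on whether you want to keep pointing at a specific chromedriver file.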