nickpadd / EuropeanFootballLeaguePredictor

A machine learning/statistical model to derive prediction probabilities for football matches of the top European leagues.
https://nickpadd.github.io/EuropeanFootballLeaguePredictor/Home.html
MIT License
36 stars 8 forks

bookmakers #79

Closed. ProcuratorTitan closed this 1 month ago

ProcuratorTitan commented 2 months ago

If you are not connecting from Greece, the site https://en.stoiximan.gr/sport/soccer/ does not work. How can this be solved?

nickpadd commented 2 months ago

My guess is that using a VPN that masks your IP as a Greek one would provide access to the website, thus making the script work.

If this is not possible, the script continues and produces the figures without including the bookmaker data. I would help anyone willing to write a web scraper for their country's bookmaker in order to support the existing pipeline. I would also listen to any other suggestions!
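The fallback described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `fetch_odds_or_none`, `blocked_scraper`, and the `Odds` type are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Optional

Odds = Dict[str, float]  # hypothetical shape, e.g. {"HomeWinOdds": 2.1}


def fetch_odds_or_none(scrape: Callable[[], List[Odds]]) -> Optional[List[Odds]]:
    """Run a scraper callable and return None instead of raising if it fails."""
    try:
        return scrape()
    except Exception as exc:  # broad on purpose: in this sketch any scraping failure is non-fatal
        print(f"Bookmaker odds unavailable ({exc}); continuing without them.")
        return None


def blocked_scraper() -> List[Odds]:
    # Hypothetical stand-in for a scraper whose target site is unreachable from the user's region.
    raise ConnectionError("site not reachable from this region")


odds = fetch_odds_or_none(blocked_scraper)
if odds is None:
    print("Producing figures from model probabilities only.")
```

A scraper for another country's bookmaker could be passed in the same way, as long as it returns odds in the shape the rest of the pipeline expects.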

ProcuratorTitan commented 2 months ago

The program does not work if it does not have the odds. When making predictions I remain stuck at 09/15/2024; it does not move forward, so I assume the VPN is needed.

nickpadd commented 2 months ago

Can you please paste the error?

ProcuratorTitan commented 2 months ago

@MacBook-Pro-di EuropeanFootballLeaguePredictor % /Users/Downloads/aaa/bin/python /Users/Downloads/aaa/ciro/ciro/EuropeanFootballLeaguePredictor/run_predictions.py
2024-09-30 13:52:59.449 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:38 - Successfully loaded the config.yaml
2024-09-30 13:52:59.452 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/bookmaker.yaml
2024-09-30 13:52:59.453 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/data_co_uk.yaml
2024-09-30 13:52:59.457 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/fixture_download.yaml
2024-09-30 13:52:59.458 | INFO | __main__:main:37 - Configuration(league='Serie_A', regressor=<class 'sklearn.linear_model._glm.glm.PoissonRegressor'>, bettor_bank=60, bettor_kelly_cap=0.05, evaluation_output='europeanfootballleaguepredictor/data/leagues/Serie_A/evaluation/', months_of_form_list=[None, 3], database='europeanfootballleaguepredictor/data/database/Serie_A_database.db', seasons_to_gather=['2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024'], current_season='2024', data_co_uk_path='europeanfootballleaguepredictor/data/leagues/Serie_A/DataCoUkFiles/', bookmaker_url='https://en.stoiximan.gr/sport/soccer/italy/serie-a/1635/', bookmaker_dictionary={'US Lecce': 'Lecce', 'AC Milan': 'AC Milan', 'Juventus FC': 'Juventus', 'Cagliari Calcio': 'Cagliari', 'AC Monza': 'Monza', 'Torino FC': 'Torino', 'SSC Napoli': 'Napoli', 'Empoli FC': 'Empoli', 'Udinese Calcio': 'Udinese', 'Atalanta': 'Atalanta', 'AC Fiorentina': 'Fiorentina', 'Bologna FC': 'Bologna', 'SS Lazio': 'Lazio', 'AS Roma': 'Roma', 'Inter Milan': 'Inter', 'Frosinone': 'Frosinone', 'US Salernitana 1919': 'Salernitana', 'US Sassuolo Calcio': 'Sassuolo', 'Genoa CFC': 'Genoa', 'Hellas Verona': 'Verona'}, data_co_uk_url='https://www.football-data.co.uk/mmz4281/2425/I1.csv', data_co_uk_dictionary={'Milan': 'AC Milan', 'Spal': 'SPAL 2013', 'Parma': 'Parma Calcio 1913'}, fixture_download_url='https://fixturedownload.com/download/serie-a-2024-UTC.csv', fixture_download_dictionary={'Empoli': 'Empoli', 'Frosinone': 'Frosinone', 'Genoa': 'Genoa', 'Inter': 'Inter', 'Roma': 'Roma', 'Sassuolo': 'Sassuolo', 'Lecce': 'Lecce', 'Udinese': 'Udinese', 'Torino': 'Torino', 'Bologna': 'Bologna', 'Monza': 'Monza', 'Milan': 'AC Milan', 'Hellas Verona': 'Verona', 'Fiorentina': 'Fiorentina', 'Juventus': 'Juventus', 'Lazio': 'Lazio', 'Napoli': 'Napoli', 'Salernitana': 'Salernitana', 'Cagliari': 'Cagliari', 'Atalanta': 'Atalanta'}, voting_dict={'long_term': 0.6, 'short_term': 0.4}, matchdays_to_drop=10)
2024-09-30 13:52:59.528 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_ShortTermForm
2024-09-30 13:52:59.586 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_LongTermForm
2024-09-30 13:52:59.590 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_UpcomingShortTerm
2024-09-30 13:52:59.595 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_UpcomingLongTerm
2024-09-30 13:53:00.028 | INFO | __main__:main:48 -
                               Match_id        Date  HomeTeam  ...  Under1.5Probability  GGProbability  NGProbability
0  07f0ba49-a096-5240-a0ce-7261eb836eed  15/09/2024  Atalanta  ...                 0.22           0.57           0.43
1  78b38eb0-570a-5964-b75b-51b0d983f733  15/09/2024     Monza  ...                 0.34           0.44           0.56
2  b4497a07-8072-5d45-ab62-ef70afc48f2c  15/09/2024    Empoli  ...                 0.35           0.41           0.59
3  12c712c6-88b5-591a-b288-599eecec3d9b  15/09/2024    Torino  ...                 0.24           0.51           0.49
4  691bde2a-40bb-56f8-8a4e-9bf6c622a9d7  15/09/2024  Cagliari  ...                 0.24           0.56           0.44
5  05ba4bbe-f87a-510c-9479-97856f159581  15/09/2024     Genoa  ...                 0.34           0.45           0.55
6  3dc6c8fb-5910-52cd-86ce-ca3c251e319e  15/09/2024     Lazio  ...                 0.30           0.45           0.55

[7 rows x 26 columns]
Index(['Match_id', 'Date', 'HomeTeam', 'AwayTeam', 'HomeWinOdds', 'DrawOdds', 'AwayWinOdds', 'Line', 'OverLineOdds', 'UnderLineOdds', 'Yes', 'No', 'ScorelineProbability', 'HomeWinProbability', 'DrawProbability', 'AwayWinProbability', 'Over2.5Probability', 'Under2.5Probability', 'Over3.5Probability', 'Under3.5Probability', 'Over4.5Probability', 'Under4.5Probability', 'Over1.5Probability', 'Under1.5Probability', 'GGProbability', 'NGProbability'], dtype='object')

It is not an error as such; it just does not move forward in the dates, as the prediction always remains at 09/15.

nickpadd commented 2 months ago

I have examined the output above and it seems to work as expected. Here are the steps you should take:

  1. Run the updates script to gather the upcoming matches and update teams' statistics: python run_updates.py

  2. Run the predictions script to predict the now updated upcoming matches: python run_predictions.py

  3. The prediction figure and table for the league specified in the configuration file should be updated with the upcoming match predictions.

Please follow the above steps and let me know!
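The two steps above could also be chained in a small wrapper so that predictions are only run after a successful update. The sketch below is an assumption-laden illustration: `run_in_order` is a hypothetical helper, and only the script names `run_updates.py` and `run_predictions.py` come from the steps above.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_in_order(scripts: Sequence[str], runner: Callable = subprocess.run) -> None:
    """Run each script with the current interpreter, stopping at the first failure."""
    for script in scripts:
        print(f"Running {script} ...")
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit,
        # so a failed update is not silently followed by predictions on stale data.
        runner([sys.executable, script], check=True)


# Example call, to be run from the repository root:
# run_in_order(["run_updates.py", "run_predictions.py"])
```

The `runner` parameter exists only so the helper can be exercised without launching real processes; in normal use the default `subprocess.run` is enough.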

ProcuratorTitan commented 2 months ago

MBP-di EuropeanFootballLeaguePredictor % /Users/Downloads/aaa/bin/python /Users/Downloads/aaa/EuropeanFootballLeaguePredictor/run_updates.py
2024-10-03 05:51:45.647 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:38 - Successfully loaded the config.yaml
2024-10-03 05:51:45.650 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/bookmaker.yaml
2024-10-03 05:51:45.651 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/data_co_uk.yaml
2024-10-03 05:51:45.661 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/fixture_download.yaml
2024-10-03 05:51:45.664 | INFO | __main__:main:35 - Configuration(league='Serie_A', regressor=<class 'sklearn.linear_model._glm.glm.PoissonRegressor'>, bettor_bank=60, bettor_kelly_cap=0.05, evaluation_output='europeanfootballleaguepredictor/data/leagues/Serie_A/evaluation/', months_of_form_list=[None, 3], database='europeanfootballleaguepredictor/data/database/Serie_A_database.db', seasons_to_gather=['2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024'], current_season='2024', data_co_uk_path='europeanfootballleaguepredictor/data/leagues/Serie_A/DataCoUkFiles/', bookmaker_url='https://en.stoiximan.gr/sport/soccer/italy/serie-a/1635/', bookmaker_dictionary={'US Lecce': 'Lecce', 'AC Milan': 'AC Milan', 'Juventus FC': 'Juventus', 'Cagliari Calcio': 'Cagliari', 'AC Monza': 'Monza', 'Torino FC': 'Torino', 'SSC Napoli': 'Napoli', 'Empoli FC': 'Empoli', 'Udinese Calcio': 'Udinese', 'Atalanta': 'Atalanta', 'AC Fiorentina': 'Fiorentina', 'Bologna FC': 'Bologna', 'SS Lazio': 'Lazio', 'AS Roma': 'Roma', 'Inter Milan': 'Inter', 'Frosinone': 'Frosinone', 'US Salernitana 1919': 'Salernitana', 'US Sassuolo Calcio': 'Sassuolo', 'Genoa CFC': 'Genoa', 'Hellas Verona': 'Verona'}, data_co_uk_url='https://www.football-data.co.uk/mmz4281/2425/I1.csv', data_co_uk_dictionary={'Milan': 'AC Milan', 'Spal': 'SPAL 2013', 'Parma': 'Parma Calcio 1913'}, fixture_download_url='https://fixturedownload.com/download/serie-a-2024-UTC.csv', fixture_download_dictionary={'Empoli': 'Empoli', 'Frosinone': 'Frosinone', 'Genoa': 'Genoa', 'Inter': 'Inter', 'Roma': 'Roma', 'Sassuolo': 'Sassuolo', 'Lecce': 'Lecce', 'Udinese': 'Udinese', 'Torino': 'Torino', 'Bologna': 'Bologna', 'Monza': 'Monza', 'Milan': 'AC Milan', 'Hellas Verona': 'Verona', 'Fiorentina': 'Fiorentina', 'Juventus': 'Juventus', 'Lazio': 'Lazio', 'Napoli': 'Napoli', 'Salernitana': 'Salernitana', 'Cagliari': 'Cagliari', 'Atalanta': 'Atalanta'}, voting_dict={'long_term': 0.6, 'short_term': 0.4}, matchdays_to_drop=10)
Traceback (most recent call last):
  File "/Users/Downloads/aaa/EuropeanFootballLeaguePredictor/run_updates.py", line 81, in <module>
    main()
  File "/Users/Downloads/aaa/EuropeanFootballLeaguePredictor/run_updates.py", line 39, in main
    bookmaker_scraper = BookmakerScraper(url = config.bookmaker_url, dictionary = config.bookmaker_dictionary)
  File "/Users/Downloads/aaa/EuropeanFootballLeaguePredictor/europeanfootballleaguepredictor/data/bookmaker_scraper.py", line 48, in __init__
    self.driver = uc.Chrome(version_main = 127)
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/undetected_chromedriver/__init__.py", line 466, in __init__
    super(Chrome, self).__init__(
  File "/Users/Downloads/aaa//lib/python3.10/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/selenium/webdriver/chromium/webdriver.py", line 66, in __init__
    super().__init__(command_executor=executor, options=options)
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 212, in __init__
    self.start_session(capabilities)
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/undetected_chromedriver/__init__.py", line 724, in start_session
    super(selenium.webdriver.chrome.webdriver.WebDriver, self).start_session(
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 299, in start_session
    response = self.execute(Command.NEW_SESSION, caps)["value"]
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 354, in execute
    self.error_handler.check_response(response)
  File "/Users/Downloads/aaa/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: cannot connect to chrome at 127.0.0.1:64688
from session not created: This version of ChromeDriver only supports Chrome version 127
Current browser version is 129.0.6668.70
Stacktrace:
0   undetected_chromedriver   0x000000010eb868b8 undetected_chromedriver + 5179576
1   undetected_chromedriver   0x000000010eb7e2ea undetected_chromedriver + 5145322
2   undetected_chromedriver   0x000000010e6f52b0 undetected_chromedriver + 389808
3   undetected_chromedriver   0x000000010e7317b5 undetected_chromedriver + 636853
4   undetected_chromedriver   0x000000010e730990 undetected_chromedriver + 633232
5   undetected_chromedriver   0x000000010e7269c4 undetected_chromedriver + 592324
6   undetected_chromedriver   0x000000010e76f67e undetected_chromedriver + 890494
7   undetected_chromedriver   0x000000010e763553 undetected_chromedriver + 841043
8   undetected_chromedriver   0x000000010e7347f6 undetected_chromedriver + 649206
9   undetected_chromedriver   0x000000010e73505e undetected_chromedriver + 651358
10  undetected_chromedriver   0x000000010eb49b20 undetected_chromedriver + 4930336
11  undetected_chromedriver   0x000000010eb4ea36 undetected_chromedriver + 4950582
12  undetected_chromedriver   0x000000010eb4f105 undetected_chromedriver + 4952325
13  undetected_chromedriver   0x000000010eb2bee9 undetected_chromedriver + 4808425
14  undetected_chromedriver   0x000000010eb4f3f9 undetected_chromedriver + 4953081
15  undetected_chromedriver   0x000000010eb1d844 undetected_chromedriver + 4749380
16  undetected_chromedriver   0x000000010eb6e5c8 undetected_chromedriver + 5080520
17  undetected_chromedriver   0x000000010eb6e787 undetected_chromedriver + 5080967
18  undetected_chromedriver   0x000000010eb7dece undetected_chromedriver + 5144270
19  libsystem_pthread.dylib   0x00007ff81725d1d3 _pthread_start + 125
20  libsystem_pthread.dylib   0x00007ff817258bd3 thread_start + 15

This is the output of the update file.

@MBP-di EuropeanFootballLeaguePredictor % /Users/Downloads/aaa/bin/python /Users/Downloads/aaa/EuropeanFootballLeaguePredictor/run_predictions.py
2024-10-03 05:53:13.487 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:38 - Successfully loaded the config.yaml
2024-10-03 05:53:13.490 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/bookmaker.yaml
2024-10-03 05:53:13.491 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/data_co_uk.yaml
2024-10-03 05:53:13.496 | SUCCESS | europeanfootballleaguepredictor.common.config_parser:load_and_extract_yaml_section:47 - Successfully loaded europeanfootballleaguepredictor/data/leagues/Serie_A/dictionaries/fixture_download.yaml
2024-10-03 05:53:13.497 | INFO | __main__:main:37 - Configuration(league='Serie_A', regressor=<class 'sklearn.linear_model._glm.glm.PoissonRegressor'>, bettor_bank=60, bettor_kelly_cap=0.05, evaluation_output='europeanfootballleaguepredictor/data/leagues/Serie_A/evaluation/', months_of_form_list=[None, 3], database='europeanfootballleaguepredictor/data/database/Serie_A_database.db', seasons_to_gather=['2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024'], current_season='2024', data_co_uk_path='europeanfootballleaguepredictor/data/leagues/Serie_A/DataCoUkFiles/', bookmaker_url='https://en.stoiximan.gr/sport/soccer/italy/serie-a/1635/', bookmaker_dictionary={'US Lecce': 'Lecce', 'AC Milan': 'AC Milan', 'Juventus FC': 'Juventus', 'Cagliari Calcio': 'Cagliari', 'AC Monza': 'Monza', 'Torino FC': 'Torino', 'SSC Napoli': 'Napoli', 'Empoli FC': 'Empoli', 'Udinese Calcio': 'Udinese', 'Atalanta': 'Atalanta', 'AC Fiorentina': 'Fiorentina', 'Bologna FC': 'Bologna', 'SS Lazio': 'Lazio', 'AS Roma': 'Roma', 'Inter Milan': 'Inter', 'Frosinone': 'Frosinone', 'US Salernitana 1919': 'Salernitana', 'US Sassuolo Calcio': 'Sassuolo', 'Genoa CFC': 'Genoa', 'Hellas Verona': 'Verona'}, data_co_uk_url='https://www.football-data.co.uk/mmz4281/2425/I1.csv', data_co_uk_dictionary={'Milan': 'AC Milan', 'Spal': 'SPAL 2013', 'Parma': 'Parma Calcio 1913'}, fixture_download_url='https://fixturedownload.com/download/serie-a-2024-UTC.csv', fixture_download_dictionary={'Empoli': 'Empoli', 'Frosinone': 'Frosinone', 'Genoa': 'Genoa', 'Inter': 'Inter', 'Roma': 'Roma', 'Sassuolo': 'Sassuolo', 'Lecce': 'Lecce', 'Udinese': 'Udinese', 'Torino': 'Torino', 'Bologna': 'Bologna', 'Monza': 'Monza', 'Milan': 'AC Milan', 'Hellas Verona': 'Verona', 'Fiorentina': 'Fiorentina', 'Juventus': 'Juventus', 'Lazio': 'Lazio', 'Napoli': 'Napoli', 'Salernitana': 'Salernitana', 'Cagliari': 'Cagliari', 'Atalanta': 'Atalanta'}, voting_dict={'long_term': 0.6, 'short_term': 0.4}, matchdays_to_drop=10)
2024-10-03 05:53:13.627 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_ShortTermForm
2024-10-03 05:53:13.684 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_LongTermForm
2024-10-03 05:53:13.688 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_UpcomingShortTerm
2024-10-03 05:53:13.693 | INFO | europeanfootballleaguepredictor.data.database_handler:get_data:50 - Data fetched for table: Preprocessed_UpcomingLongTerm
2024-10-03 05:53:14.092 | INFO | __main__:main:48 -
                               Match_id        Date  HomeTeam  ...  Under1.5Probability  GGProbability  NGProbability
0  07f0ba49-a096-5240-a0ce-7261eb836eed  15/09/2024  Atalanta  ...                 0.22           0.57           0.43
1  78b38eb0-570a-5964-b75b-51b0d983f733  15/09/2024     Monza  ...                 0.34           0.44           0.56
2  b4497a07-8072-5d45-ab62-ef70afc48f2c  15/09/2024    Empoli  ...                 0.35           0.41           0.59
3  12c712c6-88b5-591a-b288-599eecec3d9b  15/09/2024    Torino  ...                 0.24           0.51           0.49
4  691bde2a-40bb-56f8-8a4e-9bf6c622a9d7  15/09/2024  Cagliari  ...                 0.24           0.56           0.44
5  05ba4bbe-f87a-510c-9479-97856f159581  15/09/2024     Genoa  ...                 0.34           0.45           0.55
6  3dc6c8fb-5910-52cd-86ce-ca3c251e319e  15/09/2024     Lazio  ...                 0.30           0.45           0.55

[7 rows x 26 columns]
Index(['Match_id', 'Date', 'HomeTeam', 'AwayTeam', 'HomeWinOdds', 'DrawOdds', 'AwayWinOdds', 'Line', 'OverLineOdds', 'UnderLineOdds', 'Yes', 'No', 'ScorelineProbability', 'HomeWinProbability', 'DrawProbability', 'AwayWinProbability', 'Over2.5Probability', 'Under2.5Probability', 'Over3.5Probability', 'Under3.5Probability', 'Over4.5Probability', 'Under4.5Probability', 'Over1.5Probability', 'Under1.5Probability', 'GGProbability', 'NGProbability'], dtype='object')

This is the output of run_predictions.py. Anyway, you did a great job, kudos to you.

Allen-Zengr commented 2 months ago

The limitations of the en.stoiximan.gr website are too great, and it requires a Greek address. You might consider switching to www.soccer-rating.com instead. :)

nickpadd commented 2 months ago

@ProcuratorTitan I have updated the script, as it seemed to be hitting a browser version issue. I think following these steps will fix the problem:

  1. Pull the changes: git pull

  2. Run updates: python run_updates.py

  3. Run predictions: python run_predictions.py

Please let me know if it is fixed, as I think this was the issue.
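For anyone hitting the same SessionNotCreatedException, the browser's major version can be read straight out of the error message posted earlier in this thread. The snippet below is only a sketch: `browser_major_from_error` is a hypothetical helper, and it does not claim to be how the repository's fix was implemented.

```python
import re
from typing import Optional


def browser_major_from_error(message: str) -> Optional[int]:
    """Extract the installed Chrome major version from a ChromeDriver mismatch message."""
    match = re.search(r"Current browser version is (\d+)\.", message)
    return int(match.group(1)) if match else None


# Message text copied from the traceback earlier in this thread.
message = (
    "session not created: This version of ChromeDriver only supports "
    "Chrome version 127 Current browser version is 129.0.6668.70"
)
major = browser_major_from_error(message)
print(major)

# With undetected_chromedriver installed, this value could replace the hard-coded
# version seen in the traceback, e.g. uc.Chrome(version_main=major).
```

The traceback shows the driver being created with `uc.Chrome(version_main = 127)`, so a mismatch appears whenever the locally installed Chrome moves on to a newer major version.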

ProcuratorTitan commented 2 months ago

Do I need to download the entire master file again?

nickpadd commented 2 months ago

If you run git pull it should be fine.

If you want to be sure, you can download it again from scratch.

nickpadd commented 1 month ago

If it still does not update, you can refer to #81, an issue about the upcoming matches not updating. That is separate from scraping the bookmaker, though; the upcoming matches should update anyway.