colinrsmall / ehm_roster_tools

Notebooks and tools for scraping sites for EHM rosters and facepics.

Roster Scraper: Issue pulling club playing/contracted data for 2021-22 contracts #23

Closed: xECK29x closed this issue 2 years ago

xECK29x commented 2 years ago

It looks like the roster scraper is having issues pulling players' contracted/playing club details for 2021-22 for some of the Slovak leagues. I have not tried any others, but it could be due to a recent page element change.

Attempting to pull contracts for:

leagues: https://www.eliteprospects.com/league/slovakia2,https://www.eliteprospects.com/league/slovakia3,https://www.eliteprospects.com/league/slovakia-u20,https://www.eliteprospects.com/league/slovakia-u18

season: 2021-22
contract_expiry_prefix: 30.4.XXXX
show_error_links:
make_junior_contracts_u20:
scrape_international_games:
skip_players_with_blank_dobs:
use_google_drive:
calculate_remaining_eligible_years:
leagues_for_eligibility: USports NCAA ACHA
nhl_contracts: Skip players with NHL contracts
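For anyone reproducing this, the `leagues` value above is a single comma-separated string of league URLs. A minimal sketch of splitting it into individual leagues follows; `parse_leagues` and `league_slug` are hypothetical helper names for illustration, not functions from the scraper.

```python
from urllib.parse import urlparse


def parse_leagues(raw):
    """Split a comma-separated `leagues` value into a list of clean URLs."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def league_slug(url):
    """Return the last path segment of a league URL (hypothetical helper)."""
    return urlparse(url).path.rstrip("/").split("/")[-1]


# The exact value from the report above.
raw_leagues = (
    "https://www.eliteprospects.com/league/slovakia2,"
    "https://www.eliteprospects.com/league/slovakia3,"
    "https://www.eliteprospects.com/league/slovakia-u20,"
    "https://www.eliteprospects.com/league/slovakia-u18"
)

league_urls = parse_leagues(raw_leagues)
league_slugs = [league_slug(url) for url in league_urls]
```

Running one league at a time this way may help narrow down which of the four triggers the errors.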

xECK29x commented 2 years ago

Still having notable issues for some clubs/leagues; Slovakia 2.Liga is one: https://www.eliteprospects.com/league/slovakia3

Most of the errors appear to be the same:

Missing player information for: https://www.eliteprospects.com/player/177070/roman-kadlecik
Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 345, in scrape_player_page
    contract_expiry, contracted_team, join_date = get_contracted_team(player_page)
  File "", line 242, in get_contracted_team
    raise e
  File "", line 219, in get_contracted_team
    contracted_team = transfer.select(".to > a")[0]['href']
UnboundLocalError: local variable 'transfer' referenced before assignment
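An `UnboundLocalError` like this one generally means `transfer` is only assigned inside a loop or branch that never ran for this player, for example when no matching transfer row exists on the page. The sketch below reproduces that failure mode and shows one possible guard. It uses plain dicts in place of parsed page elements, and the function names, the `confirmed` key, and the `to_href` key are all assumptions for illustration, not the scraper's actual code.

```python
def pick_contracted_team_unsafe(transfer_rows):
    # Hypothetical reconstruction: `transfer` is bound only if a row matches.
    for row in transfer_rows:
        if row.get("confirmed"):
            transfer = row
            break
    # Raises UnboundLocalError when the list is empty or nothing matched.
    return transfer["to_href"]


def pick_contracted_team_safe(transfer_rows):
    # Bind the name up front so "no match" is an explicit, checkable state.
    transfer = None
    for row in transfer_rows:
        if row.get("confirmed"):
            transfer = row
            break
    if transfer is None:
        return None  # caller decides how to handle a missing contract
    return transfer["to_href"]
```

With the guarded version, a player with no matching transfer yields `None` instead of aborting the row, which the caller could then log or leave blank.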

colinrsmall commented 2 years ago

Just ran on Slovakia3, but I'm not seeing this issue on the scraper. Can you double check?

xECK29x commented 2 years ago

Getting an error on every player:

Players: 1% 4/466 [00:23<44:45, 5.81s/it]
Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = player_page.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/227338/matej-crkon
Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = player_page.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/343233/tomas-hrusik
Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = player_page.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/557412/daniel-kmosena
Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = player_page.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/653051/filip-kubiridzak
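The `AttributeError` here is what a chained `.find(...).text` produces when the lookup matches nothing and returns `None`, which would be consistent with the page no longer containing an element with that class. A minimal sketch of a guard follows; `element_text` is a hypothetical helper, and `SimpleNamespace` stands in for a parsed element so the example runs without the scraping library installed.

```python
from types import SimpleNamespace


def element_text(element, default=""):
    """Return the stripped text of a parsed element, or `default` if the
    lookup that produced it found nothing (i.e. returned None)."""
    if element is None:
        return default
    return element.text.strip()


# Stand-in for a successful lookup: any object with a `.text` attribute.
found = SimpleNamespace(text="  Matej Crkon  ")
# Stand-in for a failed lookup, which yields None.
missing = None

name_when_found = element_text(found)
name_when_missing = element_text(missing, default="")
```

A guard like this turns a crash on every player into an empty name that can be reported once, which makes a changed page layout easier to spot.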

xECK29x commented 2 years ago

Working with latest code push!