colinrsmall / ehm_roster_tools

Notebooks and tools for scraping sites for EHM rosters and facepics.

Roster Scraper: -Draft- Not pulling data unless player is under contract #24

Closed flinch44 closed 2 years ago

flinch44 commented 2 years ago

Previous versions pulled everyone drafted, but the most recent patch pulls only players who currently have a 2021-22 contract, regardless of the options selected.

xECK29x commented 2 years ago

This may be related, but attempting to scrape the WHL US Prospect Draft (2021) fails with nothing but errors:

leagues = "https://www.eliteprospects.com/draft/whl-us-prospect-draft/2021" #@param {type:"string"}
leagues = leagues.split(',')

season = "2021-22" #@param {type:"string"}
contract_expiry_prefix = "30.4.XXXX" #@param {type:"string"}
show_error_links = True #@param {type:"boolean"}
make_junior_contracts_to_age_20 = False #@param {type:"boolean"}
scrape_international_games = True #@param {type:"boolean"}
skip_players_with_blank_dobs = True #@param {type:"boolean"}
use_google_drive = False #@param {type:"boolean"}
calculate_remaining_eligible_years = False #@param {type:"boolean"}
override_contract_for_nhl_prospects = False #@param {type:"boolean"}
leagues_for_eligibility = "USports NCAA ACHA" #@param {type:"string"}
nhl_contracts = 'Include and set NHL team to playing' #@param ["Skip players with NHL contracts", "Include and set NHL team to playing", "Keep current team as playing"]

Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = playerpage.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/704492/hayden-hastings

Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = playerpage.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/784832/tyler-atchison

colinrsmall commented 2 years ago

Fixed the issue with the WHL draft. I think it was just caused by EP changing their site's HTML.
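For anyone hitting the same traceback: the `AttributeError` is what you get when BeautifulSoup's `find` matches nothing and returns `None`, and `.text` is then called on that `None`. Below is a minimal sketch of a more defensive lookup. It is not the fix that was actually pushed; `get_name_safe` is a hypothetical name, the class name is copied from the traceback and may not match the live site, and splitting the name on the first space is an assumption.

```python
from typing import Optional, Tuple


def get_name_safe(player_page) -> Optional[Tuple[str, str]]:
    """Return (first_name, last_name), or None if the name element is missing.

    `player_page` is expected to be a BeautifulSoup-style object whose
    `find` method returns None when nothing matches.
    """
    # Class name taken from the traceback; it may be out of date on the live site.
    header = player_page.find('div', class_='ep-entity-header__name')
    if header is None:
        # Markup changed or the page did not load: signal the caller to skip.
        return None

    full_name = header.get_text(strip=True)
    # Assumption: the first token is the first name, the rest is the last name.
    first_name, _, last_name = full_name.partition(' ')
    return first_name, last_name
```

With something like this, the calling code could check for `None`, report the player link as missing, and move on, so a change in the site's markup produces one clear message per player instead of a traceback.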

@flinch44 It looks like the draft scraper is correctly pulling players without 2021-22 contracts since the WHL draft looks to be working. Do you have an example draft that it's not working for that I can try out?

xECK29x commented 2 years ago

Was a fix pushed to prod for testing? I'm still getting an error on each player:

Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = playerpage.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/784826/carson-mcginley

Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = playerpage.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/750629/tyler-mcgowan

Traceback (most recent call last):
  File "", line 60, in scrape
    first_name, last_name, team, league, dob, birth_place, primary_nation, secondary_nation, declared_nation, position, height, weight, shoots, contract_expiry, contracted_team, join_date, intl_games, intl_g, intl_a = scrape_player_page(link)
  File "", line 344, in scrape_player_page
    first_name, last_name = get_name(player_page)
  File "", line 16, in get_name
    name = name = playerpage.find('div', class_='ep-entity-header__name').text.strip()
AttributeError: 'NoneType' object has no attribute 'text'
Missing player information for: https://www.eliteprospects.com/player/597781/trevor-connelly

colinrsmall commented 2 years ago

Whoops, I definitely forgot to push the fixes to GitHub. Should be fixed now.