jldbc / pybaseball

Pull current and historical baseball statistics using Python (Statcast, Baseball Reference, FanGraphs)
MIT License

playerid_lookup and _reverse_lookup failing with a pandas Usecols ValueError #321

Closed: cqstanford closed this issue 1 year ago

cqstanford commented 1 year ago

Hello!

I ran part of a project that uses playerid_lookup for the first time in a few months, and the call now fails. I tried the simplest form (below) and got the following output.

from pybaseball import playerid_lookup

print(playerid_lookup("jones"))
Gathering player lookup table. This may take a moment.
Traceback (most recent call last):
  File "c:\Users\retro\Desktop\Atom\ballin\statcast-outdated\names.py", line 18, in <module>
    print(playerid_lookup("jones"))
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\playerid_lookup.py", line 175, in playerid_lookup
    client = _get_client()
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\playerid_lookup.py", line 161, in _get_client
    _client = _PlayerSearchClient()
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\playerid_lookup.py", line 79, in __init__
    self.table = get_lookup_table()
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\playerid_lookup.py", line 52, in get_lookup_table
    table = chadwick_register(save)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\cache\cache.py", line 58, in _cached
    result = func(*args, **kwargs)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pybaseball\playerid_lookup.py", line 33, in chadwick_register
    table = pd.read_csv(io.StringIO(s.decode('utf-8')), usecols=cols_to_keep)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 934, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\readers.py", line 1236, in _make_engine
    return mapping[engine](f, **self.options)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 131, in __init__
    self._validate_usecols_names(usecols, self.orig_names)
  File "C:\Users\retro\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\io\parsers\base_parser.py", line 913, in _validate_usecols_names
    raise ValueError(
ValueError: Usecols do not match columns, columns expected but not found: ['key_bbref', 'name_last', 'key_fangraphs', 'mlb_played_first', 'mlb_played_last', 'key_mlbam', 'name_first', 'key_retro']

I can't tell whether something changed on the pybaseball backend or in what pandas requires, but I haven't found anything I can change on my end to get it working. Could this be related to the other player ID issue that tjburch posted yesterday? If the problem is on my system, I'd love advice on how to fix it. Thank you for your time!
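
For anyone hitting the same message: pandas raises this ValueError whenever a name passed to usecols is absent from the CSV header it actually parsed. The sketch below reproduces that with a small made-up CSV (the `csv_text` contents and the `read_strict` / `missing_columns` helpers are hypothetical, not pybaseball code or the real register) and lists which requested columns the header lacks.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded table; NOT the real register data.
csv_text = "person_id,last,first\n1,Jones,Andruw\n"
wanted = ["name_last", "name_first"]


def read_strict(text, cols):
    """Read the CSV keeping only `cols`; pandas raises if any are missing."""
    return pd.read_csv(io.StringIO(text), usecols=cols)


def missing_columns(text, cols):
    """Return the requested columns that do not appear in the CSV header."""
    header = pd.read_csv(io.StringIO(text), nrows=0).columns
    return sorted(set(cols) - set(header))


error_message = None
try:
    read_strict(csv_text, wanted)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Inspect the header before blaming pandas or the local environment.
print(missing_columns(csv_text, wanted))
```

If every requested column shows up as missing, as in the traceback above, the parsed content probably does not have the header the code expects, which points at the data source rather than the local setup.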

tjburch commented 1 year ago

Yeah, it's resolved per #309. You can either wait until the next release or run the repo version (clone the repo and install it locally).
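
After installing from a local clone, it can help to confirm which copy Python actually imports. A minimal sketch using only the standard library follows; `describe_install` is a hypothetical helper name, and it assumes the distribution name and the import name are the same.

```python
from importlib import metadata, util


def describe_install(package):
    """Return (version, file location) for a package, or (None, None) if absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None, None
    spec = util.find_spec(package)
    location = spec.origin if spec else None
    return version, location


# Shows the installed version and the path the import resolves to.
print(describe_install("pybaseball"))
```

A location under site-packages suggests a released wheel, while a path inside your clone suggests an editable local install.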

cqstanford commented 1 year ago

Ah, I see. Will do, thank you so much!