BurntSushi / nfldb

A library to manage and update NFL data in a relational database.
The Unlicense
1.07k stars 263 forks source link

GSIS ID 2019090808 (CIN @ SEA) stored incomplete JSON to database, unable to replace #312

Closed JoshADHD closed 4 years ago

JoshADHD commented 4 years ago

Initially, nfldb/nflgame pulled in a busted JSON file for the week 1 game between CIN and SEA... this malformed data was pushed to the database and now when I query play_player, nothing is returned.

I have run, in order:

update_sched.py --year 2019 nfldb-update --update-schedules nfldb-update

This refreshed the JSON file inside nflgame/gamecenter-json just fine, but the database tables appear unaffected.

I have also attempted to simulate the game in order to restore it to the database (as recommended in other issues):

nfldb-update --simulate 2019090808

However, this returns an error that I cannot sort out.

Traceback (most recent call last):
File "/usr/local/bin/nfldb-update", line 39, in <module>
    nfldb.update.run(**vars(args))
File "/usr/local/lib/python2.7/dist-packages/nfldb/update.py", line 458, in run
    games = sorted(q.as_games(), key=lambda g: g.gsis_id)
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 793, in as_games
    q = self._make_join_query(cursor, types.Game)
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 762, in _make_join_query
    'where': sql.ands(self._sql_where(cursor)),
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 1031, in _sql_where
    aliases=aliases, aggregate=aggregate)
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 292, in _disjunctions
    ds.append(' AND '.join('(%s)' % sql(c) for c in conjuncts))
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 292, in <genexpr>
    ds.append(' AND '.join('(%s)' % sql(c) for c in conjuncts))
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 289, in sql
    return c._sql_where(cursor, aliases=aliases, aggregate=aggregate)
File "/usr/local/lib/python2.7/dist-packages/nfldb/query.py", line 352, in _sql_where
    'Disjunctions must use "=" for column "%s"' % field
AssertionError: Disjunctions must use "=" for column "game.gsis_id"
ochawkeye commented 4 years ago

I notice this on occasion.

What I do is delete the game from the database with something like this:

import nfldb
db = nfldb.connect()

q = nfldb.Query(db)
# q.game(season_year=2019, season_type='Regular', week=1)
#  gsis_ids = [game.gsis_id for game in q.as_games()]
gsis_ids = ['2019090808']

for gsis_id in gsis_ids:
    query = "DELETE FROM game where gsis_id = '{}';".format(gsis_id)
    with nfldb.Tx(db) as cursor:
        cursor.execute(query)  

Then locate and delete the game from your ..\Lib\site-packages\nflgame\gamecenter-json.

I actually do this for all games the morning after games have been played since there have been so many occasions where play-by-play has mistakes at time of the game ending (usually for plays where reviewed occurred?) but once the games end nflgame no longer goes back to check for updates.

brbeaird commented 4 years ago

Yep, I had this same scenario for the IND/LAC game. Clearing out the JSON files and DB tables and doing a fresh update fixes it. I'll probably just write a script to do it like @ochawkeye mentioned.

On a related note, are you guys using the Python3 version of nflgame with nfldb? I see where it's supposed to be a "drop in replacement" for the Python2 version, but I'm not sure how that would work as far as pointing nfldb to the right version. I've not gotten deep enough into Python to know if nfldb would need to have its dependencies updated somehow...wouldn't pip install just fall back to the Python2.7 version otherwise?

JoshADHD commented 4 years ago

That little script did the trick. Thanks, @ochawkeye!

I've stuck with the 2.7 version of nflgame-redux for the time being. It's not broken (for the most part) so I see no reason to move at this time.

ochawkeye commented 4 years ago

On a related note, are you guys using the Python3 version of nflgame with nfldb? I see where it's supposed to be a "drop in replacement" for the Python2 version

I'm not and at this point probably won't until I absolutely have to. Most of my scripts were written in the infancy of my Python programming experience and are entirely too ugly for me to try to step through now.

When the day comes that I move to Python 3, I'll have to rewrite those 6 year old scripts from the ground up.

It's just easier for me to maintain the house of cards that rebuild from scratch for now.