derek-adair / nflgame

A working snapshot of nflgame (for historic purposes). This project is no longer active.
http://nflgame.derekadair.com
The Unlicense
331 stars 100 forks

PSA - Nfl data feed is down #130

Closed derek-adair closed 3 years ago

derek-adair commented 4 years ago

This renders nflgame useless at the moment. I will be looking into how we can make historic data usable in the worst case scenarios like this.

derek-adair commented 4 years ago

Just so no one panics: there is a light at the end of the tunnel and a way out of this for us. There are some potentially huge breaking changes, though. There is a very real possibility of removing all HTTP requests to nfl.com and relying only on data already downloaded.

derek-adair commented 4 years ago

Unlocking to invite conversation on this subject.

I've also released version 3.0.0 which will work with historic data.

andrew-shackelford commented 4 years ago

Hey,

First off: really sorry to hear that the NFL API is essentially dead, and I think I speak for everyone who's used the library in saying thanks for your work on fixes to get it working with historic data as a stopgap.

As for moving forward, I'd be happy to help with measures to scrape the NFL website. I've got a decent amount of time to work on them given the COVID situation, and am especially invested in the live play-by-play data as I use that for a Twitter bot I run (although am happy to help on roster data or anything else too).

ThomasMorgani commented 4 years ago

I am also looking to work on this. I rely heavily on the live stats for my fantasy football league app and would really like to get this back up before the start of the season.

I am mostly JS/PHP code wise but have experience web scraping and should have no problem adapting and collaborating.

I will probably start digging around sometime soon. Is there somewhere anyone is willing to work together or should we just keep it all here?

JimHewitt commented 4 years ago

The real value of the NFL feeds was the statids included with each play. It would be difficult to derive that info from the play by play text.

tomweingarten commented 4 years ago

I haven't used the new API yet. But if I understand correctly, it looks like the playStatType enum takes the place of statids in the new API? It looks to me like it should be fairly easy to convert between the two. I'd bet on the backend they use the same integer values to represent the statids, if we're lucky they might expose them in the API.
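If that hunch is right, translating between the two could be as simple as a lookup table. This is a purely hypothetical sketch: the `playStatType` enum names and the statId pairings below are illustrative guesses, not values confirmed from the API.

```python
# Hypothetical mapping between new-API playStatType enum names and legacy
# GSIS statIds. The pairings below are illustrative guesses, NOT confirmed
# values from the NFL API.
PLAY_STAT_TYPE_TO_STAT_ID = {
    "RUSHING_YARDS": 10,
    "PASSING_YARDS": 15,
    "RECEIVING_YARDS": 21,
}
# Reverse lookup, for going from old statIds back to the new enum names.
STAT_ID_TO_PLAY_STAT_TYPE = {v: k for k, v in PLAY_STAT_TYPE_TO_STAT_ID.items()}

def to_stat_id(play_stat_type: str) -> int:
    """Translate a new-API enum name to a legacy statId, if known."""
    return PLAY_STAT_TYPE_TO_STAT_ID[play_stat_type]
```

If the backend really does share the integer values, filling in this table would just be a matter of comparing one game's output from both APIs.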

justin-haight commented 4 years ago

I ended up here in search of info when I realized the old NFL API was removed. Since I haven't used nflgame (yet) I'm not sure if this will be useful to you but today I did figure out how to use their new API to retrieve schedules, scores, and I think at least some stats. It looks like they're using a revised version of the API they documented in ~2015 at https://api.nfl.com/docs

Since I'm not sure that I'll have the time to familiarize myself with nflgame and create a proper pull request, I thought I'd share what I found here. This gist will connect to the new NFL API, retrieve an access token, and then uses the access token to pull down the 2020 week 1 schedule. I was also able to successfully retrieve the equivalent of 2019 week 1 score strip.

andrew-shackelford commented 4 years ago

This new API does look promising...I’m not that familiar with the nflgame source code either and wouldn’t know where to begin on implementing, but it looks like this could replace the existing functionality in every way? (correct me if I’m wrong) I’d probably defer to Derek or somebody else to be in charge of this effort, but still happy to help in whatever way I can.

JimHewitt commented 4 years ago

@justin-haight Was wondering what you used as a reference.

justin-haight commented 4 years ago

I watched the network traffic when I loaded their scores page and then tried it out with Postman. Wish I could say it was something cooler, but at least I did have to work for it, with all the ads and analytics traffic they have.

derek-adair commented 4 years ago

This gist will connect to the new NFL API, retrieve an access token, and then uses the access token to pull down the 2020 week 1 schedule. I was also able to successfully retrieve the equivalent of 2019 week 1 score strip.

When we extract the token from nfl.com and force it into headers we are explicitly violating the NFL.com ToS, leaving us open to legal action.

I would like to come up with a long-term solution for this project that is 100% legal and within their Terms of Service. Personally, I do not use this library for much any longer, so it's unlikely that I will be contributing code to something that would leave me open to a lawsuit.

Fail-safe option: the nflscrapR guys are publishing pbp data after every game. I believe it's formatted differently from what nflgame expects, but it's entirely possible, as a last resort, to find a way to massage this data into nflgame in order to keep play-by-play data alive. The drawback here is we no longer have detailed roster data.
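A rough sketch of what that massaging could look like: grouping flat nflscrapR-style rows back into nflgame's nested game → drives → plays shape. The column names here (game_id, drive, play_id, desc) are assumptions for illustration, not verified against the nflscrapR output.

```python
# Sketch: fold flat nflscrapR-style play rows into a nested, nflgame-like
# structure. Column names (game_id, drive, play_id, desc) are assumptions.
def rows_to_gamecenter(rows):
    games = {}
    for row in rows:
        game = games.setdefault(row["game_id"], {"drives": {}})
        drive = game["drives"].setdefault(str(row["drive"]), {"plays": {}})
        drive["plays"][str(row["play_id"])] = {"desc": row["desc"]}
    return games

# Example rows, as they might come out of a post-game nflscrapR CSV:
rows = [
    {"game_id": "2019110700", "drive": 1, "play_id": 36, "desc": "Kickoff."},
    {"game_id": "2019110700", "drive": 1, "play_id": 55, "desc": "Run for 4 yards."},
]
games = rows_to_gamecenter(rows)
```

The real job would be mapping every nflscrapR column onto the fields nflgame's parser expects, but the reshaping itself is mechanical.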

derek-adair commented 4 years ago

I haven't used the new API yet.

There is no clear way to legally access this api. This api documentation page has been public for at least 3 years and I've yet to find a way to get an actual token. I believe it is reserved for actual NFL partners.

derek-adair commented 4 years ago

Also I got a gig, so I'll be indisposed this week. Really sorry but gotta pay them bills.

100% open to ideas on how to legally bring this back up and running. Again, the fail safe here is to just convert the nflscrapr data into nflgame data after a game completes.

I'll be linking the actual data repository, kinda busy right now and need to get back to it.

brayellison commented 4 years ago

I've started writing a Scala/Spark program to concatenate all the files from nflscrapR and push them to a Postgres database. It's currently only creating parquet files, but here's the link if anyone's interested.

BrianT71 commented 4 years ago

This gist will connect to the new NFL API, retrieve an access token, and then uses the access token to pull down the 2020 week 1 schedule.

The link for your gist gives a 404 error. Is it still available?

justin-haight commented 4 years ago

@BrianT71 Sorry I removed the public gist after Derek pointed out I was using some IDs in my request which are required to make it work. They probably didn't intend to make those public and I wasn't sure what else it might expose.

toddrob99 commented 4 years ago

@justin-haight were you doing something other than calling the re****e endpoint with specific headers and grant type in the body? (redacted endpoint name for same reason you removed the gist)

BrianT71 commented 4 years ago

I've started deconstructing the new API and found all of the game data (game information, play-by-play and player stats), but I have not been able to find anything on pulling rosters (or even all players) to get their team info and any IDs. I previously scraped this from the roster page (www.nfl.com/teams/arizona-cardinals/roster), but the updated version no longer has ID numbers in the player links. It just uses the player's name as the link.

If anyone has found any way to get player info and new IDs I would appreciate some guidance.

justin-haight commented 4 years ago

@toddrob99 nope that's all I was doing.

toddrob99 commented 4 years ago

@BrianT71 Assuming you are referring to api.nfl.com when you say "new API," you can get person ids from the teams endpoint by including roster{id} in your field selector. The below URL will give you the roster for the 2019 Cardinals. The teams endpoint does not seem to be working when 2020 team ids are specified, instead throwing a 500 error stating Name is null. I assume (hope) this will be sorted out when the season starts. Not sure what Name is referring to.

https://api.nfl.com/v1/teams/10043800-2019-3db6-e772-19c9ba3f535c?fs={id,season,fullName,nickName,abbr,type,cityStateRegion,conference{abbr},division{abbr},roster{id,type,firstName,lastName,displayName,homeTown,college{id,name,type},highSchool,activeRole,player,coach},injuries}
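Building that field selector programmatically is straightforward; this sketch only assembles the URL and does not call it (actually hitting the endpoint would still require an authorized NFL API token). The trimmed-down fs selector is an example.

```python
from urllib.parse import quote

API_BASE = "https://api.nfl.com/v1/teams"
team_id = "10043800-2019-3db6-e772-19c9ba3f535c"  # the 2019 Cardinals id above
# A trimmed-down example of the fs field selector:
fs = "{id,fullName,roster{id,firstName,lastName,displayName}}"

# Keep braces and commas literal, matching the URL shape above. Actually
# calling this endpoint would still require an authorized NFL API token.
url = f"{API_BASE}/{team_id}?fs={quote(fs, safe='{},')}"
```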

BrianT71 commented 4 years ago

@toddrob99 Sorry, I was sloppy in my wording. I was referring to the v3 API which is what the current website (nfl.com) is using. Since there is no documentation for v3 that I can find, I am stuck using educated guesses for the endpoint names and fields. I was hoping someone else may have already figured this part out. I'll be banging away at it over the weekend.

toddrob99 commented 4 years ago

@BrianT71, sorry I didn't make the connection between v3 and new; I've just been referring to /v3/shield as shield queries.

Try this: https://api.nfl.com/v3/shield/?query=query%7Bviewer%7Bteam(id%3A%2210043800-2020-6de2-d4a8-fc8cf0e8a3ad%22)%7Bid%20abbreviation%20fullName%20id%20nickName%20cityStateRegion%20franchise%7Bid%20slug%20currentLogo%7Burl%7D%7D%20season%7Bid%20season%7D%20division%20players%7Bid%20status%20position%20jerseyNumber%20gsisId%20esbId%20person%7BfirstName%20lastName%20displayName%20highSchool%7D%7D%7D%7D%7D&variables=null

Decoded query param:

query{viewer{team(id:"10043800-2020-6de2-d4a8-fc8cf0e8a3ad"){id abbreviation fullName id nickName cityStateRegion franchise{id slug currentLogo{url}} season{id season} division players{id status position jerseyNumber gsisId esbId person{firstName lastName displayName highSchool}}}}}
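The encoded query param is just a percent-encoded form of this string; a sketch reproducing the encoding with the standard library (the query here is abbreviated, and no token handling is shown):

```python
from urllib.parse import quote

# Abbreviated form of the shield query above.
query = ('query{viewer{team(id:"10043800-2020-6de2-d4a8-fc8cf0e8a3ad")'
         '{id abbreviation fullName}}}')

# Percent-encode everything except parentheses, matching the URL shape above.
encoded = quote(query, safe="()")
url = f"https://api.nfl.com/v3/shield/?query={encoded}&variables=null"
```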

Note: as I look deeper into this, it appears that query is only including players with status=ACT. That doesn't help my use case of listing inactive players, but maybe it will help you.

I've gone back and forth a few times about mentioning this at all, but I've created a python nflapi wrapper that has some of the queries I'm using (will most likely add this one since my teamById method is no longer working for 2020 team ids). The reason why I've been hesitant to mention it is that it does not facilitate retrieval of a token and I do not want to deal with a ton of people asking about that. Also because I haven't created any documentation for it.

BrianT71 commented 4 years ago

Thanks. I was able to stumble into this query myself late yesterday. It will work for what I need right now.

One thing I have noticed is that the IDs for teams and players change every year, so the same player or team has a different ID for 2020 vs 2019. This is problematic for me on the player side, as I have my own database of players which matches up based on ID numbers. I have pulled data from various sites over the years, so I have a linking ID for each site. I've been using the gsisId for NFL.com, and fortunately it's still there, so I can do the matches that way for now. The "slug" field also appears to be a unique identifier, so I'm going to store that as well in case I need it in the future. It just seems strange that an ID number would not be static across time for a player.

Regarding the status=ACT only, I do see a player with status != ACT in my team query (one is status=NWT).

https://api.nfl.com/v3/shield/?query=query%7Bviewer%7Bteam(id%3A%2210044500-2020-647d-4ff7-d6a678d2a29d%22)%7Bid%20season%7bid%20season%7d%20fullName%20abbreviation%20conference%20players%7Bid%20status%20position%20person%7BdisplayName%20slug%20gsisId%7D%7D%7D%7D%7D&variables=null

Decoded: query=query{viewer{team(id:"10044500-2020-647d-4ff7-d6a678d2a29d"){id season{id season} fullName abbreviation conference players{id status position person{displayName slug gsisId}}}}}&variables=null

For this team query, do you know if the injury status is available for each player? It is in the V1 documentation as "injuryStatus" for a player but I can't figure out the field name if it exist in the shield api. A more general question, is the shield api documented anywhere?

ThomasMorgani commented 4 years ago

For anyone interested in a pure scraping solution, I started putting one together using Node and Puppeteer. Puppeteer is a library that drives a headless Chromium browser for automation.

It's a bit of a rush job, but so far it is able to pull schedules, rosters and full game stats, all without the need to reverse engineer the API. I would ultimately prefer the API, but I can't risk the rug being pulled out from under us again.

There will be substantial updates in the coming days since this will need to be fully functional come the start of the fantasy season.

Anyone who would like to contribute in any way can do so here:

https://github.com/ThomasMorgani/AmericanSportScraper

AC6y86 commented 4 years ago

This thread looks like it died down a couple months ago. Any updates on nflgame, or thoughts about if there is going to be something working by the time the season starts? Appreciate everyone is busy.

mjsz commented 4 years ago

FWIW, looks like the nfl xml is back up using a static subdomain: https://static.nfl.com/ajax/scorestrip?season=2020&seasonType=REG&week=1
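A sketch of parsing that scorestrip feed with the standard library. The attribute names (eid, h, v, hs, vs) follow the classic scorestrip layout and are assumed unchanged; the XML below is an inline sample rather than a live fetch.

```python
import xml.etree.ElementTree as ET

# In practice this XML would come from:
#   https://static.nfl.com/ajax/scorestrip?season=2020&seasonType=REG&week=1
# A tiny inline sample is used here. Attribute names (eid, h, v, hs, vs)
# follow the classic scorestrip layout and are assumed unchanged.
sample = """<ss><gms w="1" y="2020">
  <g eid="2020091000" h="KC" hs="34" v="HOU" vs="20" q="F"/>
</gms></ss>"""

games = [
    {"eid": g.get("eid"), "home": g.get("h"), "away": g.get("v"),
     "home_score": int(g.get("hs")), "away_score": int(g.get("vs"))}
    for g in ET.fromstring(sample).iter("g")
]
```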

JimHewitt commented 4 years ago

I just tried a random game from 2019 and the PBP is there! Not sure if this will continue into 2020 or if the format is the same. I guess we'll know later this week.

http://static.nfl.com/liveupdate/game-center/2019110700/2019110700_gtd.json
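For anyone new to the gamecenter format, here is a simplified sketch of the `{eid}_gtd.json` shape that nflgame parses (only a few fields shown; the values are example data):

```python
# Simplified shape of the gamecenter {eid}_gtd.json that nflgame parses.
# Only a few fields are shown, and the values are example data.
gtd = {
    "2019110700": {
        "home": {"abbr": "OAK", "score": {"T": 26}},
        "away": {"abbr": "LAC", "score": {"T": 24}},
        "drives": {"1": {"posteam": "LAC", "plays": {"36": {"desc": "Kickoff."}}}},
    }
}

# The file is a single-key dict: eid -> game detail.
eid, game = next(iter(gtd.items()))
summary = (f"{game['away']['abbr']} {game['away']['score']['T']} @ "
           f"{game['home']['abbr']} {game['home']['score']['T']}")
```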

brianzhou13 commented 4 years ago

@mjsz does that mean we'll be able to pull NFL game data in the existing state of the library?

Phloot commented 3 years ago

I just tried a random game from 2019 and the PBP is there! Not sure if this will continue into 2020 or if the format is the same. I guess we'll know later this week.

http://static.nfl.com/liveupdate/game-center/2019110700/2019110700_gtd.json

@JimHewitt Great find! Have you looked into whether scraping data from this feed abides by the NFL ToS?

EDIT: I've read through the ToS and I've not yet noticed anything that would be a cause for alarm for scraping of this game-center URL. Would not mind a second set of eyes however.

mjsz commented 3 years ago

@mjsz does that mean we'll be able to pull NFL game data in the existing state of the library?

This repo will need to be updated at least here: https://github.com/derek-adair/nflgame/blob/bafd5fb3e4d1787329bca7a93eefc4fbda12125d/nflgame/update_sched.py#L40 and here: https://github.com/derek-adair/nflgame/blob/bafd5fb3e4d1787329bca7a93eefc4fbda12125d/nflgame/game.py#L36

to reflect the new data URLs from nfl.com. There may be other places, but that should be a start.

JimHewitt commented 3 years ago

@Phloot Actually, @mjsz is the one that found this. Not sure about TOS, but nflScrapR seems to be using the same endpoints. It remains to be seen if the PBP data for 2020 will be populated.

Phloot commented 3 years ago

@Phloot Actually, @mjsz is the one that found this. Not sure about TOS, but nflScrapR seems to be using the same endpoints. It remains to be seen if the PBP data for 2020 will be populated.

@JimHewitt Reading comprehension is difficult for me at near midnight, apologies :)

Phloot commented 3 years ago

Well, doesn't look like the gamecenter live-update page is displaying play by play data. That's a bummer.

scottismyname commented 3 years ago

I just tried a random game from 2019 and the PBP is there! Not sure if this will continue into 2020 or if the format is the same. I guess we'll know later this week.

Doesn't seem to be working for yesterday's Chiefs game. Getting "File not found."
Assuming I'm formatting it right? http://static.nfl.com/liveupdate/game-center/2020091000/2020091000_gtd.json

JimHewitt commented 3 years ago

I just tried a random game from 2019 and the PBP is there! Not sure if this will continue into 2020 or if the format is the same. I guess we'll know later this week.

Doesn't seem to be working for yesterday's Chiefs game. Getting "File not found." Assuming I'm formatting it right? http://static.nfl.com/liveupdate/game-center/2020091000/2020091000_gtd.json

Yup, looks like we are SOL. Our only hope is that they may not update the site until later, but I'm not holding my breath.

Coding-Kyle commented 3 years ago

nflscrapr has been replaced by nflfastR (https://mrcaseb.github.io/nflfastR/) and they were able to get the data for the HOU-KC Thursday night game (https://twitter.com/benbbaldwin/status/1304475824566013953) so there must be a way to do it

brbeaird commented 3 years ago

Yeah that's interesting. Trying to reverse-engineer how they did it. So far, looking through their code, it seems like this would be what they're trying to hit:

http://nflcdns.nfl.com/liveupdate/game-center/2020_01_HOU_KC/2020_01_HOU_KC_gtd.json

But that doesn't seem to work, either. Looks like I may need to get R installed so I can actually try and run it and verify.

JimHewitt commented 3 years ago

The json is here. Format is different, but has PBP info.

scottismyname commented 3 years ago

Yeah that's interesting. Trying to reverse-engineer how they did it. So far, looking through their code, it seems like this would be what they're trying to hit:

http://nflcdns.nfl.com/liveupdate/game-center/2020_01_HOU_KC/2020_01_HOU_KC_gtd.json

But that doesn't seem to work, either. Looks like I may need to get R installed so I can actually try and run it and verify.

Actually, their code as I see it seems to be trying to parse:

url <- glue::glue("http://nflcdns.nfl.com/liveupdate/game-center/{gameId}/{gameId}_gtd.json")

Where gameId is, per their example comment, '2018090905'. For yesterday's game (gameId 2020091000) that would give the URL: http://nflcdns.nfl.com/liveupdate/game-center/2020091000/2020091000_gtd.json

However, this doesn't work either. It's curious that they got the JSON somehow, but they don't appear to be using this code to get it.

The code DOES work for previous years games, just not current year.
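For reference, that glue template translates directly to a Python f-string:

```python
# Python equivalent of the R glue::glue template above.
def gamecenter_url(game_id: str) -> str:
    return (f"http://nflcdns.nfl.com/liveupdate/game-center/"
            f"{game_id}/{game_id}_gtd.json")

url = gamecenter_url("2020091000")
```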

brbeaird commented 3 years ago

Yeah I tried it both ways. I wonder if the data is only active while the game is being played.


chitown88 commented 3 years ago

Couple things I’ve noticed, if you go here https://nflcdns.nfl.com/liveupdate/gamecenter/58167/KC_Gamebook.pdf the play by play is there, (you’d have to parse the pdf, and this isn’t live obviously), but the id 58167 isn’t the game id for the super bowl. So possibly a different id format needed for the gtd json data?

Secondly, it looks like he's pulling the play-by-play from his own .rds files. Wherever in the code he's saving/pushing those, we might be able to see where he's pulling it from.

I can't do too much digging at the moment as I'm looking at all of this on my phone, but those are two things that could be of interest.

BrianT71 commented 3 years ago

@chitown88 The id 58167 is called "gameKey" in the api under gameDetail (maybe other places too). It appears to be a sequential number as the KC-HOU gameKey from the Thur night game is 58168.

tomweingarten commented 3 years ago

If you're interested in the schedule for 2020, here's a quick script you can use to convert the nflfastR schedule into nflgame format:

import json
from pathlib import Path

import pyreadr  # third-party; reads nflfastR's .rds files

home = str(Path.home())

schedule_json = json.load(open(f'{home}/.local/lib/python3.8/site-packages/nflgame/schedule.json', 'r'))
r_sched = pyreadr.read_r(f'{home}/Downloads/sched_2020.rds')
schedule_df = r_sched[None]

for index, row in schedule_df.iterrows():
    hours, minutes = row.gametime.split(':')
    if int(hours) >= 12:
        meridiem = 'PM'
        if int(hours) > 12:  # keep 12:xx as "12 PM", not "0 PM"
            hours = str(int(hours) - 12)
    else:
        meridiem = 'AM'
    game = {
        "away": row.away_team,
        "day": int(row.gameday.split('-')[2]),
        "eid": row.old_game_id,
        "gamekey": "UNKNOWN",
        "home": row.home_team,
        "meridiem": meridiem,
        "month": int(row.gameday.split('-')[1]),
        "season_type": row.game_type,
        "time": f"{hours}:{minutes}",
        "wday": row.weekday,
        "week": int(row.week),
        "year": int(row.gameday.split('-')[0])
    }
    schedule_json['games'].append([row.old_game_id, game])

json.dump(schedule_json,
          open(f'{home}/.local/lib/python3.8/site-packages/nflgame/schedule.json.new', 'w')
         )
tomweingarten commented 3 years ago

The file 2020_01_HOU_KC.json.gz that nflfastR is using is the output from the NFL v3 Shield API. You can download this directly from the website when you view a page like this one (just filter the network calls by api.nfl.com): https://www.nfl.com/games/texans-at-chiefs-2020-reg-1

My guess is someone is using an NFL API key or just manually downloading the json from the website and uploading it to github. It doesn't help us with live data, but for anyone interested in having historical data it wouldn't be hard to write an interpreter to convert these files and then add them in the old gamecenter format to this repo as well.

tomweingarten commented 3 years ago

Digging into the data format a bit more, it looks like (almost) everything we need is there to convert the new json format into the old json format, but there are two annoying problems:

  1. NFL is using a new player ID format. Does anyone know a way to connect the two ID types? For example, Aaron Rodgers used to be 00-0023459, now he's 32013030-2d30-3032-3334-35395dc60da5. It's probably easiest to just go back in time and replace the old files with the new ID type.
  2. The new format doesn't offer aggregate player statistics, so they'll need to be calculated.
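For problem 2, the per-play stat rows could be rolled up into per-player totals. A sketch assuming each stat row carries a player id, a statId, and a yards value (what each statId actually means would come from nflgame's statmap):

```python
from collections import defaultdict

# Roll play-level stat rows up into per-player, per-statId yardage totals.
# Each row is assumed to carry (player_id, statId, yards); interpreting each
# statId would be done via nflgame's statmap.
def aggregate_player_stats(plays):
    totals = defaultdict(lambda: defaultdict(int))
    for play in plays:
        for stat in play:
            totals[stat["player_id"]][stat["statId"]] += stat["yards"]
    return totals

# Example: two rushing plays (statId 10 assumed to be rushing yards).
plays = [
    [{"player_id": "00-0023459", "statId": 10, "yards": 7}],
    [{"player_id": "00-0023459", "statId": 10, "yards": 3}],
]
totals = aggregate_player_stats(plays)
```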
james-childress commented 3 years ago

I think this may map between the two id types: https://github.com/guga31bb/nflfastR-data/issues/13#issuecomment-657285185

toddrob99 commented 3 years ago

Watch out for ids that change with each season...

JimHewitt commented 3 years ago

I think this may map between the two id types: guga31bb/nflfastR-data#13 (comment)

Also, looks like the old id is contained in the new one as ASCII: 32013030-2d30-3032-3334-35395dc60da5
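That observation checks out: strip the first 4 and last 8 hex characters, drop the dashes, and hex-decode the rest to ASCII.

```python
import codecs

def convert_gsis_id(new_id: str) -> str:
    # '32013030-2d30-3032-3334-35395dc60da5'[4:-8] -> '3030-2d30-3032-3334-3539'
    # With dashes removed and hex-decoded, that's the ASCII of the old GSIS id.
    return codecs.decode(new_id[4:-8].replace("-", ""), "hex").decode("utf-8")

old_id = convert_gsis_id("32013030-2d30-3032-3334-35395dc60da5")  # -> '00-0023459'
```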

tomweingarten commented 3 years ago

Good catch! With that in mind, here's a (very) rough start. I'm not proud of how this code looks :) I don't have much time this week so if anyone can help out it'd be very appreciated. It needs a lot of cleaning up on the drives and some logic to aggregate the statistics.

It takes as input the json files found here: https://github.com/guga31bb/nflfastR-raw/tree/master/raw/2020

import json
from pprint import pprint
from pathlib import Path
home = str(Path.home())
import codecs

def convert_gsis_id(new_id):
    return codecs.decode(new_id[4:-8].replace('-',''),"hex").decode('utf-8')

# Load one raw game json from nflfastR-raw first, e.g.:
file = json.load(open('2020_01_HOU_KC.json'))

result = {}
drives = {}
for i, drive in enumerate(file['data']['viewer']['gameDetail']['drives']):
    drives[str(i)] = {  # NOTE: drive fields below are hard-coded placeholders
        'posteam': 'SF',
        'qtr': 1,
        'redzone': True,
        'fds': 0,
        'result': 'Punt',
        'penyds': 0,
        'ydsgained': 0,
        'numplays': 0,
        'postime': '1:55',
        "start": {
          "qtr": 1,
          "time": "15:00",
          "yrdln": "SF 25",
          "team": "SF"
        },
        "end": {
          "qtr": 1,
          "time": "13:05",
          "yrdln": "SF 34",
          "team": "SF"
        },
        'plays': {}
      }

for play in file['data']['viewer']['gameDetail']['plays'][1:]:
    new_play = {
        "sp": 0,
        "qtr": 1,
        "down": 1,
        "time": "13:05",
        "yrdln": "GB 25",
        "ydstogo": 10,
        "ydsnet": 25,
        "posteam": "GB",
        "desc": "(13:05) A.Jones right end to GB 34 for 9 yards (J.Tartt).",
        "note": None,
        "players": {}
      }
    for sequence, stat in enumerate(play['playStats']):
        if 'playerName' in stat:
            player_id = convert_gsis_id(stat['gsisPlayer.id'])
            if not player_id in new_play['players']:
                new_play['players'][player_id] = []
            new_play['players'][player_id].append({
                "sequence": sequence,
                "clubcode": stat['team.abbreviation'],
                "playerName": stat['playerName'],
                "statId": stat['statId'],
                "yards": stat['yards'],
            })
        else:
            new_play['players'][0] = [{
                "sequence": sequence,
                "clubcode": stat['team.abbreviation'],
                "playerName": None,
                "statId": stat['statId'],
                "yards": stat['yards'],
            }]
    drives[str(play['driveSequenceNumber']-1)]['plays'][play['playId']] = new_play

result['drives'] = drives
datestring = '20200913'
seq_id = 0 # TODO: Order the games each week and assign accordingly
game_id = '{datestring}{seq_id}'.format(
    datestring = datestring,
    seq_id = str(seq_id).zfill(2)
)
result = {
    game_id: result
}
filename = f'{home}/.local/lib/python3.8/site-packages/nflgame/gamecenter-json/{game_id}.json'
json.dump(result, open(filename, 'w'))
tomweingarten commented 3 years ago

This works well enough now that the files are importable into nflgame. You can get the full play-by-play for each game but not the game stats or the player list.

import codecs
import json
from pathlib import Path

home = str(Path.home())

def convert_gsis_id(new_id):
    # 32013030-2d30-3032-3334-35395dc60da5
    # XXXX3030-2d30-3032-3334-3539XXXXXXXX
    # '00-0023459'
    return codecs.decode(new_id[4:-8].replace('-',''),"hex").decode('utf-8')

def parseTeam(gameDetail, team, old_team, result):
    if gameDetail[f'{team}PointsOvertime']:
        overtime =  gameDetail[f'{team}PointsOvertime'][0]
    else:
        overtime = 0
    result[f'{old_team}'] = {
        'score': {
            "1": gameDetail[f'{team}PointsQ1'][0],
            "2": gameDetail[f'{team}PointsQ2'][0],
            "3": gameDetail[f'{team}PointsQ3'][0],
            "4": gameDetail[f'{team}PointsQ4'][0],
            "5": overtime,
            "T": gameDetail[f'{team}PointsTotal'][0],
          },
        'abbr': gameDetail[f'{team}Team']['abbreviation'][0],
        'stats': {
            'team': {
              "totfd": 0,
              "totyds": 0,
              "pyds": 0,
              "ryds": 0,
              "pen": 0,
              "penyds": 0,
              "trnovr": 0,
              "pt": 0,
              "ptyds": 0,
              "ptavg": 0,
              "top": "00:00"
            },
        },
    }

def convert_game_file(gameDetail, eid):
    result = {
        # Assume the game is always over
        "weather": None,
        "media": None,
        "yl": "",
        "qtr": "Final",
        "note": None,
        "down": 0,
        "togo": 0,
        "redzone": True,
        "clock": "00:00",
        "posteam": None, # Doesn't matter
        "stadium": None,
        "scrsummary": {}}

    parseTeam(gameDetail, 'home', 'home', result)
    parseTeam(gameDetail, 'visitor', 'away', result)
    drives = {}
    for i, drive in enumerate(gameDetail['drives']):
        drives[str(i)] = {
            'posteam': 'SF',
            'qtr': 1,
            'redzone': True,
            'fds': 0,
            'result': 'Punt',
            'penyds': 0,
            'ydsgained': 0,
            'numplays': 0,
            'postime': '1:55',
            "start": {
              "qtr": 1,
              "time": "15:00",
              "yrdln": "SF 25",
              "team": "SF"
            },
            "end": {
              "qtr": 1,
              "time": "13:05",
              "yrdln": "SF 34",
              "team": "SF"
            },
            'plays': {}
          }

    for play in gameDetail['plays'][1:]:
        try:
            new_play = {
                "sp": 0,
                "qtr": play['quarter'],
                "down": 1,
                "time": "13:05",
                "yrdln": play['yardLine'],
                "ydstogo": play['yardsToGo'],
                "ydsnet": play['driveNetYards'] if 'driveNetYards' in play else 0,
                "posteam": play['possessionTeam.abbreviation'] if 'possessionTeam.abbreviation' in play else '',
                "desc": play['playDescription'],
                "note": None,
                "players": {}
              }
            for sequence, stat in enumerate(play['playStats']):
                if 'playerName' in stat:
                    player_id = convert_gsis_id(stat['gsisPlayer.id'])
                    if not player_id in new_play['players']:
                        new_play['players'][player_id] = []
                    new_play['players'][player_id].append({
                        "sequence": sequence,
                        "clubcode": stat['team.abbreviation'],
                        "playerName": stat['playerName'],
                        "statId": stat['statId'],
                        "yards": stat['yards'],
                    })
                else:
                    new_play['players'][0] = [{
                        "sequence": sequence,
                        "clubcode": stat['team.abbreviation'],
                        "playerName": None,
                        "statId": stat['statId'],
                        "yards": stat['yards'],
                    }]
            drives[str(play['driveSequenceNumber']-1)]['plays'][play['playId']] = new_play
        except Exception as e:
            print(e)
            print(e.args)
            print(play)

    result['drives'] = drives
    result = {
        eid: result
    }
    filename = f'{home}/.local/lib/python3.8/site-packages/nflgame/gamecenter-json/{eid}.json'
    json.dump(result, open(filename, 'w'))
    # nflgame reads gzipped gamecenter files, so compress the output
    # (stdlib replacement for the notebook's "!gzip -f" shell escape):
    import gzip
    with open(filename, 'rb') as f_in, gzip.open(filename + '.gz', 'wb') as f_out:
        f_out.write(f_in.read())

schedule = json.load(open(f'{home}/.local/lib/python3.8/site-packages/nflgame/schedule.json'))
for game in schedule['games']:
    year = game[1]['year']
    home = game[1]['home']
    away = game[1]['away']
    week = game[1]['week']
    if year == 2020 and week == 1:
        print(game)
        file = json.load(open(f'nflfastR-raw/raw/{year}/{year}_{str(week).zfill(2)}_{away}_{home}.json', 'r'))
        gameDetail = file['data']['viewer']['gameDetail']
        convert_game_file(gameDetail, game[0])