Open andr3w321 opened 9 years ago
Are there are any scripts already written to get data before 2009?
Not that I'm aware of.
I see that data at http://www.nfl.com/ajax/scorestrip?season=2014&seasonType=REG&week=1 goes back to 1970 for regular season.
This is only the schedule.
Is there an api doc page for nfl.com somewhere?
NFL.com has no API. It has an undocumented JSON feed that goes back to 2009. I'm not aware of any documentation, although this part of nflgame
maps statistical category identifiers to names.
Update: I see you can match the gsis id with gamebook data here: http://www.nflgsis.com/1999/Reg/15/796/Gamebook.xml
This contains very little structured data. Notably, the plays are encoded as human readable descriptions only. Parsing them to get the granularity of detail expected by nfldb
is hard. (But I've wanted to try for a long time.)
In a somewhat unrelated note is it normal for nflgame-update-players to require sudo?
Only if you've installed nflgame
to a system directory. It updates a JSON file that is stored inside the package directory.
I installed with pip install nfldb --user and I get I do not have write access to "/usr/local/lib/python2.7/dist-packages/nflgame/players.json". when I try running it without.
Not sure. On my system, --user
installs to my home directory. You might try a virtual environment:
$ virtualenv nfl
$ source nfl/bin/activate
$ pip install nfldb
$ nflgame-update-players
Thanks for the reply and great project. I initially mistakenly installed the project with sudo pip install nfldb
but when I removed it with sudo pip uninstall nfldb
the lib files were not removed. After manually deleting the files and reinstalling locally it's working fine without sudo
The NFL.com json feed has a bit more than schedule. It lists scores too. I'll write up something to download and import them.
You shouldn't have to write anything. I think nflgame
can handle them: https://github.com/BurntSushi/nflgame/blob/master/nflgame/update_sched.py
Wish I'd read this earlier. Just got done downloading them all. I'll post the download script and link to tarball of them anyways. The script did skip a few that I had to download manually later. I'm assuming due to server errors so I wouldn't 100% trust it to get them all with out double checking later.
https://drive.google.com/file/d/0BwoaqcZEOIvoLXpuY3FCelVKV2M/view?usp=sharing
import urllib, os
import time
download_folder = "nfl-schedules"
root_url = "http://www.nfl.com/ajax/scorestrip?"
if not os.path.exists(download_folder):
os.makedirs(download_folder)
def download_xml(season, week, season_type):
# download xml file schedule from NFL.com
url = root_url + "season=" + str(season) + "&seasonType=" + season_type + "&week=" + str(week)
filename = download_folder + "/" + str(season) + "-" + str(week) + "-" + season_type + ".xml"
urllib.urlretrieve(url, filename)
# delete file if it contains no games
file = open(filename, 'r')
if not "gms" in file.read():
os.remove(filename)
file.close()
# be nice and sleep 2 seconds between requests
time.sleep(2)
for season in range(1970,2015):
season_type = "REG"
for week in range(1,18):
download_xml(season, week, season_type)
season_type = "PRE"
for week in range(1,5):
download_xml(season, week, season_type)
season_type = "POST"
for week in range(15,23):
download_xml(season, week, season_type)
I was able to get the data into the schedule.json file, but I'm having a hard time figuring out how to get it in the postgres db? Where is the code that imports the json file? I created a modified update_sched.py file here https://github.com/andr3w321/nflgame/blob/master/nflgame/update_sched.py with a function to import the folder I linked to above python update_sched.py --build-old ../../nfl-schedules
I added home_score
and away_score
to lines 106 and 107 to the json so I'm not sure where else to update these field names.
Without some help I think I'm going to give up attempting to add a pull request and just create a csv and work directly with sql to get the data in the db.
I edited the file https://github.com/andr3w321/nflgame/blob/master/nflgame/update_sched.py to create the new function that successfully adds the schedules for the old games after running python update_sched.py --build-old ../../nfl-schedules
and then ~/.local/bin/nfldb-update
However, there's a few issues
['BOS', 'Boston', 'Patriots'], ['LA', 'Los Angeles', 'RaidersorRams'], ['PHO', 'Phoenix', 'Cardinals'], ['RAI', 'LAorOakland', 'Raiders'], ['RAM', 'LAorSTL', 'Rams']
I'm not sure of the fields with or in them obviously. I tried adding these teams to https://github.com/BurntSushi/nfldb/blob/master/nfldb/team.py but when I do and run nfldb-update it errors out with psycopg2.IntegrityError: insert or update on table "game" violates foreign key constraint "game_home_team_fkey" DETAIL: Key (home_team)=(LA) is not present in table "team".
As of now it works okay, just every time the xml file references BOS, LA, PHO, RAI, or RAM it will input UNK in the home or away team DB entry.I uploaded my schedule.json file here: https://drive.google.com/file/d/0BwoaqcZEOIvoeXdHRDV0SlJPbWc/view?usp=sharing
is slow with a batch size of 5 games when adding 9,625 new games.
Increase the batch size? The batch size should be increased when doing large bulk loads. It is kept small by default since most updates are small and incremental. Larger batch sizes require more memory.
I'm not sure of the fields with or in them obviously. I tried adding these teams to https://github.com/BurntSushi/nfldb/blob/master/nfldb/team.py but when I do and run nfldb-update it errors out with psycopg2.IntegrityError: insert or update on table "game" violates foreign key constraint "game_home_team_fkey" DETAIL: Key (home_team)=(LA) is not present in table "team". As of now it works okay, just every time the xml file references BOS, LA, PHO, RAI, or RAM it will input UNK in the home or away team DB entry.
A foreign key constraint means the teams aren't in the DB. You'll probably need to add a migration in nfldb/db.py
.
My goodness am I glad we have FK constraints. They've prevented innumerable bugs from ever happening!
I still can't figure out how to get the scores into the database.
Not sure.
Can you submit a PR anyway? I might be able to just build on what you've done and finish it. (It might be a little though.)
Thanks for your help. I wrote some code to look at the team names and mascots. The only city name in question is RAI, which looks like it refers to the Los Angeles Raiders http://www.pro-football-reference.com/teams/rai/ The problem with just adding the unknown teams listed above is it will describe the 1983 Baltimore Ravens which were actually the Colts so to get it really done correctly would require adding the below teams.txt table to the db. I submitted a pull request with the update_sched.py --build-old method I added.
print_teams_table.py
import xml.dom.minidom as xml
import os
import re
nfl_schedules_path = './nfl-schedules'
def sort_nicely( l ):
""" Sort the given list in the way that humans expect.
"""
convert = lambda text: int(text) if text.isdigit() else text
alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]
l.sort( key=alphanum_key )
def get_filenames(dir_path, starts_with, ends_with):
"""
Returns a list of all filenames in a dir
that starts with a given string
and ends with a given string
eg: get_filenames("./nfl-schedules", "1970", ".xml")
"""
filenames = []
try:
for file in os.listdir(dir_path):
if file.startswith(starts_with) and file.endswith(ends_with):
filenames.append(file)
except OSError:
sys.stderr.write("could not load %s\n" % dir_path)
return filenames
def add_sym_mascot_year(sym, mascot, year, teams):
team_idx = -1
for i in range(0, len(teams)):
if teams[i][0] == sym and teams[i][1] == mascot:
team_idx = i
break
if team_idx == -1:
teams.append([sym.encode('utf-8'), mascot.encode('utf-8'), [year]])
else:
if year not in teams[team_idx][2]:
teams[team_idx][2].append(year)
return teams
xml_filenames = get_filenames(nfl_schedules_path, "", ".xml")
sort_nicely(xml_filenames)
xml_filenames.reverse()
teams = []
for xml_file in xml_filenames:
year,week,stype = xml_file.split(".xml")[0].split("-")
year = int(year)
week = int(week)
xml_filename = nfl_schedules_path + "/" + str(year) + "-" + str(week) + "-" + stype + ".xml"
try:
dom = xml.parse(open(xml_filename))
except IOError:
sys.stderr.write("could not load %s\n" % xml_filename)
for g in dom.getElementsByTagName("g"):
home_sym = g.getAttribute('h')
home_mas = g.getAttribute('hnn')
away_sym = g.getAttribute('v')
away_mas = g.getAttribute('vnn')
teams = add_sym_mascot_year(home_sym, home_mas, year, teams)
teams = add_sym_mascot_year(away_sym, away_mas, year, teams)
teams.sort()
for team in teams:
print team
python print_teams_table.py > teams.txt
['ARI', 'cardinals', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994]]
['ATL', 'falcons', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['BAL', 'colts', [1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['BAL', 'ravens', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996]]
['BOS', 'patriots', [1970]]
['BUF', 'bills', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['CAR', 'panthers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995]]
['CHI', 'bears', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['CIN', 'bengals', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['CLE', 'browns', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['DAL', 'cowboys', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['DEN', 'broncos', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['DET', 'lions', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['GB', 'packers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['HOU', 'oilers', [1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['HOU', 'texans', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002]]
['IND', 'colts', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984]]
['JAC', 'jaguars', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995]]
['KC', 'chiefs', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['LA', 'rams', [1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['MIA', 'dolphins', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['MIN', 'vikings', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['NE', 'patriots', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971]]
['NO', 'saints', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['NYG', 'giants', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['NYJ', 'jets', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['OAK', 'raiders', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['PHI', 'eagles', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['PHO', 'cardinals', [1993, 1992, 1991, 1990, 1989, 1988]]
['PIT', 'steelers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['RAI', 'raiders', [1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982]]
['RAM', 'rams', [1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982]]
['SD', 'chargers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['SEA', 'seahawks', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976]]
['SF', '49ers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['STL', 'cardinals', [1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
['STL', 'rams', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995]]
['TB', 'buccaneers', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976]]
['TEN', 'oilers', [1998, 1997]]
['TEN', 'titans', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999]]
['WAS', 'redskins', [2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986, 1985, 1984, 1983, 1982, 1981, 1980, 1979, 1978, 1977, 1976, 1975, 1974, 1973, 1972, 1971, 1970]]
For a quick migration ignoring years this should work
INSERT INTO team (team_id, city, name) VALUES
('BOS', 'Boston', 'Patriots'),
('LA', 'Los Angeles', 'Rams'),
('PHO', 'Phoenix', 'Cardinals'),
('RAI', 'Los Angeles', 'Raiders'),
('RAM', 'Los Angeles', 'Rams');
Don't forget to add the same to https://github.com/BurntSushi/nfldb/blob/master/nfldb/team.py
Okay I finally got the scores loaded in. It's far from an elegant solution, but it works. Some mascot names will still be incorrect. I added a folder with my scripts here: https://github.com/andr3w321/nflgame/tree/master/build-old-scripts The steps to get all the old game scores in the database are
psql -U nfldb nfldb < add-old-team-names.sql
and add the teams to your team.py file. Mine is located at ~/.local/lib/python2.7/site-packages/nfldb/team.py
python update_sched.py --build-old ../build-old-scripts/nfl-schedules
/.local/lib/python2.7/site-packages/nfldb/update.py
nfldb-update
Mine is located at ~/.local/bin/nfldb-update
psql -U nfldb nfldb < nfl-update-old-scores.sql
python print-avg-ppg-per-year.py
You should get the following output:Year: 1970 # of Games: 182 Avg Points Per Game: 38.52
Year: 1971 # of Games: 182 Avg Points Per Game: 38.73
Year: 1972 # of Games: 182 Avg Points Per Game: 40.51
Year: 1973 # of Games: 182 Avg Points Per Game: 38.91
Year: 1974 # of Games: 182 Avg Points Per Game: 36.36
Year: 1975 # of Games: 182 Avg Points Per Game: 41.18
Year: 1976 # of Games: 196 Avg Points Per Game: 38.31
Year: 1977 # of Games: 196 Avg Points Per Game: 34.35
Year: 1978 # of Games: 224 Avg Points Per Game: 36.67
Year: 1979 # of Games: 224 Avg Points Per Game: 40.13
Year: 1980 # of Games: 224 Avg Points Per Game: 40.97
Year: 1981 # of Games: 224 Avg Points Per Game: 41.35
Year: 1982 # of Games: 98 Avg Points Per Game: 41.18
Year: 1983 # of Games: 224 Avg Points Per Game: 43.66
Year: 1984 # of Games: 224 Avg Points Per Game: 42.42
Year: 1985 # of Games: 224 Avg Points Per Game: 43.06
Year: 1986 # of Games: 224 Avg Points Per Game: 41.04
Year: 1987 # of Games: 210 Avg Points Per Game: 43.20
Year: 1988 # of Games: 224 Avg Points Per Game: 40.51
Year: 1989 # of Games: 224 Avg Points Per Game: 41.21
Year: 1990 # of Games: 224 Avg Points Per Game: 40.25
Year: 1991 # of Games: 224 Avg Points Per Game: 37.97
Year: 1992 # of Games: 224 Avg Points Per Game: 37.46
Year: 1993 # of Games: 210 Avg Points Per Game: 37.09
Year: 1994 # of Games: 224 Avg Points Per Game: 40.51
Year: 1995 # of Games: 240 Avg Points Per Game: 42.98
Year: 1996 # of Games: 240 Avg Points Per Game: 40.85
Year: 1997 # of Games: 240 Avg Points Per Game: 41.49
Year: 1998 # of Games: 240 Avg Points Per Game: 42.56
Year: 1999 # of Games: 248 Avg Points Per Game: 41.63
Year: 2000 # of Games: 248 Avg Points Per Game: 41.35
Year: 2001 # of Games: 248 Avg Points Per Game: 40.42
Year: 2002 # of Games: 256 Avg Points Per Game: 43.35
Year: 2003 # of Games: 256 Avg Points Per Game: 41.66
Year: 2004 # of Games: 256 Avg Points Per Game: 42.97
Year: 2005 # of Games: 256 Avg Points Per Game: 41.23
Year: 2006 # of Games: 256 Avg Points Per Game: 41.32
Year: 2007 # of Games: 256 Avg Points Per Game: 43.38
Year: 2008 # of Games: 256 Avg Points Per Game: 44.06
Year: 2009 # of Games: 256 Avg Points Per Game: 42.93
Year: 2010 # of Games: 256 Avg Points Per Game: 44.07
Year: 2011 # of Games: 256 Avg Points Per Game: 44.36
Year: 2012 # of Games: 256 Avg Points Per Game: 45.51
Year: 2013 # of Games: 256 Avg Points Per Game: 46.82
Year: 2014 # of Games: 256 Avg Points Per Game: 45.18
FYI for anyone who does this I would recommend doing on a new database or after you do it renaming nfldb as it kind of breaks nfldb-update
which will try to update all 9000+ games and takes much longer than usual.
Hey, I'm working on a project with NFL stats at http://statstrac.com/. Just wanted to leave a comment that if you do import old score data and don't want to rename or make a new nfldb, just revert your update_sched.py
back to the original, then rebuild schedule.json
with python update_sched.py --rebuild
. Finally, edit update games_scheduled
in site-packages/nfldb/update.py
to: (adding select for game.season_year and ignoring if older than 2009)
def games_scheduled(cursor):
"""
Returns a list of GSIS identifiers corresponding to games that
have schedule data in the database but don't have any drives or
plays in the database. In the typical case, this corresponds to
games that haven't started yet.
The list is sorted in the order in which the games will be played.
"""
scheduled = []
cursor.execute('''
SELECT DISTINCT game.gsis_id, game.start_time, game.season_year
FROM game
LEFT JOIN drive
ON game.gsis_id = drive.gsis_id
WHERE drive.drive_id IS NULL
''')
for row in cursor.fetchall():
# This condition guards against unnecessarily processing games
# that have only schedule data but aren't even close to starting yet.
# Namely, if a game doesn't have any drives, then there's nothing to
# bulk insert.
#
# We start looking at games when it's 15 minutes before game time.
# Eventually, the game will start, and the first bits of drive/play
# data will be bulk inserted. On the next database update, the game
# will move to the `games_in_progress` list and updated incrementally.
#
# So what's the point of bulk inserting? It's useful when updates are
# done infrequently (like the initial load of the database or say,
# once a week).
if seconds_delta(row['start_time'] - nfldb.now()) < 900 and row['season_year'] >= 2009:
scheduled.append(row['gsis_id'])
return sorted(scheduled, key=int)
and that should do it :) Now you can safely run nfldb-update
without worrying about an insanely long execution time and all the imported scores/legacy team names being overwritten. It's just nice to have everything in one DB
Thanks for the fix. The only problem I found is that whenever I upgrade nfldb update.py
gets overritten so I wrote a shell script that I run after running pip installl nfldb --upgrade --user
that automatically makes the changes.
#!/bin/bash
sed -i "s/SELECT\ DISTINCT\ game.gsis_id,\ game.start_time$/SELECT\ DISTINCT\ game.gsis_id,\ game.start_time,\ game.season_year/g" ~/.local/lib/python2.7/site-packages/nfldb/update.py
sed -i "s/if\ seconds_delta(row\['start_time']\ -\ nfldb.now())\ <\ 900:/if\ seconds_delta(row\['start_time']\ -\ nfldb.now())\ <\ 900\ and\ row\['season_year']\ >=\ 2009:/g" ~/.local/lib/python2.7/site-packages/nfldb/update.py
If you want data before 2009, you can purchase it from armchairanalysis for $49 - they go back to 2000. It looks like the owner used VBA to parse the HTML pages from the pre-gamecenter JSON days.
Are there are any scripts already written to get data before 2009? I see that data at http://www.nfl.com/ajax/scorestrip?season=2014&seasonType=REG&week=1 goes back to 1970 for regular season. Is there an api doc page for nfl.com somewhere? This only gives gsis_ids, datetimes and scores. I'd be interested in more stats even if they're not complete drive by drive stats.
Update: I see you can match the gsis id with gamebook data here: http://www.nflgsis.com/1999/Reg/15/796/Gamebook.xml
In a somewhat unrelated note is it normal for
nflgame-update-players
to requiresudo
? I installed withpip install nfldb --user
and I getI do not have write access to "/usr/local/lib/python2.7/dist-packages/nflgame/players.json".
when I try running it without.