BurntSushi / nflgame

An API to retrieve and read NFL Game Center JSON data. It can work with real-time data, which can be used for fantasy football.
http://pdoc.burntsushi.net/nflgame
The Unlicense

Compiling Detailed Play by Play Data #251

Open stermi01 opened 8 years ago

stermi01 commented 8 years ago

Hey guys,

So I've tried to navigate the raw JSON data, but I had a load of trouble formatting it with the tools I was using.

For me, the easiest way to navigate the data is in CSV format. So I want to get each play in a game, along with detailed stats at the time of the play, like score/players/yards on the play/etc.

Is there an easier way of doing this apart from going through the directories to find a specific field?

Thanks in advance!

ochawkeye commented 8 years ago

:astonished: Not sure what you mean by navigating the JSON data, but there are absolutely better ways to do this! You shouldn't ever have to manually go into the JSON files. Everything in those files should be accessible from nflgame. What exactly have you tried?

import nflgame

games = nflgame.games(year=2016, week=6, kind='REG', home='ARI')
plays = nflgame.combine_plays(games)

for play in plays:
    print play
    for player in play.players:
        print '\t', player, player.formatted_stats()
...
(ARI, ARI 38, Q4, 2 and 10) (3:39) D.Stanton pass short middle to Ja.Brown to 50 for 12 yards (M.Gilchrist). ARI-Ja.Brown was injured during the play.
    M.Gilchrist defense_tkl: 1
    Ja.Brown receiving_yds: 12, receiving_rec: 1, receiving_tar: 1, receiving_yac_yds: 0
    D.Stanton passing_att: 1, passing_yds: 12, passing_cmp: 1, passing_cmp_air_yds: 12
(ARI, 50, Q4, 1 and 10) (3:00) A.Ellington up the middle to NYJ 37 for 13 yards (M.Gilchrist).
    M.Gilchrist defense_tkl: 1
    A.Ellington rushing_att: 1, rushing_yds: 13
(ARI, NYJ 37, Q4, 1 and 10) (2:11) A.Ellington up the middle to NYJ 37 for no gain (L.Williams).
    L.Williams defense_tkl: 1
    A.Ellington rushing_att: 1, rushing_yds: 0
Two-Minute Warning
(ARI, NYJ 37, Q4, 2 and 10) (2:00) S.Taylor up the middle to NYJ 35 for 2 yards (S.Richardson).
    S.Richardson defense_tkl: 1
    S.Taylor rushing_att: 1, rushing_yds: 2
(ARI, NYJ 35, Q4, 3 and 8) (1:15) S.Taylor up the middle to NYJ 23 for 12 yards (Jo.Jenkins).
    Jo.Jenkins defense_tkl: 1
    S.Taylor rushing_att: 1, rushing_yds: 12
(ARI, NYJ 23, Q4, 1 and 10) (:34) D.Stanton kneels to NYJ 24 for -1 yards.
    D.Stanton rushing_att: 1, rushing_yds: -1
END GAME
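If you want those per-player stats as columns rather than as text, one option is to split the formatted_stats() string back into name/value pairs. This is only a sketch that assumes the "name: value, name: value" layout seen in the output above; parse_formatted_stats and COLUMNS are made-up names for illustration, and nflgame may well expose the same values directly on the player object, which would be preferable.

```python
def parse_formatted_stats(text):
    """Turn 'receiving_yds: 12, receiving_rec: 1' into a dict of numbers."""
    stats = {}
    for pair in text.split(','):
        pair = pair.strip()
        if not pair:
            continue  # skip empty pieces, e.g. from an empty string
        name, _, value = pair.partition(':')
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            # Fall back to float in case a stat is not a whole number.
            stats[name.strip()] = float(value)
    return stats


# Hypothetical column list: pick whichever stats you want in the CSV.
COLUMNS = ['rushing_att', 'rushing_yds', 'receiving_rec', 'receiving_yds']

stats = parse_formatted_stats(
    'receiving_yds: 12, receiving_rec: 1, receiving_tar: 1, receiving_yac_yds: 0')
# Stats a player did not record on the play default to 0.
row = [stats.get(col, 0) for col in COLUMNS]
print(row)  # [0, 0, 1, 12]
```

Filling missing stats with 0 keeps every row the same width, which is what a CSV consumer will expect.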
stermi01 commented 8 years ago

@ochawkeye

For a bit of background, I'm trying to put it on an HPCC cluster to do some real fun stuff, but I had a lot of trouble reading the JSON file I pulled from the raw data. It's just so much easier to format it as CSV and use it that way. This is what I've been trying so far. It's somewhat successful, but I'm sure there's a much easier way.

import csv

import nflgame

year = 2015
games = nflgame.games(year)

schedule_keys = ['week', 'year', 'meridiem', 'month', 'time', 'wday', 'day']

# This is Python 2 (like the rest of the thread), where csv files are
# opened in binary mode.
with open('season%dplays.txt' % year, 'wb') as output_file:
    # csv.writer quotes any field that contains a comma, which play
    # descriptions frequently do.
    writer = csv.writer(output_file)
    writer.writerow(schedule_keys + ['play'])

    for g in games:
        # g.schedule is a dict, so index it directly; no loop is needed.
        schedule_fields = [str(g.schedule[key]) for key in schedule_keys]
        for drive in g.drives:
            for play in drive.plays:
                writer.writerow(schedule_fields + [str(play)])

I'm trying to get something similar to the play by play data on http://nflsavant.com/about.php so that I can constantly keep my data up to date. Plus what this package has in terms of information is so much more comprehensive, which is why I'm trying to figure it out.

ochawkeye commented 8 years ago

I think I misunderstood what you meant by "going through the directories to find a specific field?"

Are you just looking for this API documentation?

stermi01 commented 8 years ago

@ochawkeye I've been looking through that this morning, but I still haven't found anything that gets me something along the lines of this:

image

stermi01 commented 8 years ago

For those interested in the future, this is what I've accomplished so far:

import nflgame

MAX_PLAYERS = 9  # number of player and playerid columns

# Play-level stats, in the order they appear in each row.
STAT_FIELDS = [
    'passing_incmp', 'passing_incmp_air_yds', 'passing_cmp',
    'passing_cmp_air_yds', 'passing_first_down', 'passing_yds',
    'passing_att', 'punting_tot', 'punting_yds', 'rushing_att',
    'rushing_loss', 'rushing_loss_yds', 'rushing_yds', 'receiving_rec',
    'receiving_yac_yds', 'receiving_yds', 'receiving_tar', 'kicking_xpa',
    'kicking_xpmade', 'kicking_touchback', 'kicking_yds',
]

# The header lists ispassingatt before isrushingatt to match the order
# in which the two values are written below.
header = (
    ['week', 'year', 'month', 'time', 'wday', 'day',
     'note', 'quarter', 'offenseteam'] +
    STAT_FIELDS +
    ['player%d' % i for i in range(1, MAX_PLAYERS + 1)] +
    ['yardline', 'yardstogo', 'down',
     'istouchdown', 'ispassingatt', 'isrushingatt'] +
    ['playerid%d' % i for i in range(1, MAX_PLAYERS + 1)] +
    ['play']
)


def pad(values):
    # Fill unused player slots with empty strings. Plays with more than
    # MAX_PLAYERS players are left as-is and spill into extra columns.
    return values + [''] * (MAX_PLAYERS - len(values))


# Open the file once and write the header once, not once per season.
# Mode 'w' replaces any file left over from an earlier run.
with open('allseasonsplays.txt', 'w') as output_file:
    output_file.write(';'.join(header) + '\n')

    for season in range(2009, 2016):
        for g in nflgame.games(season):
            # g.schedule is a dict, so index it directly.
            s = g.schedule
            schedule_fields = [s['week'], s['year'], s['month'],
                               s['time'], s['wday'], s['day']]
            for drive in g.drives:
                for play in drive.plays:
                    players = list(play.players)
                    row = (
                        schedule_fields +
                        # play.time renders as e.g. 'Q1 14:53'
                        [play.note, play.time, play.team] +
                        [getattr(play, field) for field in STAT_FIELDS] +
                        pad([str(p) for p in players]) +
                        [play.yardline, play.yards_togo, play.down,
                         play.touchdown, play.passing_att, play.rushing_att] +
                        pad([p.playerid for p in players]) +
                        [play]
                    )
                    output_file.write(';'.join(str(v) for v in row) + '\n')

This will output the following:

week;year;month;time;wday;day;note;quarter;offenseteam;passing_incmp;passing_incmp_air_yds;passing_cmp;passing_cmp_air_yds;passing_first_down;passing_yds;passing_att;punting_tot;punting_yds;rushing_att;rushing_loss;rushing_loss_yds;rushing_yds;receiving_rec;receiving_yac_yds;receiving_yds;receiving_tar;kicking_xpa;kicking_xpmade;kicking_touchback;kicking_yds;player1;player2;player3;player4;player5;player6;player7;player8;player9;yardline;yardstogo;down;istouchdown;ispassingatt;isrushingatt;playerid1;playerid2;playerid3;playerid4;playerid5;playerid6;playerid7;playerid8;playerid9;play
1;2009;9;8:30;Thu;10;KICKOFF;Q1 15:00;TEN;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;67;M.Griffin;R.Bironas;S.Logan;;;;;;;OWN 30;0;0;False;0;0;00-0025406;00-0020962;00-0026491;;;;;;;(TEN, TEN 30, Q1) R.Bironas kicks 67 yards from TEN 30 to PIT 3. S.Logan to PIT 42 for 39 yards (M.Griffin).
1;2009;9;8:30;Thu;10;None;Q1 14:53;PIT;0;0;1;-3;0;5;1;0;0;0;0;0;0;1;8;5;1;0;0;0;0;C.Hope;H.Ward;B.Roethlisberger;;;;;;;OWN 42;10;1;False;1;0;00-0021219;00-0017162;00-0022924;;;;;;;(PIT, PIT 42, Q1, 1 and 10) (14:53) B.Roethlisberger pass short left to H.Ward to PIT 47 for 5 yards (C.Hope).
1;2009;9;8:30;Thu;10;None;Q1 14:16;PIT;0;0;0;0;0;0;0;0;0;1;1;-3;-3;0;0;0;0;0;0;0;0;S.Tulloch;W.Parker;;;;;;;;OWN 47;5;2;False;0;1;00-0024331;00-0022250;;;;;;;;(PIT, PIT 47, Q1, 2 and 5) (14:16) W.Parker right end to PIT 44 for -3 yards (S.Tulloch).
1;2009;9;8:30;Thu;10;None;Q1 13:35;PIT;1;34;0;0;0;0;1;0;0;0;0;0;0;0;0;0;1;0;0;0;0;B.Roethlisberger;M.Wallace;;;;;;;;OWN 44;8;3;False;1;0;00-0022924;00-0026901;;;;;;;;(PIT, PIT 44, Q1, 3 and 8) (13:35) (Shotgun) B.Roethlisberger pass incomplete deep right to M.Wallace. COVERAGE BY #24 HOPE
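For anyone reading a file like this back in, here is a rough sketch using the standard library's csv.DictReader with the same ';' delimiter. SAMPLE is a trimmed-down stand-in that keeps only a few of the columns above (the values come from the second data row), and read_plays is just a name I made up. It is written with Python 3 syntax and has nothing to do with nflgame itself.

```python
import csv
import io

# Trimmed sample: a few of the columns from the output above, same ';' delimiter.
SAMPLE = (
    "week;year;quarter;offenseteam;yardline;yardstogo;down;play\n"
    "1;2009;Q1 14:53;PIT;OWN 42;10;1;(PIT, PIT 42, Q1, 1 and 10) (14:53) "
    "B.Roethlisberger pass short left to H.Ward to PIT 47 for 5 yards (C.Hope).\n"
)


def read_plays(handle):
    """Yield one dict per play from a ';'-delimited file-like object."""
    reader = csv.DictReader(handle, delimiter=';')
    for row in reader:
        # The 'quarter' column holds two pieces, e.g. 'Q1 14:53'.
        quarter, _, clock = row['quarter'].partition(' ')
        row['quarter'] = quarter
        row['clock'] = clock
        # Every column arrives as text, so convert the numeric ones we need.
        for key in ('week', 'year', 'yardstogo', 'down'):
            row[key] = int(row[key])
        yield row


plays = list(read_plays(io.StringIO(SAMPLE)))
print(plays[0]['offenseteam'], plays[0]['quarter'], plays[0]['clock'])
```

With a real file you would pass an open file handle to read_plays instead of the io.StringIO wrapper.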

So far I've been able to use this to get some interesting play-by-play information in CSV format. A lot more can be grabbed or derived from the play description string.
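As a rough illustration of pulling fields out of the play description string, here is a sketch that parses the leading "(team, yardline, quarter, down and distance) (clock)" part with a regular expression. The pattern is an assumption based only on the descriptions shown in this thread, so other layouts (overtime, for example) may not match. PREFIX and parse_description are hypothetical names.

```python
import re

# Matches e.g. '(PIT, PIT 42, Q1, 1 and 10) (14:53) B.Roethlisberger pass ...'
# The down/distance and clock parts are optional (kickoffs lack them).
PREFIX = re.compile(
    r'^\((?P<team>[A-Z]+), (?P<yardline>(?:[A-Z]+ )?\d+), Q(?P<quarter>\d)'
    r'(?:, (?P<down>\d) and (?P<togo>\d+))?\)\s*'
    r'(?:\((?P<clock>\d*:\d\d)\)\s*)?'
    r'(?P<text>.*)$'
)


def parse_description(desc):
    """Return a dict of fields from a play description, or None if no match."""
    match = PREFIX.match(desc)
    if match is None:
        return None  # e.g. 'Two-Minute Warning' or 'END GAME'
    fields = match.groupdict()
    # Convert the numeric pieces that were present; missing ones stay None.
    for key in ('quarter', 'down', 'togo'):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


play = parse_description(
    '(PIT, PIT 42, Q1, 1 and 10) (14:53) B.Roethlisberger pass short left '
    'to H.Ward to PIT 47 for 5 yards (C.Hope).')
print(play['team'], play['down'], play['togo'], play['clock'])
```

Lines that do not start with the parenthesised prefix come back as None, so they are easy to filter out before writing rows.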