BurntSushi / nfldb

A library to manage and update NFL data in a relational database.
The Unlicense

Help Finding Each Teams Rushing Yards Each Game #155

Open theSanchize3 opened 8 years ago

theSanchize3 commented 8 years ago

I'm trying to determine the strength of the association between wins and a high amount of rushing yards. I made an empty list called over150 to store 'Y' or 'N' values depending on whether the team had over 150 rushing yards that game. I have another list called winLoss to store 'W' or 'L' values depending on whether the team won. I have a list called teamsList with the abbreviations of all the teams. Finally, rushArray combines over150 and winLoss so I can compare them. I created counters for each combination of W/L and Y/N.

Here is what I have:

import nfldb

db = nfldb.connect()

over150 = []
winLoss = []
rushArray = []

teamRushYds = 0

teamRush = []

teamsList = ['ARI','STL','ATL','BAL','BUF','CAR','CHI','CIN','CLE','DAL','DEN','DET','GB','KC','HOU','IND','JAC','MIA','MIN','NE','NO','NYG','NYJ','OAK','PHI','PIT','SD','SF','SEA','TB','TEN','WAS']

for t in teamsList:
    for i in range(1,18):
        q = nfldb.Query(db)
        q.game(season_year=2015,season_type='Regular',week=i,team=t)
        q.play_player(team=t)
        pps = q.as_aggregate()
        teamRushYds = sum(pp.rushing_yds for pp in pps)
        if teamRushYds >= 150:
            over150.append('Y')
        elif teamRushYds < 150:
            over150.append('N')

        q = nfldb.Query(db)
        q.game(season_year=2015,season_type='Regular',week=i,team=t)
        for g in q.as_games():
            if g.home_team == t:
                if g.home_score > g.away_score:
                    winLoss.append('W')
                elif g.home_score < g.away_score:
                    winLoss.append('L')
                else:
                    winLoss.append('T')
            elif g.away_team == t:
                if g.home_score < g.away_score:
                    winLoss.append('W')
                elif g.home_score > g.away_score:
                    winLoss.append('L')
                else:
                    winLoss.append('T')

print over150
print len(over150)
print winLoss
print len(winLoss)

for k in xrange(len(over150)):
    rushArray.append([over150[k],winLoss[k]])

counterYW = 0
counterYL = 0
counterNW = 0
counterNL = 0

for pair in rushArray:
    if pair == ['Y','W']:
        counterYW += 1
    elif pair == ['Y','L']:
        counterYL += 1
    elif pair == ['N','W']:
        counterNW += 1
    elif pair == ['N','L']:
        counterNL += 1

print 'YW: ' + str(counterYW)
print 'YL: ' + str(counterYL)
print 'NW: ' + str(counterNW)
print 'NL: ' + str(counterNL)

counterYW counts the number of Wins with over 150 yards. counterYL counts the number of Losses with over 150 yards, etc.

The program does not run correctly, however, because over150 has 544 elements while winLoss has the expected 512. Why does over150 have 32 more elements than winLoss?

Please help, Thanks

ochawkeye commented 8 years ago

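You are adding a "Y" or "N" to over150 17 times for each team, once for every week in range(1,18), whether or not the team played that week. But each team has one bye week during the season and plays only 16 games. In the bye week the aggregate query returns no players, the sum is 0, and an "N" is appended anyway, while the q.as_games() loop finds no game and appends nothing to winLoss.

Since there are 32 teams in the league, over150 ends up with 32 x 17 = 544 elements and winLoss with 32 x 16 = 512, so your over150 list is 32 elements longer than your winLoss list.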
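
One way to keep the two lists aligned is to record the yardage flag and the result together, and only for weeks in which a game exists. The snippet below is a minimal sketch of that idea, not nfldb code: the schedule dict and the pair_up function are hypothetical stand-ins for the database queries, and the yardage numbers are made up. It is written to run on both Python 2 and Python 3.

```python
# Hypothetical stand-in for the database: week -> (rushing yards, result).
# A bye week simply has no entry, just as q.as_games() yields no game for it.
def pair_up(schedule, weeks=range(1, 18), threshold=150):
    pairs = []
    for week in weeks:
        game = schedule.get(week)
        if game is None:
            # Bye week: skip it entirely so nothing is appended.
            continue
        rush_yds, result = game
        flag = 'Y' if rush_yds >= threshold else 'N'
        pairs.append((flag, result))
    return pairs

# Toy 17-week season with a bye in week 9 (made-up numbers).
schedule = dict(
    (w, (100 + 10 * w, 'W' if w % 2 else 'L'))
    for w in range(1, 18) if w != 9
)

pairs = pair_up(schedule)
print(len(pairs))
```

Because each game contributes exactly one (flag, result) pair, the lengths can no longer drift apart, and a flag can never be matched with the result of a different week.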

ochawkeye commented 8 years ago

Here is a version of what you were after that might give you some new ideas. Note that it counts games with strictly more than 150 rushing yards.

import nfldb

def team_rushing_total(db, year, s_type, week, team):
    # Total rushing yards for one team in one game, summed over its players.
    q = nfldb.Query(db)
    q.game(season_year=year, season_type=s_type, week=week, team=team)
    q.play_player(team=team)
    pps = q.as_aggregate()
    return sum(pp.rushing_yds for pp in pps)

counterYW = counterYL = counterNW = counterNL = 0

db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2015, season_type='Regular')
# Loop over the games that were actually played, so bye weeks never come up.
for game in q.as_games():
    if team_rushing_total(db, game.season_year, game.season_type, game.week, game.winner) > 150:
        counterYW += 1
    else:
        counterNW += 1

    if team_rushing_total(db, game.season_year, game.season_type, game.week, game.loser) > 150:
        counterYL += 1
    else:
        counterNL += 1

print 'YW: ' + str(counterYW)
print 'NW: ' + str(counterNW)
print 'YL: ' + str(counterYL)
print 'NL: ' + str(counterNL)
print '='*25
print 'Winners: {}'.format(counterYW+counterNW)
print 'Losers:  {}'.format(counterYL+counterNL)

Output:

YW: 74
NW: 182
YL: 22
NL: 234
=========================
Winners: 256
Losers:  256
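
To turn those four counts into a rough measure of the association the original question asked about, one option is to compare win rates and compute an odds ratio. This is only a sketch of one possible summary, not something proposed in the thread: the win_rate helper is a hypothetical name, and the counts are copied from the output above. It runs on both Python 2 and Python 3.

```python
def win_rate(wins, losses):
    # float() keeps the division correct on Python 2 as well.
    return float(wins) / (wins + losses)

# Counts copied from the output above.
yw, nw, yl, nl = 74, 182, 22, 234

rate_over = win_rate(yw, yl)    # teams that rushed for more than 150 yards
rate_under = win_rate(nw, nl)   # teams that did not

# Odds of winning with more than 150 yards divided by the odds without.
odds_ratio = float(yw * nl) / (yl * nw)

print('Win rate, over 150:  {:.3f}'.format(rate_over))
print('Win rate, otherwise: {:.3f}'.format(rate_under))
print('Odds ratio:          {:.2f}'.format(odds_ratio))
```

With these counts, teams that rushed for more than 150 yards won 74 of 96 games (about 77%), against 182 of 416 (about 44%) for teams that did not, which gives an odds ratio of roughly 4.3. That describes the 2015 regular season only and says nothing about which way the causation runs, since teams that are already ahead tend to run the ball more.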