BurntSushi / nfldb

A library to manage and update NFL data in a relational database.
The Unlicense
1.08k stars 264 forks source link

Slow Query #132

Closed elbracht closed 8 years ago

elbracht commented 8 years ago

Hi everybody.

I use the following code to get stats (passing yards, rushing yards, ...) for each team. This takes usually around 10 seconds.

Is there any way to optimize the code?

import nfldb

db = nfldb.connect()

for team in nfldb.team.teams:
    q = nfldb.Query(db)
    q.game(season_year="2015", season_type="Regular", finished="True")
    q.play_player(team=team[0])

    passing_yds = 0
    rushing_yds = 0
    sacks = 0
    touchdowns = 0
    fieldgoals = 0
    fieldgoals_attempts = 0

    for x in q.as_aggregate():
        passing_yds += x.passing_yds
        rushing_yds += x.rushing_yds
        sacks += x.passing_sk
        touchdowns += x.defense_frec_tds
        touchdowns += x.defense_int_tds
        touchdowns += x.defense_misc_tds
        touchdowns += x.fumbles_rec_tds
        touchdowns += x.kicking_rec_tds
        touchdowns += x.kickret_tds
        touchdowns += x.puntret_tds
        touchdowns += x.receiving_tds
        touchdowns += x.rushing_tds
        fieldgoals += x.kicking_fgm
        fieldgoals_attempts += x.kicking_fga
BurntSushi commented 8 years ago

You might try running a single query instead of a query for each team, and aggregating in Python using a dictionary keyed on team.

The fastest way is to write a SQL query that does all of the aggregation in database.

elbracht commented 8 years ago

Thank you. I use a SQL query for the aggregation in database. This runs faster.