Closed morefromalan closed 10 years ago
Ah wow! I was almost about to say, "Sorry, but the data is sometimes inaccurate." Instead, I dug a little bit. I'll share my process in case someone else finds it useful.
First, I wanted to reproduce it, so I ran nflstats LeSean McCoy
(from nflcmd):
[andrew@Liger ~] nflstats LeSean McCoy
Player matched: LeSean McCoy (PHI, RB)
Week W/L Date OPP R Att R Yds Y/Att R TDs WR Rec WR Tar WR Yds Y/Att WR TDs F Lost KR Yds KR TDs PR Yds PR TDs
------------------------------------------------------------------------------------------------------------------------------------
1 W Sep 9 @WAS 31 184 5.9 1 1 1 5 5.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
2 L Sep 15 SD 11 53 4.8 0 5 6 114 22.8 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
3 L Sep 19 KC 20 158 7.9 1 0 1 0 0.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
4 L Sep 29 @DEN 16 73 4.6 0 1 3 21 21.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
5 W Oct 6 @NYG 20 46 2.3 1 6 8 46 7.7 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
6 W Oct 13 @TB 25 116 4.6 0 2 2 55 27.5 0 1 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
7 L Oct 20 DAL 18 55 3.1 0 6 7 26 4.3 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
8 L Oct 27 NYG 15 48 3.2 0 4 5 18 4.5 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
9 W Nov 3 @OAK 12 44 3.7 0 4 4 36 9.0 1 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
10 W Nov 10 @GB 25 155 6.2 0 1 1 6 6.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
11 W Nov 17 WAS 20 77 3.9 2 4 4 73 18.2 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
13 W Dec 1 ARI 19 79 4.2 0 5 6 36 7.2 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
14 W Dec 8 DET 29 217 7.5 2 1 1 4 4.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
15 L Dec 15 @MIN 8 38 4.8 0 5 7 68 13.6 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
16 W Dec 22 CHI 6 37 6.2 1 2 2 9 4.5 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
17 W Dec 29 @DAL 4 27 6.8 0 0 1 0 0.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
- - - - 279 1407 5.0 8 47 59 517 11.0 1 1 0 0 0 0
Then I took a look at McCoy's game log, and indeed, weeks 16 and 17 are way off.
The next step is to investigate whether nflgame has the same error:
import nflgame
mccoy = nflgame.find('LeSean McCoy')[0]
stats = mccoy.stats(2013)
print stats.rushing_yds
Output: 1607
.
Yikes! I'm not sure what happened, but maybe something happened while updating. (I keep it running while the games are playing for live updates.) Once the game is considered "finished", no more updates are made. So if the update failed while the game was playing---and only updated for the first quarter, for example---then nfldb would eventually consider that game finished. Therefore, nfldb-update
wouldn't try to update it again.
Luckily, the fix is simple. Just blow away week 16 and week 17 from the database and rebuild it with nfldb-update
:
[andrew@Liger ~] psql nfldb -c "DELETE FROM game WHERE (season_year, season_type) = (2013, 'Regular') AND week IN (16, 17);"
DELETE 32
[andrew@Liger ~] nfldb-update
...
(N.B. This will automatically delete all drives, plays and stats associated with those games because of cascading foreign key constraints.)
And now nflstats LeSean McCoy
shows the correct output:
[andrew@Liger ~] nflstats LeSean McCoy
Player matched: LeSean McCoy (PHI, RB)
Week W/L Date OPP R Att R Yds Y/Att R TDs WR Rec WR Tar WR Yds Y/Att WR TDs F Lost KR Yds KR TDs PR Yds PR TDs
------------------------------------------------------------------------------------------------------------------------------------
1 W Sep 9 @WAS 31 184 5.9 1 1 1 5 5.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
2 L Sep 15 SD 11 53 4.8 0 5 6 114 22.8 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
3 L Sep 19 KC 20 158 7.9 1 0 1 0 0.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
4 L Sep 29 @DEN 16 73 4.6 0 1 3 21 21.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
5 W Oct 6 @NYG 20 46 2.3 1 6 8 46 7.7 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
6 W Oct 13 @TB 25 116 4.6 0 2 2 55 27.5 0 1 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
7 L Oct 20 DAL 18 55 3.1 0 6 7 26 4.3 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
8 L Oct 27 NYG 15 48 3.2 0 4 5 18 4.5 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
9 W Nov 3 @OAK 12 44 3.7 0 4 4 36 9.0 1 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
10 W Nov 10 @GB 25 155 6.2 0 1 1 6 6.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
11 W Nov 17 WAS 20 77 3.9 2 4 4 73 18.2 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
13 W Dec 1 ARI 19 79 4.2 0 5 6 36 7.2 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
14 W Dec 8 DET 29 217 7.5 2 1 1 4 4.0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
15 L Dec 15 @MIN 8 38 4.8 0 5 7 68 13.6 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
16 W Dec 22 CHI 18 133 7.4 2 6 6 29 4.8 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
17 W Dec 29 @DAL 27 131 4.9 0 1 2 3 3.0 1 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------------------
- - - - 314 1607 5.1 9 52 64 540 10.4 2 1 0 0 0 0
I've updated the database dump so that future builds will have the correct data.
Thank you!
Thanks @BurntSushi !
Love the DB. Spotted an issue:
If I import the DB and run this query:
Rushing yards look accurate vs ESPN/NFL records: http://espn.go.com/nfl/statistics/player/_/stat/rushing/sort/rushingYards
except for LeSean McCoy who is off by 200 yards.