:P Yup, that sounds right.
You've uninstalled your previous pip, right?
yes i have. i got this:
c:\Python27\Scripts>python get-pip.py
so i should do it like this instead: c:\Python27\Scripts>python < C:\Users\Rugh\Desktop\nfldbget-pip.py ?
c:\Python27\Scripts>python < c:\users\rugh\desktop\nfldb\get-pip.py
File "
i am going to re-download it with the 2nd link you provided
c:\Python27\Scripts>python < c:\users\rugh\desktop\nfldb\get-pip.py
Requirement already up-to-date: pip in c:\python27\lib\site-packages
Cleaning up...
Get rid of that <.
c:\python27\python.exe c:\users\rugh\desktop\nfldb\get-pip.py
this:
c:\Python27\Scripts>python c:\users\rugh\desktop\nfldb\get-pip.py
Requirement already up-to-date: pip in c:\python27\lib\site-packages
Cleaning up...
then this:
c:\Python27>python pip.py install nfldb
Requirement already up-to-date: pip in c:\python27\lib\site-packages
Downloading/unpacking install
Could not find any downloads that satisfy the requirement install
Some externally hosted files were ignored (use --allow-external install to allow).
Cleaning up...
No distributions at all found for install
Storing debug log for failure in C:\Users\Rugh\pip\pip.log
It looks like you didn't uninstall pip (or the pip uninstaller didn't do a very good job).
Go into C:/Python27/lib/site-packages and just delete the pip and nfldb directories. (The directories may be called pip-something-something, but delete them just the same.) Then retry python C:/.../get-pip.py.
i am erasing folders called "pip" "nfldb" & "pip-1.5.6.dist-info" along with files called "nfldb-0.2.0py2.7.egg-info"
now this:
c:\Python27\Scripts>python c:\users\rugh\desktop\nfldb\get-pip.py
Downloading/unpacking pip
Installing collected packages: pip
Successfully installed pip
Cleaning up...
and then this:
c:\Python27>python pip.py install nfldb
Requirement already up-to-date: pip in c:\python27\lib\site-packages
Downloading/unpacking install
Could not find any downloads that satisfy the requirement install
Some externally hosted files were ignored (use --allow-external install to allow).
Cleaning up...
No distributions at all found for install
Storing debug log for failure in C:\Users\Rugh\pip\pip.log
How about c:\python27\scripts\pip.exe install nfldb?
just tried this with this result:
c:\Python27\Scripts>python pip.exe install nfldb
Traceback (most recent call last):
File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "C:\Python27\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "pip.exemain.py", line 9, in
and what you suggested with this result:
c:\Python27\Scripts>python pip.exe install nfldb
Traceback (most recent call last):
File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "C:\Python27\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "pip.exemain.py", line 9, in
however, this time my _vendor folder is in fact missing...
wait.. no it is not... it is w/in pip.. sorry
I'm available for a TeamViewer session if you have time.
how long will you be available for? i have to run out real quick..
nevermind, lets do it
email sent..
awesome.
@ochawkeye thank you! @BurntSushi thank you!
@roodhouse Got it? Yay!
@ochawkeye I am fiercely curious what the issue was...
@BurntSushi 'Twas a rogue file named 'pip.py' in the Python execution folder. I guess that file was being given preferential treatment over the actual pip module and was intercepting the commands that were meant for pip. It's the same as if I* were to create an 'nflgame.py' file and paste it into my working directory: Python would try to use that instead of the actual nflgame module.
*not that I ever did that when I first started playing with nflgame and it took me two days to figure out what the heck was wrong with your "junk" python code
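For anyone who lands here with the same symptom, here is a minimal sketch of that shadowing effect and of one way to spot it: check the module's __file__ to see where Python actually loaded it from. The module name shadow_demo_module and both temporary directories are invented for the demo; nothing here touches a real pip or nflgame install. It should run under Python 2.7 or 3.

```python
import os
import shutil
import sys
import tempfile


def write_module(directory, name, body):
    """Write body to <directory>/<name>.py and return the file path."""
    path = os.path.join(directory, name + ".py")
    with open(path, "w") as handle:
        handle.write(body)
    return path


# Two stand-in directories (both hypothetical): one plays the role of
# site-packages, the other plays the folder you launch Python from.
site_dir = tempfile.mkdtemp(prefix="fake_site_packages_")
work_dir = tempfile.mkdtemp(prefix="fake_working_dir_")

write_module(site_dir, "shadow_demo_module", "ORIGIN = 'real module'\n")
write_module(work_dir, "shadow_demo_module", "ORIGIN = 'rogue file'\n")

# Python searches sys.path from front to back and stops at the first match.
# Putting work_dir ahead of site_dir models the script's own folder being
# searched before site-packages.
sys.path.insert(0, site_dir)
sys.path.insert(0, work_dir)

import shadow_demo_module  # noqa: E402 (import placed after the setup on purpose)

shadowed_origin = shadow_demo_module.ORIGIN
loaded_from_dir = os.path.dirname(os.path.realpath(shadow_demo_module.__file__))
expected_dir = os.path.realpath(work_dir)

print("Loaded: " + shadowed_origin)
print("From:   " + loaded_from_dir)

# Tidy up the demo directories and the sys.path entries.
sys.path.remove(work_dir)
sys.path.remove(site_dir)
shutil.rmtree(work_dir)
shutil.rmtree(site_dir)
```

If an import ever behaves strangely, printing the module's __file__ like this is a quick first check; a path inside your own working folder instead of site-packages points to a shadowing file.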
@ochawkeye Ah ha! Nice find.
Yeah, 99% of all Python problems are due to its insane import resolution semantics. Gah.
One reason to move to Python 3 is that pip is now included in every Python 3.4 distribution.
I only hope the frustration of the past couple of days hasn't scared away @roodhouse! nflgame|db|vid|fan, and Python in general, can be a very rewarding hobby.
First of all, thanks so much for putting this together, this is some amazing stuff, and I've wanted to work with data like this for a while now.
Hey, I'm reading this now, and I'm basically in the same predicament. I have gotten the nfldb db loaded in postgresql (for ease, I'll call it SQLnfldb), but there may be issues with the python nfldb (PYTnfldb). I got the get-pip.py file downloaded and I ran it. I thought it installed okay, but when I tried to run the test example I get an empty space
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='regular')
<nfldb.query.Query object at 0x03562A70>
I tried pressing through this (I wasn't sure if this was an error or not), and when I ran
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
It seems to work okay for 2012 one time, but since then, I get nothing. I just wanted to confirm if I should be seeing the "Query object at 0x0#######" or if that's a sign that something is wrong, or if there's another issue in play.
Thanks again.
Strings in Python are case sensitive, so you need to watch out for that: season_type='Regular'
Running this code:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
should give you this result:
Peyton Manning (DEN, QB) 5477
Drew Brees (NO, QB) 5139
Matthew Stafford (DET, QB) 4647
Matt Ryan (ATL, QB) 4515
Philip Rivers (SD, QB) 4478
Tom Brady (NE, QB) 4338
Andy Dalton (CIN, QB) 4296
Carson Palmer (ARI, QB) 4274
Ben Roethlisberger (PIT, QB) 4147
Joe Flacco (BAL, QB) 3912
Changing that one character in my code:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
now gives me a Python error:
Traceback (most recent call last):
File "P:\Projects\Home Computer\Fantasy Football\2013\scratch7.py", line 6, in <module>
for pp in q.sort('passing_yds').limit(10).as_aggregate():
File "C:\Python27\lib\site-packages\nfldb\query.py", line 925, in as_aggregate
cur.execute(q)
File "C:\Python27\lib\site-packages\psycopg2\extras.py", line 223, in execute
return super(RealDictCursor, self).execute(query, vars)
psycopg2.DataError: invalid input value for enum season_phase: "regular"
LINE 8: WHERE (((game.season_type = 'regular') AND (...
Is this what you're seeing?
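One way to avoid tripping over the capitalization is to normalize the value before it ever reaches the query. A rough sketch follows; normalize_season_type is a hypothetical helper, not part of nfldb. Only 'Regular' is confirmed by this thread, and the other two labels are my assumption about the season_phase enum, so adjust the tuple to match your database.

```python
# Assumed enum labels. 'Regular' appears in this thread; 'Preseason' and
# 'Postseason' are assumptions and should be checked against your database.
SEASON_TYPES = ("Preseason", "Regular", "Postseason")


def normalize_season_type(value):
    """Return the canonical spelling of value, or raise ValueError.

    Hypothetical helper: matching ignores case and surrounding whitespace.
    """
    wanted = value.strip().lower()
    for label in SEASON_TYPES:
        if label.lower() == wanted:
            return label
    raise ValueError(
        "unknown season_type %r; expected one of: %s"
        % (value, ", ".join(SEASON_TYPES))
    )


print(normalize_season_type("regular"))
```

The result could then be passed along as season_type=normalize_season_type(user_input), so a typo fails early with a readable message instead of a database error at query time.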
No... It's going through. It just seems to be... empty. I just restarted IDLE
I did this:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular')
And got this message: <nfldb.query.Query object at 0x03518030>
I then ran
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
and got the same results as you
Peyton Manning (DEN, QB) 5477
Drew Brees (NO, QB) 5139
Matthew Stafford (DET, QB) 4647
Matt Ryan (ATL, QB) 4515
Philip Rivers (SD, QB) 4478
Tom Brady (NE, QB) 4338
Andy Dalton (CIN, QB) 4296
Carson Palmer (ARI, QB) 4274
Ben Roethlisberger (PIT, QB) 4147
Joe Flacco (BAL, QB) 3912
But now, when I ran
q.game(season_year=2012, season_type='Regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
underneath, I got a blank (usually I had been hitting enter and then seeing results), and then (>>>) with program waiting for me to make a command.
I guess I haven't asked what I suppose should be an obvious question, do I need to re-enter
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular')
every time?
And again, thanks to yourself and burntsushi, this is super fun stuff.
Ahhh...I'm beginning to understand. The IDLE shell has its uses, but it is not much more than a learning tool for Python.
Try clicking File -> New File from the menu. Paste all of that code you posted above into the Untitled document window that opens and save the file with an appropriate name, maybe something like top-ten-qbs.py. Now, in the same window where you wrote your code, click Run -> Run Module (or simply press F5). Control will switch back to the shell window that was open in the background and all of your Python code will execute at once, instead of line by line as you were typing it into the shell.
But you don't really need IDLE to do any of this. With that same file you just created (which you could have created with the text editor of your preference), you can fire up a command prompt and enter the following:
Microsoft Windows [Version 6.3.9600]
(c) 2013 Microsoft Corporation. All rights reserved.
C:\Users\OCHawkeye> python c:\path\to\my\file\top-ten-qbs.py
Drew Brees (NO, QB) 5177
Matthew Stafford (DET, QB) 4965
Tony Romo (DAL, QB) 4903
Tom Brady (NE, QB) 4799
Matt Ryan (ATL, QB) 4719
Peyton Manning (DEN, QB) 4667
Andrew Luck (IND, QB) 4374
Aaron Rodgers (GB, QB) 4303
Josh Freeman (UNK, UNK) 4065
Carson Palmer (ARI, QB) 4018
C:\Users\OCHawkeye>
do I need to re-enter ... every time?
I think you just asked a question asked by every beginner programmer. The answer, of course, is a resounding NO! That's probably why you're doing this in the first place. Sure, you could look up each of the yardage totals for each of those QBs on NFL.com and create the table yourself - or you could have the Python code do the work for you.
If you find yourself typing redundant code over and over again, there is a good chance you could be consolidating that into a much more succinct set of instructions.
Say you wanted to find out who led the league in passing in 2013, 2012, and 2011. You could always go the route of copying and pasting your working example above, changing the season_year value each time.
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
print '-'*79
q = nfldb.Query(db)
q.game(season_year=2012, season_type='Regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
print '-'*79
q = nfldb.Query(db)
q.game(season_year=2011, season_type='Regular')
for pp in q.sort('passing_yds').limit(10).as_aggregate():
    print pp.player, pp.passing_yds
There sure is a lot of code that is written and rewritten again and again there. It could easily be refactored into the following:
import nfldb
db = nfldb.connect()
def top_10_qb_passing_yds(db, yr):
    q = nfldb.Query(db)
    q.game(season_year=yr, season_type='Regular')
    for pp in q.sort('passing_yds').limit(10).as_aggregate():
        print pp.player, pp.passing_yds

for year in [2013, 2012, 2011]:
    top_10_qb_passing_yds(db, year)
    print '-'*79
I see now that there is another way to interpret your question. The answer to that version is:
Do I need to re-enter <> every time?:
import nfldb
- nope, you only need to import 1 time
db = nfldb.connect()
- nope, you are only establishing a single connection to the database
q = nfldb.Query(db)
- yep, if you want to run a new query, you have to re-enter this
q.game(season_year=2013, season_type='Regular')
- yep, if you ran a new query above, this is what you would use to filter the query of the entire database down to a single year and season-type.
Do I need to re-enter <> every time?: (continued)
On the other hand, if you're doing more/different stuff with the same query, then no, you don't need to re-enter the top lines every time.
For example, if I'm doing multiple sorts of the same data pulled from the database, then I only need to pull the data from the database one time. The following is perfectly valid code.
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular')
for pp in q.sort('passing_yds').limit(3).as_aggregate():
    print pp.player, pp.passing_yds
print '-'*79
for pp in q.sort('rushing_yds').limit(3).as_aggregate():
    print pp.player, pp.rushing_yds
print '-'*79
for pp in q.sort('receiving_yds').limit(3).as_aggregate():
    print pp.player, pp.receiving_yds
Peyton Manning (DEN, QB) 5477
Drew Brees (NO, QB) 5139
Matthew Stafford (DET, QB) 4647
-------------------------------------------------------------------------------
LeSean McCoy (PHI, RB) 1607
Matt Forte (CHI, RB) 1341
Jamaal Charles (KC, RB) 1288
-------------------------------------------------------------------------------
Josh Gordon (CLE, WR) 1646
Calvin Johnson (DET, WR) 1489
Antonio Brown (PIT, WR) 1438
I tried to run the update module, and it looks like it worked somewhat, but I got some errors too.
I get an error saying "no module named httplib2" and a separate error saying 'python.exe -m nflgame.update_players --no-block' failed (exit status 1)
otherwise, it seemed to load okay.
for year in [2013, 2012, 2011]:
    top_10_qb_passing_yds(db, year)
    print '-'*79
I was about to ask what the print '-'*79 was, then I ran it and realized it was a separator.
I have another question, which may be a totally different "issue", or one whose answer already exists.
When I ran that top 10 QBs for 2012 I got this result in the top 10:
Josh Freeman (UNK, UNK) 4065
Is there a simple method to refer to the team he played for during that season (TB), rather than UNK, (which is his current situation)?
Thanks again.
httplib2 is a dependency of nfldb and should have been installed if you used pip to do the nfldb installation.
What does it say if you try to pip install nfldb now? Should look like the following:
C:\Users\Ben>pip install nfldb
Requirement already satisfied (use --upgrade to upgrade): nfldb in d:\python27\lib\site-packages
Requirement already satisfied (use --upgrade to upgrade): nflgame>=1.2.2 in d:\python27\lib\site-packages (from nfldb)
Requirement already satisfied (use --upgrade to upgrade): psycopg2 in d:\python27\lib\site-packages (from nfldb)
Requirement already satisfied (use --upgrade to upgrade): enum34 in d:\python27\lib\site-packages (from nfldb)
Requirement already satisfied (use --upgrade to upgrade): pytz in d:\python27\lib\site-packages (from nfldb)
Requirement already satisfied (use --upgrade to upgrade): httplib2 in d:\python27\lib\site-packages (from nflgame>=1.2.2->nfldb)
Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4 in d:\python27\lib\site-packages (from nflgame>=1.2.2->nfldb)
Cleaning up...
Notice the line
Requirement already satisfied (use --upgrade to upgrade): httplib2 in d:\python27\lib\site-packages (from nflgame>=1.2.2->nfldb)
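If pip itself is being difficult, another quick way to see which dependencies are actually importable is to try importing each one. This is only a sketch: find_missing is a hypothetical helper, and the import names in REQUIRED_MODULES are my assumption of what the pip packages listed above install (in particular enum for enum34 and bs4 for beautifulsoup4).

```python
import importlib

# Assumed import names for the packages in the pip output above.
# enum34 is imported as "enum" and beautifulsoup4 as "bs4".
REQUIRED_MODULES = ["nflgame", "psycopg2", "enum", "pytz", "httplib2", "bs4"]


def find_missing(module_names):
    """Return the names from module_names that fail to import."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules: " + ", ".join(missing))
else:
    print("All dependencies import cleanly.")
```

Anything it reports as missing can then be installed individually with pip, using the package name rather than the import name where the two differ.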
Huh... now I get an error message 'pip' is not a recognized as an internal or external command.
last time, I installed by doing python pip.exe install nfldb (or maybe pip.exe install nfldb) but from the directory of c:\python27\Scripts
I tried doing it from that directory and it worked, the first 5 were the same as yours, but I didn't have the httplib or beautifulsoup lines.
Is there a simple method to refer to the team he played for during that season (TB), rather than UNK, (which is his current situation)?
Interesting question, and one I can't immediately answer. Of course, it gets complicated when a player hops from team to team to team, but the truth is that nfldb "knows" he played for Tampa Bay that year, even though that fact is not explicitly tied to Josh Freeman's meta data anywhere.
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year=2012, season_type='Regular')
q.play_player(team='TB')
for pp in q.sort('passing_yds').limit(3).as_aggregate():
    print '%s - %s yards passing' % (pp.player, pp.passing_yds)
for pp in q.sort('rushing_yds').limit(3).as_aggregate():
    print '%s - %s yards rushing' % (pp.player, pp.rushing_yds)
for pp in q.sort('receiving_yds').limit(3).as_aggregate():
    print '%s - %s yards receiving' % (pp.player, pp.receiving_yds)
Josh Freeman (UNK, UNK) - 4065 yards passing
Dan Orlovsky (DET, QB) - 51 yards passing
Mike Williams (BUF, WR) - 28 yards passing
Doug Martin (TB, RB) - 1454 yards rushing
LeGarrette Blount (PIT, RB) - 151 yards rushing
Josh Freeman (UNK, UNK) - 135 yards rushing
Vincent Jackson (TB, WR) - 1384 yards receiving
Mike Williams (BUF, WR) - 996 yards receiving
Doug Martin (TB, RB) - 472 yards receiving
I'm sure @burntsushi can give a thorough explanation.
Re: 'pip' is not a recognized as an internal or external command
Just need to add C:\Python27\Scripts to your system's PATH environment variable; see #21.
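To check whether that directory is already on PATH before editing anything, something like the following rough sketch works. is_on_path is a hypothetical helper; the comparison ignores case and trailing slashes, which suits Windows paths but is an assumption that would be wrong on case-sensitive systems.

```python
import os


def is_on_path(directory, path_value=None, sep=os.pathsep):
    """Return True if directory is one of the entries in a PATH-style string.

    Hypothetical helper. path_value defaults to the current PATH. The match
    ignores case and trailing slashes, which is a Windows-oriented assumption.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    wanted = directory.rstrip("\\/").lower()
    entries = [entry.strip().rstrip("\\/").lower() for entry in path_value.split(sep)]
    return wanted in entries


# On the Windows machine in question this checks the live PATH.
print(is_on_path("C:\\Python27\\Scripts"))
```

If it prints False, adding the Scripts folder to PATH and opening a fresh command prompt should let a bare pip command resolve.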
@iliketowel With regards to the team for Josh Freeman. This gets a bit hairy.
Here is a central truth about the data in nfldb: its meta data about a player is always current. The data is meant to capture information about the player as he exists this moment. This means that only an active roster spot will give a player a team. This meta data is what you get when you use pp.player. Namely, it retrieves player meta data for the player statistic pp.
With that said, every individual statistic for a player also has a team attached to it. This is historical data, so that the proper team for every player stays fixed.
So that means, if you're listing individual play statistics, you can always print the right team. For example:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year='2012', season_type='Regular', week=1)
q.player(full_name='Josh Freeman')
q.play_player(passing_yds__ge=15)
for pp in q.as_play_players():
    print pp.player.full_name, pp.team, pp.passing_yds
And the output:
[andrew@Liger nfldb] python2 33.py
Josh Freeman TB 15
Josh Freeman TB 33
Josh Freeman TB 21
Josh Freeman TB 15
Notice that the team is output as pp.team instead of from pp.player. This means that team here is a property of the statistic itself rather than meta data about the player. For example, if you changed pp.team to pp.player.team, then it would say UNK instead because it is accessing current knowledge about the player.
Now, finally, we can get to your particular example. It is difficult because you are aggregating results over a season. A player doesn't necessarily have the same team over a season, so when you aggregate statistics, the pp.team field gets dropped. So let's try it:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year='2012', season_type='Regular')
q.sort('passing_yds').limit(10)
for pp in q.as_aggregate():
    print pp.player.full_name, pp.team, pp.passing_yds
And now the output:
[andrew@Liger nfldb] python2 33.py
Drew Brees None 5177
Matthew Stafford None 4965
Tony Romo None 4903
Tom Brady None 4799
Matt Ryan None 4719
Peyton Manning None 4667
Andrew Luck None 4374
Aaron Rodgers None 4303
Josh Freeman None 4065
Carson Palmer None 4018
That's not very nice, so therefore, the example takes an easier path: it just shows you the team that the player currently belongs to.
If you'd like to tumble down the rabbit hole and fix this for real, then you need to find all teams that a player played for. There are lots of ways to do this, but basically, you'd want to find all the individual plays and accrue all unique teams in those plays for that player. For example, last year, Trent Richardson played on a couple teams. We could discover this by looking at the team field on all of his statistics:
import nfldb
db = nfldb.connect()
q = nfldb.Query(db)
q.game(season_year='2013', season_type='Regular')
q.player(full_name='Trent Richardson')
teams = set()
for pp in q.as_play_players():
    teams.add(pp.team)
print ', '.join(teams)
And that outputs:
[andrew@Liger nfldb] python2 33.py
IND, CLE
Just as you'd expect.
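The same idea extends to keeping the team while still totalling yardage: sum the per-play numbers yourself, keyed by player and team. The sketch below is only an illustration of that grouping and does not talk to a database. PlayStat is a stand-in I made up for the objects that as_play_players() yields (with real objects you would read pp.player.full_name, pp.team and pp.passing_yds), and the rows are the four Week 1 plays printed earlier in the thread.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for one row from q.as_play_players().
PlayStat = namedtuple("PlayStat", ["full_name", "team", "passing_yds"])


def yards_by_player_and_team(stats):
    """Total passing yards per (player name, team) pair."""
    totals = defaultdict(int)
    for stat in stats:
        totals[(stat.full_name, stat.team)] += stat.passing_yds
    return dict(totals)


# The four Week 1 rows shown earlier in the thread.
rows = [
    PlayStat("Josh Freeman", "TB", 15),
    PlayStat("Josh Freeman", "TB", 33),
    PlayStat("Josh Freeman", "TB", 21),
    PlayStat("Josh Freeman", "TB", 15),
]

totals = yards_by_player_and_team(rows)
for (name, team), yards in sorted(totals.items()):
    print("%s %s %d" % (name, team, yards))
```

A player who changed teams mid-season would simply show up as two keys, one per team, each with its own total.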
Thanks again. I figured out the other issue I was having. Because I installed nflgame first, I didn't use pip at the time. I did it on a different computer and after some minor issues (I'm finding issues with trying to install when using the 64-bit version of PostgreSQL), I was able to get this up and running.
For now I'm mostly playing with this data in the PostgreSQL DB GUI, and because I work in visual analytics I'm trying to create fun and interesting visualizations out of the information. I may end up putting some of this stuff up on Tableau (and Tableau Public), but I'd like to confirm how/if you want to be credited.
For now, I added fields to confirm direction of the play, and whether it was in shotgun. I'm also working to see if I can figure out a way to see how players do against starters in preseason (to see if there is any carryover), but that's difficult to do, because there's no real way to tell when starters exit preseason games.
Yes, please do put things up on Tableau and share them with us when you do. :-) (Opening a new issue or adding it to the wiki is perfectly acceptable.)
As far as credit goes... Having a shout out to the project (not me) is always appreciated. It helps increase awareness and hopefully attracts more folks. But of course, nfldb and associated projects are in the public domain like SQLite, so you could in theory copy the code, rebrand it, claim you wrote it and sell it, and it'd all be nice and legal. :-)
First, thank you for putting this together. I am excited about it.
However, as stated I am a newb and I've found myself lost.
I made it through the install instructions for Windows; however, each time I try to run either top-ten-qbs.py or nfldb-update I get an error that says: "ImportError: No module named pytz"
Not sure what I have done wrong or missed. Could you help me troubleshoot?
Thank you! -John