seemethere / nba_py

Python client for NBA statistics located at stats.nba.com
BSD 3-Clause "New" or "Revised" License

Need Help with basic scraping #108

Open WillPanarese opened 6 years ago

WillPanarese commented 6 years ago

I am trying to build an NBA predictions model using Python for an independent study. I am new to coding but understand Python pretty well. Before I try building a model myself, I want to run an already published model to understand how it works, but I am having trouble scraping data with the various NBA APIs. I was wondering if you could help me update a published model so that it runs with data from the current season. If you know of any other GitHub threads or resources (e.g. tutorials) that you think would be helpful, let me know. If not, a quick run-through of the basics of scraping NBA data using this API would be greatly appreciated.

inoble commented 6 years ago

Having experimented with the code, and after reading this issue, I get the feeling nba_py doesn't work properly any more because the NBA changed their API settings. I tried getting in touch with @seemethere via Reddit but haven't heard back.

rneu31 commented 6 years ago

It works properly, I use it every day! Where can I help? I don't know if many of these issues get responses, but I try my best to help where I can (I'm just a random stranger).

inoble commented 6 years ago

@rneu31 - I ran the following code and it gave this output. The .json data looks really incomplete, but I'm probably doing something wrong (I'm relatively new to Python and trying to get involved in open source projects to help me learn). Combined with that other issue, it made me think nba_py was no longer active.

How do you make use of the code? I might come back to it now and take a look.

inoble commented 6 years ago

oops here's the code I wrote whilst testing

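from nba_py import _api_scrape
from nba_py import player

# Request the player list and pull out the first result set
first_test = player.PlayerList()
playerlist = _api_scrape(first_test.json, 0)

# Save the scraped player list to a text file
with open(r'text.txt', 'w') as f:
    f.write(str(playerlist))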

rneu31 commented 6 years ago

This package works with and without pandas installed. When pandas is installed, it uses pandas dataframes to store the information. Dataframes don't print all the information by default which is why what you saved to a file looks so weird.

For starters, I would uninstall the pandas package on your machine (pip uninstall pandas).
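If you would rather keep pandas installed, an alternative is to ask the DataFrame for its full text instead of relying on the default printout. This is a minimal sketch, assuming pandas is available; the DataFrame below is made-up stand-in data, not real nba_py output.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame nba_py returns when pandas is installed
df = pd.DataFrame({
    "PERSON_ID": range(100),
    "NAME": [f"Player {i}" for i in range(100)],
})

# With default display options, str() abbreviates long frames (middle rows are elided)
truncated = str(df)

# to_string() renders every row and column, regardless of display options
full = df.to_string()

print(len(truncated.splitlines()), "lines in the abbreviated text")
print(len(full.splitlines()), "lines in the full text")
```

Writing `full` to a file instead of `str(df)` should then keep every row.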

To print the list of players and their associated IDs and other information:

from nba_py.player import PlayerList
players = PlayerList()
for player in players.info():
    print(player)

To look up the game log for the first player:

from nba_py.player import PlayerList, PlayerGameLogs

# Get list of players
players = PlayerList()

# Get the first player from that list
first_player = players.info()[0]

# See what kind of information is accessible
print(first_player)

# Determine the player's ID in NBA's database
player_id = first_player['PERSON_ID']

# Get the game logs for that player
game_logs = PlayerGameLogs(player_id)

# Print all the games
for game in game_logs.info():
    print(game)

inoble commented 6 years ago

thanks @rneu31

Even after uninstalling pandas and using your code above I get incomplete data. I saw your map of Wisconsin and searched for Jabari Parker and Giannis Antetokounmpo but they're not in the output.

However the second bit of code you posted definitely does return Alexis Ajinca, the first player on the list.

Apologies for derailing this issue with my own issues! I was hoping to contribute to nba_py not burden it! :)
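Rather than scanning the printed output by eye, one way to check whether a player is in the list is to filter the rows by name. This is only a sketch: it assumes pandas is not installed (so each row is a dict) and that the rows carry a DISPLAY_FIRST_LAST key. The find_players helper, the sample rows, and their IDs are hypothetical placeholders, not part of nba_py.

```python
def find_players(rows, name_fragment, name_key="DISPLAY_FIRST_LAST"):
    """Return the rows whose name contains name_fragment, ignoring case."""
    needle = name_fragment.lower()
    return [row for row in rows if needle in str(row.get(name_key, "")).lower()]


# Hypothetical sample rows standing in for PlayerList().info(); IDs are placeholders
sample_rows = [
    {"PERSON_ID": 1, "DISPLAY_FIRST_LAST": "Alexis Ajinca"},
    {"PERSON_ID": 2, "DISPLAY_FIRST_LAST": "Giannis Antetokounmpo"},
    {"PERSON_ID": 3, "DISPLAY_FIRST_LAST": "Jabari Parker"},
]

matches = find_players(sample_rows, "parker")
print(matches)
```

With the real list, the call would presumably look like find_players(PlayerList().info(), "parker"), and an empty result would confirm that the player is missing from the response.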

WillPanarese commented 6 years ago

@rneu31 thanks so much for your help. You say you use nba_py every day; do you have a model of your own? This may be way too general a question, but do you have any tutorials or resources that you used to help you figure it out? As someone who is new to coding, I'm having trouble even running someone else's code, so any help or direction would be incredible.

inoble commented 6 years ago

hey @WillPanarese, do as @rneu31 is saying, but here are some more detailed instructions from the very beginning:

With any version of Python 3 installed, go to the repo homepage, click 'Clone or Download' and then 'Download Zip'. Unzip the folder anywhere on your hard drive (your Downloads folder is fine). Inside are three folders: docs, nba_py and tests. Open up nba_py and you've got all the code in there.

Create a new empty document in that nba_py folder, alongside all the other files, and give it the filename whatever.py. It doesn't matter what it's called, just that it's a .py Python file.

Open the folder in a text editor or IDE and copy/paste the following code inside (basically what @rneu31 gave us above but corrected):

# PlayerList and PlayerGameLogs are classes defined in player.py
from player import PlayerList, PlayerGameLogs

# Get list of players
players = PlayerList()

# Get the first player from that list
first_player = players.info()[0]

# See what kind of information is accessible
print(first_player)

# Determine the player's ID in NBA's database.
# You can replace everything after the equals sign with a player's ID in
# quotation marks, e.g. player_id = '2544' for LeBron. I know it's LeBron because
# his profile on nba.com (https://stats.nba.com/player/2544/) uses the ID 2544.
player_id = first_player['PERSON_ID']

# Get the game logs for that player.
# PlayerGameLogs only needs the player ID to work, but if you search through the
# other .py files you'll notice that different classes require different
# information, e.g. Season = '2017-18'.
game_logs = PlayerGameLogs(player_id)

# Print all the games.
# If you open player.py in the nba_py folder and find the PlayerGameLogs class,
# you'll see a def info() method at the end of the class. It refers to headers
# within the JSON data that's pulled down from nba.com.
for game in game_logs.info():
    print(game)

Then save the file, open a command prompt, navigate to the nba_py folder where you saved the file and run it with Python (i.e. type the command python whatever.py, or py whatever.py, or however your version of Python and operating system let you run Python files; there are lots of different ways).

I know those were some rough and tumble instructions, so let me know if you need more help!
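To give a feel for what .info() hands back when pandas is not installed: the stats.nba.com responses appear to carry each table as a list of headers plus a list of rows, which can be zipped into one dict per row. The sketch below runs on a made-up miniature payload, and rows_from_result_set is a hypothetical helper written for illustration, not nba_py's own function.

```python
def rows_from_result_set(payload, index=0):
    """Turn one result set (headers + rows) into a list of dicts, one per row."""
    result_set = payload["resultSets"][index]
    headers = result_set["headers"]
    return [dict(zip(headers, row)) for row in result_set["rowSet"]]


# Made-up miniature payload in the assumed shape of a stats.nba.com response
sample_payload = {
    "resultSets": [
        {
            "name": "CommonAllPlayers",
            "headers": ["PERSON_ID", "DISPLAY_FIRST_LAST"],
            "rowSet": [[2544, "LeBron James"]],
        }
    ]
}

rows = rows_from_result_set(sample_payload)
print(rows[0]["PERSON_ID"])
```

This is why first_player['PERSON_ID'] works in the script above: each row is addressed by its header name.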

inoble commented 6 years ago

Oh also @WillPanarese

Within each .py file, inside of each class, is a line that looks like

endpoint = biglongword

If you copy that biglongword (e.g. commonallplayers inside the PlayerList class), open a web browser like Chrome, and navigate to stats.nba.com/stats/commonallplayers, it will tell you which variables are required and need to be added to the URL.

The variables have to be preceded by a question mark (?) and joined by ampersands (&), for example:

http://stats.nba.com/stats/commonallplayers?LeagueID=00&Season=2017-18&IsOnlyCurrentSeason=0
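The same URL can be put together in Python with the standard library, which takes care of the ? and & for you. This is a small sketch; build_stats_url and BASE_URL are names made up for this example, and it only builds the string without sending a request.

```python
from urllib.parse import urlencode

# Base address taken from the URL above
BASE_URL = "http://stats.nba.com/stats/"


def build_stats_url(endpoint, **params):
    """Join an endpoint name and its query variables into a full URL."""
    query = urlencode(params)
    if not query:
        return f"{BASE_URL}{endpoint}"
    return f"{BASE_URL}{endpoint}?{query}"


url = build_stats_url(
    "commonallplayers",
    LeagueID="00",
    Season="2017-18",
    IsOnlyCurrentSeason=0,
)
print(url)
```

Swapping in another endpoint name and its variables should give the matching URL to paste into the browser.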

hope that all helps