roclark / sportsipy

A free sports API written for python
MIT License

Individual Player Stats in Box Score? #17

Closed j-andrews7 closed 5 years ago

j-andrews7 commented 5 years ago

Is your feature request related to a problem? Please describe.

I'm sure you've considered this, but being able to get individual player stats for a given game would be wonderful. Particularly for the NCAA basketball module, though I'm sure all of them could use it.

Describe the solution you'd like

I'd guess that it'd be easiest to do this within the BoxScore class, but I suppose you could also do it by passing in the boxscore URI to a Player object similar to how you do it by season now. Just by glancing at the code for it, it seems like the former would be easier than the latter, though perhaps not as elegant.

Describe alternatives you've considered

I've looked just about everywhere for free NCAAB player data, and I'm pretty sure the only real solution right now is to scrape, which, while not necessarily difficult, is annoying given how often most stats sites change. This is by far the most complete API with any sort of wrapper that I've seen, so I really appreciate what you've done already. Even if you are a Carsen Edwards fan ;)

It'd be great to hear if this is in your plans at all.

roclark commented 5 years ago

Hey @j-andrews7, thanks for the suggestion, though you better not be hatin on Carsen! :laughing:

This is actually something that I’ve had in the back of my mind ever since the beginning and have wanted to implement it at some point. There’s a lot of valuable information that can be gained from individual player stats within each game (like the dominance of NPOY candidate Carsen Edwards). I’ve been waiting to include this until I finished both the Boxscore and Player classes, but now that both are complete, I think it would be wise to flag this as a higher priority.

I think I agree with your first suggestion: it would be good to include it in the Boxscore class at the very least, but perhaps each player will have a sub-class instance within the Boxscore. I will do some more thinking on this, but I really like the idea and will include it with a future release. I’ve been meaning to get out version 0.3.0 with some other boxscore updates, but I will see if I can include this with that release, or prioritize it for the next one.

Thanks again for the suggestion and for the kind words! Always happy to see people finding the program useful and love getting suggestions to improve it! Have a good one!

thegarnetandgold commented 5 years ago

I just want to second the request - I was trying to easily get individual stats by game to do a comparison.

No FSU players finished the game today against UVA in double digits for points. I was trying to figure out whether that had happened at any other time this season. It looks like the best place to find it is through the game logs (https://www.sports-reference.com/cbb/players/terance-mann-1/gamelog/2019/) or the box scores (https://www.sports-reference.com/cbb/boxscores/2019-01-01-14-florida-state.html) - I don't know which would be easier to scrape, because the game logs would have to be done for each player, but OTOH, the box scores have to be done by game.

BTW - the answer is that this was the first time this season. I used the Player Game finder and exported to Excel and pivoted it out.
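The same check can be sketched in pandas once per-game player points are in a table. This is only an illustration: the column names (`date`, `player`, `pts`) and the rows are made-up assumptions, not the actual export format or real box scores, and it uses a `groupby` rather than a spreadsheet pivot.

```python
import pandas as pd

# Hypothetical game-log rows (made-up numbers, not real box scores).
# The column names are assumptions, not the export's actual headers.
logs = pd.DataFrame(
    {
        "date": ["2019-01-01", "2019-01-01", "2019-01-05", "2019-01-05"],
        "player": ["Player A", "Player B", "Player A", "Player B"],
        "pts": [12, 8, 9, 7],
    }
)

# Highest individual point total in each game.
top_score = logs.groupby("date")["pts"].max()

# Games in which no player reached double digits.
no_double_digits = top_score[top_score < 10].index.tolist()
print(no_double_digits)
```

With real data, each row would be one player's line from one game, and the resulting list would hold the dates of games where the team's top scorer stayed under 10 points.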

roclark commented 5 years ago

Apologies for the delayed response here. I think I have a good idea on how to implement this now and will wrap it into my next release. It might take some time for me to fully develop the solution, but I will get it out as soon as I can! I'm planning on creating a new class in the sportsreference.ncaab.boxscore module similar to the Player class in the roster module, but only targeting single-game stats. Perhaps in the future I will create a major refactor to combine classes and make them more modular, but I feel this is the best solution for the time being.

I will let you know when I include this with the code. Thanks again both for the suggestion!

roclark commented 5 years ago

Hello again! Just wanted to share that I've been working on this a bit more lately and finally have some code that I'm satisfied with. You can see my work in the add-player-stats-to-boxscore branch if you want an early preview. As mentioned above, I created another class in the boxscore module, but I went a step further and created an abstract class in a different module which contains all of the duplicate properties between the two classes. That way, I figured I would reduce redundant code as much as possible and make it easier to make future changes in a single place instead of having multiple copies that can get out of sync. Another priority of mine was to keep the method of calling the Player class in the roster module the same between the existing implementation and the new method to ensure any existing code that utilizes that class doesn't break.
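A minimal sketch of the shared-base-class idea described above, for readers unfamiliar with the pattern. The class names (`BasePlayerStats`, `SeasonPlayerStats`, `GamePlayerStats`) and the dictionary-backed storage are hypothetical illustrations, not the names or internals used in the branch.

```python
from abc import ABC, abstractmethod


class BasePlayerStats(ABC):
    """Hypothetical base class holding properties both player views share."""

    def __init__(self, name, stats):
        self._name = name
        self._stats = stats  # dict of raw counting stats (assumed layout)

    @property
    def name(self):
        return self._name

    @property
    def points(self):
        return self._stats.get("points")

    @property
    def field_goal_percentage(self):
        attempts = self._stats.get("field_goal_attempts")
        if not attempts:
            return None  # avoid dividing by zero when no shots were taken
        return self._stats.get("field_goals", 0) / attempts

    @abstractmethod
    def scope(self):
        """Describe which span of games the stats cover."""


class SeasonPlayerStats(BasePlayerStats):
    def scope(self):
        return "season"


class GamePlayerStats(BasePlayerStats):
    def scope(self):
        return "single game"


game = GamePlayerStats(
    "Example Player",
    {"points": 21, "field_goals": 7, "field_goal_attempts": 14},
)
print(game.name, game.points, game.field_goal_percentage, game.scope())
```

Because the shared properties live only in the base class, a fix to something like `field_goal_percentage` applies to both subclasses at once, which is the out-of-sync problem the comment describes.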

I won't merge this with master until I have finished doing something similar for all 5 of the other sports and add documentation, but the NCAAB code is working and I am not planning on changing it from its current state unless needed or requested. Feel free to check it out and play around with it, and let me know if you experience any issues.

Lastly, here are some code samples that you can use to give you an idea:

from sportsreference.ncaab.roster import Player

p = Player('carsen-edwards-1')  # Sorry, not sorry @j-andrews7!
print(p.dataframe)
print(p.field_goal_percentage)

from sportsreference.ncaab.boxscore import Boxscore

def print_stats(players):
    for player in players:
        print(player.name)
        print(player.points)
        print(player.dataframe)

b = Boxscore('2017-11-24-21-purdue')
home_team = b.home_players  # Retrieve a list of players from the home team
print_stats(home_team)
away_team = b.away_players  # Retrieve a list of players from the away team
print_stats(away_team)

j-andrews7 commented 5 years ago

Louisville plays tonight, so I will do either a pre or post game stats breakdown using it. Very excited. Almost as excited as I am to watch Carsen Edwards shoot 30% in Purdue's next big game. 😬

This is excellent work, thank you. I will let you know how it goes.

j-andrews7 commented 5 years ago

So I used this while writing this blog post over the last few days. Overall, it works great. Only other thing I might want would be more of the advanced stats from each game, like eFG%, TS%, OffRtg, when getting individual players from the box score. I expect it would be tedious to add, as I assume you'd have to add more to the scraper, but it'd be pretty useful.

Thanks again.

roclark commented 5 years ago

@j-andrews7 thanks for reporting back! So I'm curious, is there any particular game that you are having troubles with? I am able to get most of the advanced stats (more on that later) from the new code. Here's an example:

from sportsreference.ncaab.boxscore import Boxscore

game = Boxscore('2019-01-23-19-ohio-state')
away_team = game.away_players
for player in away_team:
    print(player.name)
    print('eFG:', player.effective_field_goal_percentage)
    print('TS%:', player.true_shooting_percentage)
    print('3PAr:', player.three_point_attempt_rate)

Does this work on your end?

As mentioned above, I am pulling most of the advanced stats. Currently, I'm missing both the offensive and defensive ratings. Since these weren't a part of the Player class, I didn't originally add them in, but I should do that now for completeness' sake (I'm having to do this for the other sports, so I will go back and add it here). Perhaps the fact that the offensive/defensive ratings weren't included threw things off?
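For reference, the three stats printed in the example above are conventionally derived from ordinary box-score counting stats. The sketch below shows the commonly cited formulas; the function names are hypothetical helpers, not part of the package, and the free-throw weight in true shooting varies by source (0.44 is a common choice), so it is left as a parameter rather than assumed.

```python
def effective_field_goal_percentage(field_goals, three_pointers, attempts):
    """eFG% = (FG + 0.5 * 3P) / FGA; None when no shots were attempted."""
    if attempts == 0:
        return None
    return (field_goals + 0.5 * three_pointers) / attempts


def true_shooting_percentage(points, fg_attempts, ft_attempts, ft_weight=0.44):
    """TS% = PTS / (2 * (FGA + w * FTA)); w is an assumed, source-dependent weight."""
    denominator = 2 * (fg_attempts + ft_weight * ft_attempts)
    if denominator == 0:
        return None
    return points / denominator


def three_point_attempt_rate(three_point_attempts, fg_attempts):
    """3PAr = 3PA / FGA; None when no shots were attempted."""
    if fg_attempts == 0:
        return None
    return three_point_attempts / fg_attempts


# Made-up stat line for illustration only.
print(effective_field_goal_percentage(5, 2, 10))
print(true_shooting_percentage(20, 10, 5))
print(three_point_attempt_rate(4, 10))
```

Offensive and defensive ratings are not sketched here because they depend on possession estimates and team-level totals rather than a single player's line.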

BTW, that's a fantastic article! Though I personally don't follow Louisville (Purdue grad myself and otherwise house-divided with KU and Michigan State grads), I always love seeing these articles and learning more about players on teams that I don't pay as close attention to. And, of course, I love the fact that this package helped make that article possible! That's the whole reason I started this project, so I'm glad to know that this is being utilized. Thanks for the shout out at the end BTW. 😃

Let me know if things aren't working for you and I will include the ratings in the meantime. I hope to make more progress on the other sports over the next few days and get a new release out.

Thanks as always!

j-andrews7 commented 5 years ago

Yeah, that works fine. Sorry, I should have been more careful and specific; it is just the ratings that are missing, I think.

And thanks. I'm a little peeved I wasn't able to get the plots to embed, but such is life. I work with big data a lot, so naturally that's where my mind goes with sports. I've been looking for a decent API for ages, and this finally fits the bill. I was too lazy to write a halfway decent scraper myself, so I'm glad you took this up! I know the house divided feeling; I've got a sibling that's a UK grad and the hatred is real. Purdue has had some great wins in conference play, so here's hoping they keep it going.

Looking forward to the new release. I'm sure I'll think of something else to bug you with soon enough. Cheers.

roclark commented 5 years ago

Finally finished this work and created PR #51 to get this merged with master. I apologize for the delay in completing this, but I'm happy that it's coming together. Assuming the test suite passes, I will merge this and finally close this issue, but please let me know if you find there is anything else missing.

Thanks again for including this idea and the feedback along the way! This is a very useful feature to include, and if it helps you write more blog articles, that's fantastic! Keep the great ideas coming. 😃