kyleskom / NBA-Machine-Learning-Sports-Betting

NBA sports betting using machine learning

Question on Accuracy #334

Closed gstuart13 closed 10 months ago

gstuart13 commented 10 months ago

Hi Kyle,

Thanks again for your update on the code itself. Last night I ran a test and placed a bet on every favorite the model put out. It was a tough night: 7 wins and 13 losses. I just want to confirm that the 69% ML rate and 55% O/U rate are lifetime figures and not day-over-day? I'm not new to betting, but I'm certainly newer to using Python-based machine learning algorithms for it (outside of the stuff the standard sites put out, of course).

Not upset at all, just want to set proper expectations.

cpjolicoeur commented 10 months ago

You won't hit a 69% win rate each day on the ML bets, if that is what you are asking. 69% is the accuracy of the trained model, for the methodology used, over the historical dataset it was trained on.
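To put a rough number on night-to-night swings, here is a minimal sketch that computes how likely a night of at most 7 wins in 20 bets would be if every pick really did win 69% of the time. It assumes the bets are independent and share one win probability, which a real slate will not satisfy exactly; `prob_at_most` is an illustrative helper, not part of the project.

```python
from math import comb


def prob_at_most(wins: int, bets: int, p: float) -> float:
    """Return P(X <= wins) for X ~ Binomial(bets, p)."""
    return sum(
        comb(bets, k) * p**k * (1 - p) ** (bets - k)
        for k in range(wins + 1)
    )


if __name__ == "__main__":
    # Assumption: 20 independent bets, each winning with probability 0.69.
    print(f"P(at most 7 wins in 20) = {prob_at_most(7, 20, 0.69):.4f}")
```

Changing `p` to a lower value shows how quickly such a night becomes unremarkable when the true rate for the current season sits below the historical figure.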

sportsfan69 commented 10 months ago

Last night was the first bad night so far for the model this season

gstuart13 commented 10 months ago

OK I thought so. I knew there was no way that was sustainable. I appreciate the quick response!

Out of curiosity, does the team have a similar model for other sports such as the NHL, NFL, etc.? Big fan of what you all are doing. I'm newer to Python and found a great repository of NFL data going back to 1999, but I'm not sure how to turn it into a model as effective as yours!

thprtm commented 10 months ago

Did you download the program or run it via Colab?

gstuart13 commented 10 months ago

Ran it via colab.

thprtm commented 10 months ago

Is Colab more accurate than running a downloaded copy?

gstuart13 commented 10 months ago

In what way? Sorry I don't understand the question. I did try running it in Visual Studio. Are you saying that the model would be more accurate outside of colab?

cpjolicoeur commented 10 months ago

The model accuracy shouldn't change no matter where you run it, assuming of course that you are running the same model against the same inputs.
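One way to check the "same model, same inputs" condition is to fingerprint each environment and compare the results. The sketch below hashes a file and reports installed package versions using only the standard library. The helper names are hypothetical, and which files and packages matter for this project is an assumption the reader would need to fill in.

```python
import hashlib
from importlib import metadata


def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def package_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumption: pandas is the package of interest here; add others as needed.
    print(package_versions(["pandas"]))
```

Running this in Colab and locally, and hashing the model and data files in each place, would show whether the two runs really share the same model, inputs, and library versions.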

thprtm commented 10 months ago

I ran the Colab in the link and the program locally and got different results. Plus, Colab now gives an error:

Traceback (most recent call last):
  File "/content/main.py", line 139, in <module>
    main()
  File "/content/main.py", line 111, in main
    data, todays_games_uo, frame_ml, home_team_odds, away_team_odds = createTodaysGames(games, df, odds)
  File "/content/main.py", line 52, in createTodaysGames
    schedule_df = pd.read_csv('Data/nba-2023-UTC.csv', parse_dates=['Date'], date_format='%d/%m/%Y %H:%M')
  File "/usr/local/lib/python3.10/dist-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
TypeError: read_csv() got an unexpected keyword argument 'date_format'

gstuart13 commented 10 months ago

Oh sorry, everything is working great now. I just noticed that I got killed last night when following the model, and I didn't know whether you had similar models for the NFL and NHL.

zerotwenty commented 10 months ago

As seen in another open issue thread, installing this version of pandas seems to fix the date format error: !pip3 install pandas==2.1.0
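For context, the `date_format` argument to `pandas.read_csv` was added in pandas 2.0, so an older install raises the `TypeError` shown in the traceback. If upgrading is not an option, one workaround is to read the column as text and parse it afterwards with `pd.to_datetime`, which accepts `format` on old and new versions alike. This is a minimal sketch: `read_schedule` is a hypothetical helper rather than the project's code, and the sample rows and column names other than `Date` are made up for illustration.

```python
import io

import pandas as pd


def read_schedule(source, date_col="Date", fmt="%d/%m/%Y %H:%M"):
    """Read a schedule CSV and parse date_col with an explicit format.

    Parsing after the read avoids the date_format keyword, so this
    runs on pandas versions both before and after 2.0.
    """
    df = pd.read_csv(source)
    df[date_col] = pd.to_datetime(df[date_col], format=fmt)
    return df


# Illustrative data only; the real file is Data/nba-2023-UTC.csv.
sample = io.StringIO(
    "Date,Home Team,Away Team\n"
    "24/10/2023 23:30,Home A,Away B\n"
    "25/10/2023 02:00,Home C,Away D\n"
)
schedule = read_schedule(sample)
print(schedule["Date"].dtype)
```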

gstuart13 commented 10 months ago

Not having issues with the date format, but is anyone else noticing how far off this model has been? I'll give the caveat that I literally just started tailing this week, but the results have been nowhere near the lifetime win %. I'm assuming the model learns as the season goes on, but last night the ML predictions went 4 and 8, while the totals went 6 and 6. If you went all in, it was a 10 and 14 night. I was looking at picks with percentages in the high 60s and above and still went backwards.

Admittedly, I haven't done a ton with modeling like this in the past, but something just seems off to start.
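A single night is a small sample, and a confidence interval makes that concrete. The sketch below computes a Wilson score interval for the records quoted above (4 of 12, 6 of 12, 10 of 24). The Wilson interval is one common choice for small samples, not something the project itself reports, and the calculation assumes independent bets with a shared win probability.

```python
from math import sqrt


def wilson_interval(wins: int, bets: int, z: float = 1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    if bets <= 0:
        raise ValueError("bets must be positive")
    p_hat = wins / bets
    denom = 1 + z * z / bets
    center = (p_hat + z * z / (2 * bets)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / bets + z * z / (4 * bets * bets)) / denom
    return center - half, center + half


if __name__ == "__main__":
    for label, wins, bets in [("ML", 4, 12), ("totals", 6, 12), ("combined", 10, 24)]:
        low, high = wilson_interval(wins, bets)
        print(f"{label}: {wins}/{bets} -> {low:.3f} to {high:.3f}")
```

The width of each interval gives a sense of how many tracked bets are needed before an observed rate says much about the model's underlying accuracy.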

cpjolicoeur commented 10 months ago

Several things. First, the overall percentages are just that - overall percentages. You won't hit those numbers in the short term (night by night). Some nights you will be way under, others you might hit 100% accuracy.

Second, the new season has just started. The historical data from the past several seasons is in there and modeled, but each new season brings roster changes and all the other things that need to shake out. Once a few weeks or months of this season have been played, that data can be included in the training set, and the predictions for this season should get better over time.

Lastly, you can't just blindly follow the predictions, because they are just that, predictions. You need to combine them with the knowledge you have that the model doesn't. For example, there are a lot of injured players right now who didn't play last night. There was a huge trade between Philly and LAC that affected things related to those teams, etc. You have to take that existing knowledge into account and then weigh it against what you see in the model's output.

Also, I'd say going 6 out of 12 on the O/U is pretty good. A 50% hit rate is honestly excellent considering I think the current O/U model is only "rated" at about 55%. But if you can hit 50% you are probably going to make money on the odds.
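Whether a given hit rate makes money depends on the price of each bet. As a rough check, assuming flat stakes and a price of -110 on every total (an assumption; actual prices vary by book and game), the break-even rate is 110/210, or about 52.4%, so a 50% hit rate would lose slightly at that price. The helpers below are an illustrative sketch of that arithmetic, not part of the project.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American odds to decimal odds (total return per unit staked)."""
    if odds == 0:
        raise ValueError("American odds cannot be 0")
    return 1 + (odds / 100 if odds > 0 else 100 / -odds)


def break_even_rate(odds: int) -> float:
    """Win rate at which flat bets at these odds have zero expected profit."""
    return 1 / american_to_decimal(odds)


def expected_profit_per_unit(win_rate: float, odds: int) -> float:
    """Expected profit per 1 unit staked at the given win rate and odds."""
    decimal = american_to_decimal(odds)
    return win_rate * (decimal - 1) - (1 - win_rate)


if __name__ == "__main__":
    # Assumption: every bet is priced at -110.
    print(f"break-even at -110: {break_even_rate(-110):.4f}")
    for rate in (0.50, 0.55):
        print(f"win rate {rate:.2f}: {expected_profit_per_unit(rate, -110):+.4f} units per bet")
```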

STRATZ-Ken commented 10 months ago

First, background. I train models daily for my real job. Second, nothing below is meant to insult Kyle or his work. What he is providing for free is a great foundation for others to use. The fact that he deals with the constant repetition of dumb questions is beyond amazing. But that is what this is, a foundation. You cannot and will not make money with this project in its current state. As for people who said they made money, well, a broken clock is correct twice a day.

Let's just take a look at what the inputs are. They are a collection of stats from the NBA, each one a sum over all the players on the team. Has a single player been traded or added to the team since last year? Then the odds are wrong. Is a single player hurt or taking the day off today? Then the odds are wrong. Is the model trained on all the data from the previous year while we are in week 1 of the NBA season, with very little data collected for the new teams? Then the odds are wrong.

I truly hope this response is linked every time someone says they are winning or losing money with this. @kyleskom is amazing for doing this, and I hope he continues to improve his models and foundation. However, please stop with the idea that you can run this blindly, take the picks, and win money. It won't happen.

cpjolicoeur commented 10 months ago

Totally agree and second what @STRATZ-Ken is saying. This project isn't meant to be, nor does it claim to be, a "fool-proof" accurate prediction tool. It's a good project and a nice starting point for people who are 1) ML developers, or 2) NBA bettors who are interested in ML. You can take what is here, use it, learn from it, and modify and tweak it for your own wants and needs. But don't expect to just run the models and make guaranteed profits :) Anyone telling you that is possible with any "system" or model is lying to you. If it were that easy, the bookies would cease to exist :)

elfrost commented 10 months ago

I started using this a week ago, and thank you, devs: it is very well coded and I have enjoyed it. I went through a few files and I'm wondering whether I need to add 23-24 to the code, and whether I need to retrain the model once a week or so. What is good practice, in your experience? Thanks.

cpjolicoeur commented 10 months ago

@elfrost yes, at some point you'd probably want to start pulling in data from the 2023-24 season and retraining the model with it. When you do that, how often, and whether you do it at all are entirely up to you.

gstuart13 commented 10 months ago

Hi Craig,

I truly appreciate the response. I want to be clear that nothing I am trying to say is detrimental to Kyle and the amazing work he has done. As I mentioned, I've never utilized a model like this in sports betting, so I was spending a few days testing the model to see its accuracy as is. The only reason I went down this path is because I've just heard through the grapevine of some folks who use these learning models and have been very successful in doing so.

Ken, I also appreciate your willingness to stand up for Kyle. However, I would fall short of telling people they are asking dumb questions or expecting to blindly make money. Some people, such as myself, are simply asking questions to understand whether others have noticed something similar in how the model is performing. I am well aware of how sports betting works and that some weeks you hit 70% of your bets and then the next week you hit 10%. It's all an ebb and flow.

At the end of the day, we are all here for the same thing, which is to take some money back from the bookies and Vegas.

STRATZ-Ken commented 10 months ago

However, I would fall short of telling people they are asking dumb questions or expecting to blindly make money.

Please check the closed issues. This will 100% justify my comments.

gstuart13 commented 10 months ago

Not everyone is as well versed in GitHub or these models. If people are being rude with their comments, I could see your needing to jump in and defend the work. However, some people are just asking because they simply don't know or are looking for clarification.

Nothing I said in my comments was detrimental to the model; I was simply asking about others' experiences with the accuracy of the model. I think Craig handled the comment well and it didn't really need any follow-up from there.
