kochlisGit / ProphitBet-Soccer-Bets-Predictor

ProphitBet is a Machine Learning Soccer Bet prediction application. It analyzes the form of teams, computes match statistics and predicts the outcomes of a match using Advanced Machine Learning (ML) methods. The supported algorithms in this application are Neural Networks, Random Forests & Ensemble Models.
MIT License
281 stars 104 forks

Error in matches predictions #71

Open andony-arrieula opened 4 months ago

andony-arrieula commented 4 months ago

There are 2 errors in the match prediction section:

The match prediction is not usable then.

andony-arrieula commented 4 months ago

It seems the problem is not that the data is in the wrong order, but that the preprocessing done in preprocess_dataset is not performed on the prediction data.

kochlisGit commented 4 months ago

This should happen only during the Cross-Validation process, and it's perfectly normal. The idea is that we randomly select a train set and an evaluation set to measure the performance of the models.

However, during the training period (ONLY), it should use the first 100 matches as test and the rest of the dataset should be processed in the correct order! Otherwise, can you print the results of the preprocess_dataset for the evaluation data to verify this?
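A minimal sketch of the split described above, assuming the dataset is a plain list of match rows kept in its original order. The function name `split_holdout` and its `num_test` parameter are hypothetical and are not taken from the ProphitBet code:

```python
def split_holdout(matches, num_test=100):
    """Hold out the first `num_test` rows as the test set, with no shuffling.

    Hypothetical helper: the remaining rows keep their original order.
    """
    test = matches[:num_test]
    train = matches[num_test:]
    return train, test


# Toy dataset: integers stand in for match rows.
matches = list(range(250))
train, test = split_holdout(matches)
```

Unlike the random selection used for cross-validation, this split leaves the order of the rows untouched.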

andony-arrieula commented 4 months ago

The problem is not on the evaluation data, but on the prediction data passed to the predict match dialog.

For example, I run my trained model on the French Ligue 1, select Paris SG as the home team (the best team in the league) and Clermont as the away team (the worst team in the league), and enter the same odds (3.00) for all three possible results. The algorithm gives me probabilities of 0.32 for Paris, 0.23 for a draw and 0.44 for Clermont, which is totally incoherent. This is because the data is not preprocessed before it is passed to the model.

kochlisGit commented 4 months ago

So the program grabs the features from the history tables, but does not preprocess them before passing them to the model for prediction?

andony-arrieula commented 4 months ago

Exactly!

But the program also does not update the statistics with the data from the previous match!

kochlisGit commented 4 months ago

Thanks. I will take a look into it.

andony-arrieula commented 4 months ago

I am also looking into this issue on my side. I think the best way to proceed is to modify the construct_input() method so that it processes the data before giving it to the model to make the prediction.

kochlisGit commented 4 months ago

Yeah, I think that would be the best way. The rows should be processed before being returned, using the scaler from the model's config (if any).
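As a rough illustration of that idea, here is a sketch of a `construct_input()` that applies the training-time scaler to the prediction row. Everything apart from the name `construct_input` is an assumption: the signature, the feature layout (odds first, then team statistics) and the `SimpleStandardScaler` class are hypothetical stand-ins written in pure Python, not the project's actual code or scaler.

```python
from statistics import mean, pstdev


class SimpleStandardScaler:
    """Hypothetical stand-in for whatever scaler the model's config stores."""

    def fit(self, rows):
        columns = list(zip(*rows))
        self.means = [mean(col) for col in columns]
        # Fall back to 1.0 for constant columns so transform never divides by zero.
        self.stds = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform(self, rows):
        return [
            [(value - m) / s for value, m, s in zip(row, self.means, self.stds)]
            for row in rows
        ]


def construct_input(home_stats, away_stats, odds, scaler=None):
    """Build one feature row and scale it the same way as the training data."""
    row = [*odds, *home_stats, *away_stats]
    if scaler is None:
        # No scaler in the config: return the raw row unchanged.
        return [row]
    return scaler.transform([row])


# Toy training rows: [odd_1, odd_x, odd_2, home_stat, away_stat].
train_rows = [
    [1.5, 4.0, 6.0, 10.0, 2.0],
    [3.0, 3.0, 3.0, 5.0, 5.0],
    [6.0, 4.0, 1.5, 2.0, 10.0],
]
scaler = SimpleStandardScaler().fit(train_rows)

raw = construct_input([10.0], [2.0], [1.5, 4.0, 6.0])
scaled = construct_input([10.0], [2.0], [1.5, 4.0, 6.0], scaler=scaler)
```

The point of the sketch is that the scaler is fitted once on the training data and only reused at prediction time, so the model sees prediction rows on the same scale it was trained on.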