Hoping for a nudge in the right direction

BrandoPolistirolo / Tennis-Betting-ML

Machine learning model (specifically logistic regression with stochastic gradient descent) for tennis match prediction. Achieves 66% accuracy on approx. 125,000 matches.

MIT License

29 stars, 8 forks

Update on progress.

In my previous post, I was running the entire "KaggleTennis.....py" file within Jupyter by just doing 'run Kaggle_Tennis... blah'.

This works, but because of the out-of-bounds error above, the script would fail and exit. So this time around, I set up each "In[]" block as its own cell in a Jupyter notebook. The loop still failed at the same point (I still can't figure out why), but I could then run the remaining "In[]" cells and get them to work. This let me create all of the tournament CSVs, the 'final_kaggle_dataset.csv' files, etc.
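For anyone hitting the same loop failure, one way to narrow it down is to record which iteration raises the error instead of letting the whole run stop. This is only a generic sketch, not the repo's actual loop: `run_with_diagnostics`, `items`, and `process` are hypothetical names standing in for whatever the failing cell iterates over and does per step.

```python
def run_with_diagnostics(items, process):
    """Apply process to each item and collect IndexError failures.

    Hypothetical helper: `items` and `process` stand in for the real
    loop's data and per-iteration work, which are not shown here.
    """
    results = []
    failures = []
    for i, item in enumerate(items):
        try:
            results.append(process(item))
        except IndexError as exc:
            # Keep the position and the offending item for later inspection.
            failures.append((i, item, repr(exc)))
    return results, failures


# Toy example: the second row is too short, so indexing position 2 fails.
rows = [[1, 2, 3], [1], [4, 5, 6]]
results, failures = run_with_diagnostics(rows, lambda row: row[2])
```

Inspecting `failures` afterwards shows the index and contents of each item that triggered the error, which is usually enough to see what is different about it.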

However, now I'm trying to figure out which file to run next. I think it's either Players_Name_Fix.py or Players_Data_PreProc.py. In the PreProc file, the first few lines read the file 'atp_players.csv', which is not in the dataset. I'm wondering if this should be changed to the matching file 'all_players.csv', which IS in the dataset. Will try this and report back.
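A small sketch of the file swap described above: rather than hard-coding one name, try the expected file first and fall back to the one that is actually present. `first_existing` is a hypothetical helper and the directory argument is an assumption; whether 'all_players.csv' has the same columns as 'atp_players.csv' is not confirmed here and would need checking.

```python
from pathlib import Path


def first_existing(data_dir, candidates):
    """Return the path of the first candidate file found in data_dir.

    Hypothetical helper, not part of the repo.
    """
    base = Path(data_dir)
    for name in candidates:
        path = base / name
        if path.is_file():
            return path
    raise FileNotFoundError(f"none of {list(candidates)} found in {base}")
```

It could be called as `first_existing(".", ["atp_players.csv", "all_players.csv"])`, with the resulting path passed to whatever reads the players file.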

Hoping for a nudge in the right direction #6