Open Tplhardy opened 5 years ago
My thought was to normalize each column in MM Data each year to avoid having to do a-b with negative numbers.
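Per-year column normalization could be sketched like this. This is a minimal illustration with hypothetical variable names (`X` as a stat matrix, `years` as the season for each row), not the repo's actual data loading code:

```python
import numpy as np

def normalize_by_year(X, years):
    """Z-score each stat column within its own season, so the
    a - b feature differences are comparable across years."""
    Xn = np.empty_like(X, dtype=float)
    for y in np.unique(years):
        m = years == y
        mu = X[m].mean(axis=0)
        sd = X[m].std(axis=0)
        sd[sd == 0] = 1.0  # guard against constant columns
        Xn[m] = (X[m] - mu) / sd
    return Xn

# Toy demo: two seasons whose raw stats live on very different scales.
X = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]])
years = np.array([2019, 2019, 2020, 2020])
Xn = normalize_by_year(X, years)
```

After this transform, every column has mean 0 and unit variance within each season, so the subtraction a - b produces values on a consistent scale regardless of era.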
Hmm, that's a good point; I didn't really check for that. I tried it myself, and a couple of times the results were similar, but sometimes there was quite a bit of difference.
I think it also depends on which ML model you're using, since each one handles high-dimensional data differently. But yeah, I think normalizing would be a good first step to try.
At some level I'm not sure what we can do. We conceptually know these two training examples should be "equivalent" to each other:

Example #1: Input: 17-dimensional game vector X, Label: 1
Example #2: Input: negative of X (i.e., -X), Label: 0
But the problem is that a machine learning model won't necessarily pick up on that. I wonder if there's a way to hard-code that constraint. Not sure right now, but thanks for bringing it up!
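Two common ways to hard-code that constraint, sketched below with hypothetical names (neither is from this repo): (1) augment the training set so every game appears in both orderings with flipped labels, and (2) note that a logistic model with no intercept term satisfies the constraint exactly, since sigmoid(w @ (-x)) = 1 - sigmoid(w @ x):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def augment_with_swaps(X, y):
    """Add the mirrored example (-x, 1 - y) for every game, so the
    model sees both team orderings with consistent labels."""
    return np.vstack([X, -X]), np.concatenate([y, 1 - y])

# Antisymmetry of a zero-intercept logistic model:
# P(A beats B) + P(B beats A) = 1 by construction.
rng = np.random.default_rng(0)
w = rng.normal(size=17)  # hypothetical weights over the 17-dim game vector
x = rng.normal(size=17)  # one game's a - b feature vector
p_ab = sigmoid(w @ x)
p_ba = sigmoid(w @ -x)
```

The augmentation route works with any classifier; the zero-intercept route bakes the symmetry into the model family itself (e.g. fitting logistic regression without a bias term), at the cost of restricting which models you can use.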
Thanks! So I tried absolute values, and it just resulted in overfitting (I was getting 99% accuracy and predictions with 99% probability). I think the way to solve this in the short term (brackets are due tomorrow) is, in case the two predictions conflict, to take the difference of the two:
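One way that short-term fix could look: score the game from both orderings, use the difference between the two views as the "conflict" signal, and average them into a single prediction. This is a sketch with a toy stand-in model, not the project's actual predictor:

```python
import numpy as np

def reconcile(model_prob, x_ab):
    """Score the game from both orderings and reconcile.

    model_prob(x) -> P(first-listed team wins); x_ab is team A's
    stats minus team B's. Returns a symmetrized win probability for
    A and the disagreement between orderings (a confidence signal).
    """
    p_ab = model_prob(x_ab)    # A in the first slot
    p_ba = model_prob(-x_ab)   # B in the first slot
    p_sym = 0.5 * (p_ab + (1.0 - p_ba))
    disagreement = p_ab - (1.0 - p_ba)
    return p_sym, disagreement

# Toy asymmetric "model" that gives conflicting answers by ordering.
toy_model = lambda x: 0.70 if x[0] > 0 else 0.40
p_sym, gap = reconcile(toy_model, np.array([1.0]))
```

Here the two orderings imply 0.70 and 0.60 for team A; `gap` is the 0.10 conflict between them, and `p_sym` splits the difference at 0.65. A large `gap` flags a low-confidence pick worth eyeballing before the bracket is due.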
I'm getting different predictions depending on which team I put in the first and second position. Any fixes?