anfederico / clairvoyant

Software designed to identify and monitor social/historical cues for short term stock movement
MIT License
2.42k stars, 772 forks

Bug in predict_proba? #15

Closed: uclatommy closed 7 years ago

uclatommy commented 7 years ago

Anthony,

I suspect there may be an upstream bug in SVC.predict_proba. I believe it outputs the class probabilities in the reverse of the order stated in the scikit-learn documentation. I discovered this when setting my training period equal to my testing period. In that case I would expect highly accurate buy and sell recommendations. However,

  1. Backtesting was producing the opposite recommendations.

  2. So I tried updating the Predict function to use model.predict_proba([Xs])[0], consistent with my suspicions.

  3. Now I'm getting correct recommendations.
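For anyone checking the same thing: in scikit-learn, the columns of predict_proba follow the label order in model.classes_, so one way to rule out an ordering mix-up is to look probabilities up by label rather than by position. A minimal sketch of that lookup follows; the helper name proba_by_label and the -1/1 labels are my own assumptions for illustration, not something taken from clairvoyant.

```python
def proba_by_label(classes, proba_row):
    """Map each class label to its predicted probability.

    `classes` is the label order reported by the model (in scikit-learn,
    `model.classes_`); `proba_row` is one row of predict_proba output.
    """
    if len(classes) != len(proba_row):
        raise ValueError("classes and proba_row must have the same length")
    return dict(zip(classes, proba_row))


# Hypothetical usage with a fitted scikit-learn SVC(probability=True):
#   probs = proba_by_label(model.classes_, model.predict_proba([Xs])[0])
#   p_up = probs[1]   # assumes the "up" class is labelled 1

# Stand-in values so the sketch runs without a trained model.
example = proba_by_label([-1, 1], [0.2, 0.8])
print(example[1])
```

Indexing by label keeps the Predict function correct regardless of how the labels happen to sort.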

Here's my data.

Note: I do not have a lot of data, so I'm breaking up the modeling frequency into 30-minute intervals.

uclatommy commented 7 years ago

Looks like SVC might be buggy when the user doesn't know what they're doing with their data. In my case, I think the number of significant digits in my training features causes the model to try to account for too many possible classes. I was able to get the proper behavior when I cut down the 'influence' score to 2 or 3 digit numbers and limited the sentiment to 2 decimal places.
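A rough sketch of that kind of preprocessing, under one possible reading of it: the influence score is rounded to 3 significant digits and the sentiment to 2 decimal places. The function names round_sig and coarsen are made up for illustration, and the sample values are not from my data.

```python
import math


def round_sig(value, digits=3):
    """Round `value` to `digits` significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 5 for 123456.0
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - magnitude)


def coarsen(influence, sentiment):
    """Reduce feature precision before training (assumed interpretation)."""
    return round_sig(influence, 3), round(sentiment, 2)


# Illustrative values only.
print(coarsen(123456.0, 0.4567))
```

The same function would need to be applied to both the training and the testing features so the model sees consistently scaled inputs.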

I consider this resolved, so I'll close.