Open cg31 opened 2 months ago
Hi there! Thank you for your feedback. My wording may have been unclear: the term "model bias" refers only to the predicted data, in the sense of whether the predictions are heavily bullish or heavily bearish. For example, a model with 10 predictions (7 bullish and 3 bearish) has a model bias of 7/3 ≈ 2.33, which means the bullish predictions outnumber the bearish ones by that factor, so the model is biased towards giving bullish signals. Hope this helps! Sofien
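To make the 7-to-3 example concrete, here is a minimal sketch of that ratio. It is not the book's code: the name `model_bias_sketch`, the handling of a zero bearish count, and the sample values are assumptions made for illustration, and the inputs are assumed to be predicted changes (returns), where a positive value is bullish and a negative value is bearish.

```python
import numpy as np

def model_bias_sketch(y_predicted):
    """Ratio of bullish (positive) to bearish (negative) predictions.

    Hypothetical helper for illustration; assumes y_predicted holds
    predicted changes, not price levels.
    """
    y_predicted = np.asarray(y_predicted, dtype=float)
    bullish = int(np.sum(y_predicted > 0))
    bearish = int(np.sum(y_predicted < 0))
    if bearish == 0:
        # Assumption: report infinity (or NaN if there are no signals at all)
        # instead of dividing by zero.
        return float("inf") if bullish > 0 else float("nan")
    return bullish / bearish

# Made-up predicted returns: 7 positive values and 3 negative values.
predictions = [0.02, -0.001, 0.01, 0.005, -0.07, 0.05, 0.003, 0.004, -0.02, 0.01]
print(round(model_bias_sketch(predictions), 2))  # 2.33
```

A value above 1 suggests a tilt towards bullish signals, a value below 1 a tilt towards bearish ones. Feeding price levels such as 101 or 99 into this sketch would count every prediction as bullish, which is why the inputs need to be changes.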
I understand that. The problem is that the y_predicted values are absolute values.
When the real value is 100, if the model predicts 101, that is bullish; if it predicts 99, it is bearish, right?
So we should use the differences (1 or -1) to check the bias, not 101 or 99.
Absolutely. You may have noticed that all (or at least the vast majority) of the models use the diff() function to difference the time series into returns (that is, changes). That way, the predictions are stationary, meaning they look like this: +2%, -0.1%, -7%, +5%. With such values, the model bias works as intended. Therefore, predicting with these models must NOT be done on a price series (like 101, 100, 99, 100), but rather on its differences (-1, -1, 1), as explained in Chapter 3. So, the y_predicted values are not absolute values (unless you saw something unclear in my code/book). Let me know if you have seen otherwise in the book or if anything is unclear. Sofien
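As a rough illustration of why the differencing step matters for the bias count, the sketch below uses NumPy's `np.diff` on the four prices mentioned above. Using `np.diff` is an assumption here (the book's code may difference the series another way), and the variable names are made up. Note that differencing returns one value fewer than the number of prices.

```python
import numpy as np

# Price levels from the example above.
prices = np.array([101.0, 100.0, 99.0, 100.0])

# First differences: each value minus the previous one.
changes = np.diff(prices)
print(changes)  # [-1. -1.  1.]

# Counting signs on raw prices is meaningless: every price is positive.
bullish_on_prices = int(np.sum(prices > 0))
bearish_on_prices = int(np.sum(prices < 0))

# Counting signs on the differences separates up moves from down moves.
bullish_on_changes = int(np.sum(changes > 0))
bearish_on_changes = int(np.sum(changes < 0))

print(bullish_on_prices, bearish_on_prices)    # 4 0
print(bullish_on_changes, bearish_on_changes)  # 1 2
```

On the differenced values the bullish-to-bearish ratio is 1/2 = 0.5, whereas on the raw prices the bearish count is zero and the ratio is undefined.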
Oh, I see. That makes sense.
Model bias is calculated as: model_bias(y_predicted)
I think it should be: model_bias(y_predicted - x_test)