sofienkaabar / deep-learning-for-finance

The Official Repository of Deep Learning for Finance
104 stars 49 forks

Model bias is calculated wrong? #2

Open cg31 opened 2 months ago

cg31 commented 2 months ago

Model bias is calculated as: model_bias(y_predicted)

I think it should be: model_bias(y_predicted - x_test)

sofienkaabar commented 2 months ago

Hi there! Thank you for your feedback. My wording may be at fault: the term "model bias" refers only to the predicted data, in the sense that the predictions are either heavily bullish or heavily bearish. For example, a model that makes 10 predictions (7 bullish and 3 bearish) has a model bias of 7/3 ≈ 2.33, which means bullish predictions outnumber bearish predictions by that factor; ergo, the model is biased towards giving bullish signals. Hope this helps! Sofien
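A minimal sketch of the ratio described above, in plain Python. The helper name `bias_ratio`, the sample values, and the handling of the zero-bearish case are assumptions made for illustration; this is not the repository's `model_bias` implementation.

```python
def bias_ratio(predictions):
    """Ratio of bullish (> 0) to bearish (< 0) predictions.

    Hypothetical helper for illustration only.
    """
    bullish = sum(1 for p in predictions if p > 0)
    bearish = sum(1 for p in predictions if p < 0)
    if bearish == 0:
        # Assumed convention: no bearish predictions means an unbounded ratio.
        return float("inf") if bullish > 0 else float("nan")
    return bullish / bearish


# Ten made-up predicted changes: 7 positive (bullish), 3 negative (bearish).
predicted_changes = [0.5, 1.2, -0.3, 0.8, 0.1, -1.1, 0.4, 2.0, -0.6, 0.9]
ratio = bias_ratio(predicted_changes)
print(round(ratio, 2))  # 2.33
```

A ratio above 1 indicates more bullish than bearish predictions, and a ratio below 1 indicates the opposite.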

cg31 commented 2 months ago

I understand that. The problem is that y_predicted are absolute values.

When the real value is 100, if the model predicts 101, that is bullish; if it predicts 99, it is bearish, right?

So we should use the differences, 1 or -1, to check bias, not 101 or 99.

sofienkaabar commented 2 months ago

Absolutely. You may have noticed that all (or at least the vast majority) of the models use the diff() function to difference the time series into returns (hence changes). That way, the predictions are stationary (meaning they resemble the following: +2%, -0.1%, -7%, +5%), and the model bias should work. Therefore, predicting with these models must NOT be done on a price series (like 101, 100, 99, 100), but rather on its differences (-1, -1, 1), as explained in Chapter 3. So, the y_predicted values are not absolute values (unless you saw something unclear in my code/book). Let me know if you have seen otherwise in the book or if it's unclear. Sofien
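To make the point concrete, here is a small sketch that differences the example price series and counts signs on both the raw prices and the changes. The helpers `difference` and `count_signs` are hypothetical names for this illustration. Unlike pandas' diff(), which keeps the original length by placing NaN in the first position, this version simply drops the first element.

```python
def difference(series):
    """First differences series[t] - series[t-1]; one element shorter than the input."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def count_signs(values):
    """Return (number of values > 0, number of values < 0)."""
    bullish = sum(1 for v in values if v > 0)
    bearish = sum(1 for v in values if v < 0)
    return bullish, bearish


prices = [101, 100, 99, 100]
changes = difference(prices)

print(changes)               # [-1, -1, 1]
print(count_signs(prices))   # (4, 0): every price level is positive
print(count_signs(changes))  # (1, 2): signs now reflect direction
```

Counting signs on price levels labels every observation as bullish, which is why a sign-based bias measure is only meaningful on differenced data.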

cg31 commented 2 months ago

Oh, I see. That makes sense.