Open AhmMontasser opened 6 years ago
That's probably overfitting: those values are calculated on the test set.
@rpfeynman How is it overfitting? The model does well on the test set, then gets worse and worse on the same test set. Is that overfitting?
A good blog post about "overfitting" in RL: https://medium.com/mlreview/making-sense-of-the-bias-variance-trade-off-in-deep-reinforcement-learning-79cf1e83d565
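If the decline is overfitting in the sense discussed above, one common mitigation is to evaluate periodically on a held-out validation split (not the test set) and keep the checkpoint from the best evaluation. The following is a minimal sketch of that idea, not this project's training loop: `early_stop` and its `patience` parameter are hypothetical names, and the example values are the test-set portfolio values from the log in this issue, used only for illustration.

```python
from typing import Iterable, Tuple


def early_stop(evals: Iterable[Tuple[int, float]],
               patience: int = 3) -> Tuple[int, float, int]:
    """Scan (step, value) evaluations in order.

    Returns (best_step, best_value, stop_step), where stop_step is the
    step at which `patience` consecutive evaluations failed to improve
    on the best value (or the last step seen if that never happens).
    """
    best_step, best_value = -1, float("-inf")
    stop_step = -1
    bad_evals = 0
    for step, value in evals:
        stop_step = step
        if value > best_value:
            # New best: remember it (a real loop would save a checkpoint here).
            best_step, best_value = step, value
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                break
    return best_step, best_value, stop_step


# Portfolio values copied from the log in this issue (steps 0..8000).
# Ideally these would come from a validation split, not the test set.
history = [
    (0, 4.095204), (1000, 4.525525), (2000, 4.716191),
    (3000, 5.136598), (4000, 6.200308), (5000, 5.680704),
    (6000, 5.238808), (7000, 4.790911), (8000, 4.475946),
]

print(early_stop(history, patience=3))  # (4000, 6.200308, 7000)
```

With a patience of three evaluations, this would stop at step 7000 and restore the step 4000 checkpoint. Selecting the step on the test set itself would make the reported test performance optimistic, which is why a separate validation split is assumed here.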
Hello,
While training, I sometimes see the portfolio value increase up to a point and then gradually decrease until the end of the training steps. Here is an example:
step 0
the portfolio value on test set is 4.095204 log_mean is 0.0005078592 loss_value is -0.000508 log mean without commission fee is 0.000529
==============================
average time for data accessing is 0.0012977843284606933 average time for training is 0.009977315187454223
step 1000
the portfolio value on test set is 4.525525 log_mean is 0.0005438528 loss_value is -0.000544 log mean without commission fee is 0.000585
==============================
average time for data accessing is 0.00144400954246521 average time for training is 0.011899509906768798
step 2000
the portfolio value on test set is 4.716191 log_mean is 0.000558717 loss_value is -0.000559 log mean without commission fee is 0.000623
==============================
average time for data accessing is 0.0014981653690338136 average time for training is 0.012066781520843506
step 3000
the portfolio value on test set is 5.136598 log_mean is 0.00058947725 loss_value is -0.000589 log mean without commission fee is 0.000768
==============================
average time for data accessing is 0.0013505065441131593 average time for training is 0.010587116718292237
step 4000
the portfolio value on test set is 6.200308 log_mean is 0.00065727555 loss_value is -0.000657 log mean without commission fee is 0.001154
==============================
average time for data accessing is 0.0014070169925689698 average time for training is 0.010920660257339478
step 5000
the portfolio value on test set is 5.680704 log_mean is 0.00062574673 loss_value is -0.000626 log mean without commission fee is 0.001350
==============================
average time for data accessing is 0.0013507211208343506 average time for training is 0.010532096147537232
step 6000
the portfolio value on test set is 5.238808 log_mean is 0.0005965769 loss_value is -0.000596 log mean without commission fee is 0.001481
==============================
average time for data accessing is 0.0013532636165618896 average time for training is 0.010635793209075928
step 7000
the portfolio value on test set is 4.790911 log_mean is 0.00056438043 loss_value is -0.000564 log mean without commission fee is 0.001594
==============================
average time for data accessing is 0.001356858253479004 average time for training is 0.010408684253692627
step 8000
the portfolio value on test set is 4.475946 log_mean is 0.000539884 loss_value is -0.000540 log mean without commission fee is 0.001697
Does anybody know what could be the reason for this?
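One thing the log itself allows checking is how the two reported log means move relative to each other. The sketch below parses log text in the format shown above and computes, for each step, the difference between "log mean without commission fee" and "log_mean". The helper name `parse_log`, the regular expression, and the `commission_gap` field are assumptions made for illustration and are not part of the project; only three of the logged steps are embedded to keep the example short.

```python
import re

# Three entries copied from the log in this issue.
LOG = """\
step 0
the portfolio value on test set is 4.095204 log_mean is 0.0005078592 loss_value is -0.000508 log mean without commission fee is 0.000529
==============================
step 4000
the portfolio value on test set is 6.200308 log_mean is 0.00065727555 loss_value is -0.000657 log mean without commission fee is 0.001154
==============================
step 8000
the portfolio value on test set is 4.475946 log_mean is 0.000539884 loss_value is -0.000540 log mean without commission fee is 0.001697
"""

# Hypothetical pattern matching the log layout shown in this issue.
PATTERN = re.compile(
    r"step (\d+)\s+"
    r"the portfolio value on test set is ([-\d.e]+)\s+"
    r"log_mean is ([-\d.e]+)\s+"
    r"loss_value is ([-\d.e]+)\s+"
    r"log mean without commission fee is ([-\d.e]+)"
)


def parse_log(text):
    """Return one dict per logged step, including the commission gap."""
    rows = []
    for match in PATTERN.finditer(text):
        step = int(match.group(1))
        value, log_mean, _loss, log_mean_free = (
            float(g) for g in match.groups()[1:]
        )
        rows.append({
            "step": step,
            "portfolio_value": value,
            "log_mean": log_mean,
            "log_mean_free": log_mean_free,
            # Part of the log mean that is lost to commission fees.
            "commission_gap": log_mean_free - log_mean,
        })
    return rows


for row in parse_log(LOG):
    print(row["step"], row["portfolio_value"], f"{row['commission_gap']:.6f}")
```

On the numbers posted here, the log mean without commission keeps rising through step 8000 while the log mean with commission peaks at step 4000, so the gap between them widens throughout training. That may be worth looking at alongside the overfitting explanation, although the log alone does not establish the cause.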