ZhengyaoJiang / PGPortfolio

PGPortfolio: Policy Gradient Portfolio, the source code of "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" (https://arxiv.org/pdf/1706.10059.pdf).
GNU General Public License v3.0

portfolio value increases then decreases after some steps #69

Open · AhmMontasser opened this issue 6 years ago

AhmMontasser commented 6 years ago

Hello,

While training, I sometimes face a case where the portfolio value increases up to a point and then gradually decreases until the end of the training steps. Here is an example:

```
step 0
the portfolio value on test set is 4.095204 log_mean is 0.0005078592 loss_value is -0.000508 log mean without commission fee is 0.000529
==============================
average time for data accessing is 0.0012977843284606933 average time for training is 0.009977315187454223
step 1000
the portfolio value on test set is 4.525525 log_mean is 0.0005438528 loss_value is -0.000544 log mean without commission fee is 0.000585
==============================
average time for data accessing is 0.00144400954246521 average time for training is 0.011899509906768798
step 2000
the portfolio value on test set is 4.716191 log_mean is 0.000558717 loss_value is -0.000559 log mean without commission fee is 0.000623
==============================
average time for data accessing is 0.0014981653690338136 average time for training is 0.012066781520843506
step 3000
the portfolio value on test set is 5.136598 log_mean is 0.00058947725 loss_value is -0.000589 log mean without commission fee is 0.000768
==============================
average time for data accessing is 0.0013505065441131593 average time for training is 0.010587116718292237
step 4000
the portfolio value on test set is 6.200308 log_mean is 0.00065727555 loss_value is -0.000657 log mean without commission fee is 0.001154
==============================
average time for data accessing is 0.0014070169925689698 average time for training is 0.010920660257339478
step 5000
the portfolio value on test set is 5.680704 log_mean is 0.00062574673 loss_value is -0.000626 log mean without commission fee is 0.001350
==============================
average time for data accessing is 0.0013507211208343506 average time for training is 0.010532096147537232
step 6000
the portfolio value on test set is 5.238808 log_mean is 0.0005965769 loss_value is -0.000596 log mean without commission fee is 0.001481
==============================
average time for data accessing is 0.0013532636165618896 average time for training is 0.010635793209075928
step 7000
the portfolio value on test set is 4.790911 log_mean is 0.00056438043 loss_value is -0.000564 log mean without commission fee is 0.001594
==============================
average time for data accessing is 0.001356858253479004 average time for training is 0.010408684253692627
step 8000
the portfolio value on test set is 4.475946 log_mean is 0.000539884 loss_value is -0.000540 log mean without commission fee is 0.001697
```

Does anybody know what the possible reason for this could be?
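
A side note on reading these logs: the test-set portfolio value and log_mean are two views of the same quantity, since the portfolio value is the product of the per-period returns. If pv = exp(log_mean × N) for N test periods, then ln(pv) / log_mean should recover the same N at every step. A quick check against the numbers above (N is inferred from the logs, not stated in them):

```python
import math

# (step, portfolio value, log_mean) triples copied from the training log above
steps = [
    (0,    4.095204, 0.0005078592),
    (1000, 4.525525, 0.0005438528),
    (4000, 6.200308, 0.00065727555),
    (8000, 4.475946, 0.000539884),
]

# If pv = exp(log_mean * N), then ln(pv) / log_mean recovers N.
for step, pv, log_mean in steps:
    n = math.log(pv) / log_mean
    print(f"step {step}: implied test periods = {n:.0f}")  # ~2776 every time
```

So the rise to step 4000 and the decline afterwards show up identically in both columns. Notice also that the log mean without commission fee keeps rising monotonically while the with-commission log_mean falls after step 4000; the widening gap suggests the policy pays more and more commission (trades more) as training continues.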

astanziola commented 6 years ago

That's probably overfitting; those values are calculated on the test set.
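
If overfitting in that sense is the cause (held-out performance peaks and then degrades as training continues), the usual mitigation is to checkpoint on the held-out metric and keep the best model rather than the last one. A minimal sketch of that idea; the three callables are hypothetical stand-ins, not PGPortfolio's actual API:

```python
def train_with_best_checkpoint(train_step, evaluate_test_pv, save_checkpoint,
                               total_steps=80000, eval_every=1000, patience=3):
    """Train, but keep the checkpoint with the best held-out portfolio
    value, stopping early once it stops improving.

    train_step, evaluate_test_pv, and save_checkpoint are hypothetical
    callables standing in for the project's own training, evaluation,
    and saving routines.
    """
    best_pv, evals_since_best = float("-inf"), 0
    for step in range(total_steps):
        train_step()  # one gradient update on a training batch
        if step % eval_every == 0:
            pv = evaluate_test_pv()  # e.g. 6.200308 at the peak above
            if pv > best_pv:
                best_pv, evals_since_best = pv, 0
                save_checkpoint()  # overwrite the best-so-far weights
            else:
                evals_since_best += 1
                if evals_since_best >= patience:
                    break  # held-out performance stopped improving
    return best_pv
```

Strictly speaking, selecting the stopping point on the test set leaks information into model selection; a separate validation split between the training and test periods would be the cleaner setup.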

AhmMontasser commented 6 years ago

@rpfeynman How is it overfitting? It does well on the test set and then gets worse and worse on the same test set; is this overfitting?

istvanmo commented 6 years ago

A good blog post about "overfitting" in RL: https://medium.com/mlreview/making-sense-of-the-bias-variance-trade-off-in-deep-reinforcement-learning-79cf1e83d565