matinaghaei / Portfolio-Management-ActorCriticRL

Portfolio management using Actor-Critic Deep Reinforcement Learning algorithms including A2C, DDPG, and PPO
MIT License
37 stars, 10 forks

DDPG - Why make the agent learn during the testing phase? #3

Closed khbu54efr5v14 closed 6 months ago

khbu54efr5v14 commented 1 year ago

Hello,

During the testing phase of the DDPG algorithm, you have the agent learn from the test set.

Many thanks for taking the time to respond.

matinaghaei commented 1 year ago

Hi,

The purpose of the testing phase is to simulate a real scenario and see how the algorithm performs. Now, if you have a trading algorithm that is extensively trained on the market data of, for example, the past 10 years, you still let it train on the new data that is revealed on a daily basis, right? That’s why the algorithm still continues to train in the testing phase, but it’s only for one pass, unlike the training phase where it trains for multiple passes.

The purpose of the validation phase is only to check whether the algorithm is overfitting the training data, not to measure how it would actually perform on the validation data in a real scenario. So I thought it was not necessary to continue training on the validation data itself, although I could do that as well.
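
To illustrate the three phases described above, here is a minimal sketch. It is not the repository's actual code: `CountingAgent` and `run_phase` are hypothetical names, the agent only counts calls instead of running DDPG updates, and the price lists and the pass count of 5 are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class CountingAgent:
    """Hypothetical stand-in for a DDPG agent; it only counts calls."""
    actions: int = 0
    updates: int = 0

    def act(self, observation):
        self.actions += 1
        return 0.0  # placeholder action (no position taken)

    def learn(self, observation, action, reward):
        self.updates += 1  # a real agent would take a gradient step here


def run_phase(agent, prices, passes, learn):
    """Step through `prices` `passes` times, optionally updating the agent."""
    for _ in range(passes):
        for t in range(len(prices) - 1):
            action = agent.act(prices[t])
            reward = action * (prices[t + 1] - prices[t])
            if learn:
                agent.learn(prices[t], action, reward)


# Made-up price series for each split (assumption, for illustration only).
train = [1.0, 1.1, 1.2, 1.3]
validation = [1.3, 1.2, 1.4]
test = [1.4, 1.5, 1.3]

agent = CountingAgent()

# Training: several passes over the same data, learning on every step.
run_phase(agent, train, passes=5, learn=True)
updates_after_training = agent.updates

# Validation: a single pass with learning switched off (overfitting check).
run_phase(agent, validation, passes=1, learn=False)
updates_after_validation = agent.updates

# Testing: a single pass, but the agent keeps learning from each new step,
# as it would on data revealed day by day.
run_phase(agent, test, passes=1, learn=True)
updates_after_testing = agent.updates
```

In this sketch the update counter grows during training and testing but stays unchanged during validation, which mirrors the distinction drawn above.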

Let me know if it makes sense to you!

khbu54efr5v14 commented 1 year ago

Thank you for your explanation, this was very clear!