Hi @chingchou888,
These are both good questions.
The agent's reward is defined at lines 115-117 (and line 107) in train.py. The reward is essentially the net unrealized profit (meaning the stocks are still in the portfolio and have not been cashed out yet) evaluated at each action step the agent takes. In other words, we want to train the agent to make profits. I'll add an explanation of the agent's learning mechanism to README.md when I have time.
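To make that concrete, here is a minimal sketch of a net-unrealized-profit reward, assuming the open positions are tracked as a list of (shares, buy price) pairs; the function name `unrealized_profit_reward` and all variable names are hypothetical, not the actual identifiers in train.py:

```python
# Minimal sketch of a net-unrealized-profit reward. All names here are
# hypothetical and do NOT match the actual identifiers in train.py.
def unrealized_profit_reward(holdings, current_price):
    """Paper profit on shares still held (not yet cashed out).

    holdings: list of (num_shares, buy_price) tuples for open positions.
    current_price: the stock price at the current action step.
    """
    return sum(shares * (current_price - buy_price)
               for shares, buy_price in holdings)

# Example: 10 shares bought at $100 and 5 at $110; the price is now $120.
holdings = [(10, 100.0), (5, 110.0)]
print(unrealized_profit_reward(holdings, 120.0))  # 10*20 + 5*10 = 250.0
```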
If you feel the learning rate of the DDPG model is too small, you can adjust the learning-rate hyperparameters (lines 144 and 145 in DDPG.py). Note that with larger learning rates the DDPG model may converge faster, or may not converge at all. For hyperparameter tuning there are a lot of good tools you can use (e.g. Ray Tune: https://ray.readthedocs.io/en/latest/tune.html).
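For reference, DDPG typically keeps separate learning rates for the actor and the critic, with the critic learning faster. The snippet below is a sketch assuming a TF2/Keras implementation; the names `actor_lr` and `critic_lr` and their values are illustrative, not the actual contents of lines 144-145 in DDPG.py:

```python
from tensorflow.keras.optimizers import Adam

# Illustrative values only; the real hyperparameters live at
# lines 144-145 of DDPG.py. The actor is usually given a smaller
# learning rate than the critic for training stability.
actor_lr = 1e-4    # try e.g. 1e-3 for faster (possibly unstable) learning
critic_lr = 1e-3

actor_optimizer = Adam(learning_rate=actor_lr)
critic_optimizer = Adam(learning_rate=critic_lr)
```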
Let me know if you have other questions, and please phrase them as concretely as possible so that others can refer to this thread as well. Thanks!
Hi Albert,
I am trying to train a DDPG model based on the code in your GitHub repository, but I always get a large gap between the reward and the real price. Could you help me understand why the learning process of DDPG is so slow and why the reward balance is so small?
Thanks in advance for your reply!
Best regards, CCC