`ddqn.experience`'s `action` data type problem in `22_deep_reinforcement_learning`/`04_q_learning_for_trading.ipynb`

stefan-jansen / machine-learning-for-trading

Code for Machine Learning for Algorithmic Trading, 2nd edition.


Describe the bug

When training runs for hundreds of episodes, `np.ndarray` objects start to appear in the action field of the entries in `ddqn.experience`, where I expected plain integers. At the very beginning this happens about once per episode. Later it happens about once every 20 steps, and eventually on every step.
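For illustration only, here is one way NumPy code can produce such an object. This is a guess at the mechanism, not a diagnosis of the notebook: it assumes the greedy branch of the policy selects an action with an `np.argmax(..., axis=1).squeeze()` pattern, and the Q-values below are made up.

```python
import numpy as np

# Hypothetical Q-values for one state and three actions (not from the notebook).
q = np.array([[0.1, 0.7, 0.2]])

# Assumed greedy-branch pattern: argmax over the action axis, then squeeze.
# The result is a zero-dimensional np.ndarray, not a Python int.
greedy_action = np.argmax(q, axis=1).squeeze()
print(type(greedy_action).__name__, greedy_action.ndim)  # ndarray 0

# Converting to a built-in int gives every stored action the same type.
normalized_action = int(greedy_action)
print(type(normalized_action).__name__, normalized_action)  # int 1
```

If the policy really does work this way, the growing frequency would be consistent with exploration decaying over time, so that the greedy branch is taken more and more often. I have not confirmed this against the notebook code.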

To Reproduce

Run the training for about 300 episodes.

Execute the code below, which prints the index of every entry in `ddqn.experience` whose action is not an integer.

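# The action is the second field (index 1) of each stored experience
all_action_list = [ddqn.experience[i][1] for i in range(len(ddqn.experience))]
for i in range(len(all_action_list)):
    # Print the index of every action that is not a built-in int
    if type(all_action_list[i]) != int:
        print(i)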
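A tally of the action types may be easier to read than a long list of indices. The sketch below is self-contained and runs without the notebook: the `experience` list and its five-field tuple layout are hypothetical stand-ins, and only the position of the action at index 1 is taken from the check above. The helper name `action_type_counts` is also made up for this example.

```python
from collections import Counter

import numpy as np

# Hypothetical replay buffer; only "action at index 1" mirrors the check above.
experience = [
    (None, 1, 0.0, None, True),
    (None, np.array(2), 0.0, None, True),
    (None, np.int64(0), 0.0, None, True),
    (None, np.array(1), 0.0, None, False),
]


def action_type_counts(buffer):
    """Count how often each type occurs among the stored actions."""
    return Counter(type(transition[1]).__name__ for transition in buffer)


counts = action_type_counts(experience)
print(counts)
```

With the real buffer, `action_type_counts(ddqn.experience)` should show at a glance how many actions are stored as `int` and how many as `ndarray`.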

Question

Is this intended? If so, what purpose does it serve?

`ddqn.experience`'s `action` data type problem in `22_deep_reinforcement_learning`/`04_q_learning_for_trading.ipynb` #271