Open alfrentgen opened 3 years ago
https://github.com/pythonlessons/Reinforcement_Learning/blob/b5eedc73b946614c7a21634de9734dba961b6c91/LunarLander-v2_PPO/LunarLander-v2_PPO.py#L321
I've noticed that a 'break' statement is missing at the end of the 'done' branch. Without it, the inner loop seems to run indefinitely and the replay experience keeps growing. I found this while re-implementing this tutorial in PyTorch.
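To illustrate the loop shape I mean, here is a minimal sketch. It does not reproduce the tutorial's code: `ToyEnv`, `run_episodes`, and the `buffer` list are hypothetical stand-ins for the real environment, training loop, and replay storage, and the toy episode simply ends after a fixed number of steps.

```python
class ToyEnv:
    """Hypothetical stand-in environment: an episode ends after a fixed number of steps."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy initial state

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done  # next_state, reward, done


def run_episodes(env, num_episodes):
    """Run episodes and return how many transitions each one collected."""
    episode_sizes = []
    for _ in range(num_episodes):
        state = env.reset()
        buffer = []  # stand-in for the replay experience
        while True:
            action = 0  # dummy policy
            next_state, reward, done = env.step(action)
            buffer.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                episode_sizes.append(len(buffer))
                buffer.clear()  # stand-in for the replay/training step
                break  # leave the inner loop so the next episode can start
    return episode_sizes
```

In this sketch, removing the `break` means the `while True` loop never exits: the environment keeps being stepped after `done` and the outer loop never reaches the next episode.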