In the training plot, which is generated every xxx episodes during optimization and training, the 4 different loss types have very different scales. To be informative, this plot should be split in 2, each with its own scale on the y-axis.
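A rough sketch of what the split could look like, assuming the losses are available as a dict of name to list of values and that matplotlib is used for plotting. The function names, the loss names in the usage note, and the `ratio` threshold are hypothetical and not taken from the existing code. Series are grouped by magnitude, and a second subplot is only created when the scales actually differ, so agents whose losses are on a similar scale keep a single plot.

```python
import math


def split_by_scale(losses, ratio=10.0):
    """Group loss names into (small, large) by their peak magnitude.

    Returns all names in the first group and an empty second group when
    the largest and smallest magnitudes differ by less than `ratio`.
    """
    eps = 1e-12  # avoids division by zero for all-zero or empty series
    mags = {
        name: max(max((abs(v) for v in values), default=0.0), eps)
        for name, values in losses.items()
    }
    lo, hi = min(mags.values()), max(mags.values())
    if hi / lo < ratio:
        return list(losses), []
    cutoff = math.sqrt(lo * hi)  # geometric midpoint between the extremes
    small = [name for name in losses if mags[name] < cutoff]
    large = [name for name in losses if mags[name] >= cutoff]
    return small, large


def plot_losses(losses, path):
    """Save one subplot per scale group, each with its own y-axis."""
    # Imported here so the grouping helper works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    groups = [g for g in split_by_scale(losses) if g]
    fig, axes = plt.subplots(len(groups), 1, sharex=True, squeeze=False)
    for ax, group in zip(axes[:, 0], groups):
        for name in group:
            ax.plot(losses[name], label=name)
        ax.set_ylabel("loss")
        ax.legend()
    axes[-1, 0].set_xlabel("update step")
    fig.savefig(path)
    plt.close(fig)
```

For example, `plot_losses({"policy_loss": [0.02, 0.01], "value_loss": [120.0, 95.0]}, "losses.png")` would produce two stacked subplots, while two series of similar magnitude would end up in one.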
Can we narrow down which agents are affected by this problem? TD3 is not, because the actor and critic losses have a similar scale.
I think PPO has this issue. Any others?