Closed: majid5776 closed this issue 1 month ago
The evaluation reward should be proportional to the training reward. If it is going up, that's great: it means your model is learning well. As far as I know, there is no notion of overfitting in RL, since the task you evaluate on is the same task you train on and the one you want to solve.
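As a rough illustration of the "eval reward should track training reward" idea, one could compare the two curves with a simple correlation check (the reward histories below are made-up numbers, not from any real run):

```python
# Hypothetical sketch: check whether the evaluation reward curve
# follows the training reward curve over successive evaluations.

def pearson(xs, ys):
    """Plain Pearson correlation, no external dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Made-up reward logs for illustration only.
train_rewards = [1.0, 2.1, 3.0, 4.2, 5.1]
eval_rewards = [0.8, 1.9, 2.7, 4.0, 4.9]

# A correlation close to 1 suggests evaluation performance is rising
# in step with training performance, the healthy behaviour described above.
r = pearson(train_rewards, eval_rewards)
print(round(r, 3))
```

A low or negative correlation would instead suggest that what the agent learns during training is not carrying over to evaluation.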
Thank you. But after 20 evaluations my agent still doesn't work well on my videos, even though, as I said, the evaluation reward is increasing. Does that mean I should increase the number of iterations until the evaluation reward converges?
This might be due to how you create your reward functions: there might be conflicting objectives. In general, I cannot provide feedback on custom tasks.
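To illustrate how conflicting reward terms can hide a problem even while the total reward rises, here is a minimal sketch with a hypothetical two-term reward (all names and numbers below are invented for illustration):

```python
# Hypothetical custom reward mixing two objectives: forward progress
# and an energy-use penalty. The total can increase even while one
# component gets worse, which is why logging each term separately
# helps diagnose conflicting objectives.

def combined_reward(progress, energy_used, w_progress=1.0, w_energy=0.5):
    """Weighted sum of a progress bonus and an energy penalty (hypothetical)."""
    return w_progress * progress - w_energy * energy_used

# Two rollouts: the total reward improves from 1.5 to 3.5 even though
# energy use tripled, because the progress term dominates the sum.
early = combined_reward(progress=2.0, energy_used=1.0)  # 1.0*2.0 - 0.5*1.0 = 1.5
late = combined_reward(progress=5.0, energy_used=3.0)   # 1.0*5.0 - 0.5*3.0 = 3.5
print(early, late)
```

An agent maximizing this sum can look fine on the aggregated evaluation reward while behaving poorly along the dimension you actually care about, which matches the symptom described above.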
Hi. What does it mean when, in a scenario, eval_mean_reward is increasing but the critic loss is also increasing? Can we call that overfitting?