Hi, the only difference between training and evaluation is how actions are chosen: during training we sample actions from the stochastic policy, while during evaluation we act on the mean of that policy.
Link to relevant lines in the code: https://github.com/avisingh599/reward-learning-rl/blob/8070d93e9379204f153e9044e03079bd9a354183/softlearning/algorithms/rl_algorithm.py#L282
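To illustrate the distinction, here is a minimal sketch assuming a diagonal Gaussian policy. It is not the repository's code: `select_action`, `mean`, `log_std`, and `deterministic` are hypothetical names chosen for this example, and the real policy may transform its outputs further.

```python
import math
import random


def select_action(mean, log_std, deterministic, rng=None):
    """Pick an action from a diagonal Gaussian policy (hypothetical helper).

    mean, log_std: per-dimension policy outputs, given as sequences of floats.
    deterministic: True for evaluation (use the mean), False for training (sample).
    rng: optional random.Random instance, useful for reproducible sampling.
    """
    if deterministic:
        # Evaluation: act on the policy mean, no exploration noise.
        return list(mean)
    if rng is None:
        rng = random.Random()
    # Training: draw each dimension from N(mean, exp(log_std)^2).
    return [rng.gauss(m, math.exp(s)) for m, s in zip(mean, log_std)]


# Example: the same policy outputs, used in the two modes.
policy_mean, policy_log_std = [0.5, -0.2], [0.0, 0.0]
train_action = select_action(policy_mean, policy_log_std, deterministic=False)
eval_action = select_action(policy_mean, policy_log_std, deterministic=True)
```

In this sketch the environment, observations, and policy parameters are the same in both modes; only the action-selection rule changes.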
Thank you!
Hi. This is a question on the difference between training and evaluation.
In typical RL experiments, there is a training phase and an evaluation phase. As I understand it, the two usually run under slightly different conditions so that evaluation tests how well the learned behavior generalizes.
I could not find such a difference in the paper or the code. Are there any differences in conditions between training and evaluation for the Visual Pusher, Visual Door Opening, and Visual Picker tasks? If so, could you describe them and point me to the relevant part of the code?
Regards,
FudiFudi