rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

General understanding: Purpose of evaluation in training #82

Closed HeinzBenjamin closed 4 years ago

HeinzBenjamin commented 4 years ago

Hi, and thanks a lot for rlkit, it's really useful. I'm currently using the SAC and TD3 modules. While reading your code I wondered whether you could clarify the precise purpose of the eval loops. I just want to be sure that I understood the code correctly. As I understand it, each epoch in batch_rl_algorithm is split into an evaluation subroutine and an exploration subroutine. Training steps are performed on the replay buffer, which is filled only from exploration steps and not from evaluation steps. So does the eval step serve any function in training, or is it "just" meant for logging and recording the progress of learning? I hope I didn't miss something obvious in your code.

Best, and thanks again, Ben

vitchyr commented 4 years ago

I'm glad to hear that rlkit is useful! You're correct that it's just for logging and recording progress. Feel free to reopen this issue if you have more questions.
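
For readers who want the separation described above spelled out, here is a minimal, self-contained sketch of that kind of epoch loop. It is not rlkit's actual code: the toy environment, `collect_steps`, `run_training`, and all parameter names are hypothetical and chosen only for illustration. The point it tries to show is that evaluation rollouts feed a log and nothing else, while only exploration rollouts enter the replay buffer that training samples from.

```python
import random
from collections import deque


def collect_steps(policy, num_steps, explore):
    """Roll out a toy 1-D environment; return a list of (obs, action, reward)."""
    transitions = []
    obs = 1.0
    for _ in range(num_steps):
        action = policy(obs)
        if explore:
            action += random.gauss(0.0, 0.1)  # exploration noise (hypothetical scale)
        reward = -abs(obs + action)  # toy reward: push the next state toward zero
        transitions.append((obs, action, reward))
        obs = obs + action
    return transitions


def run_training(num_epochs=3, eval_steps=5, expl_steps=20,
                 train_steps=10, batch_size=4):
    replay_buffer = deque(maxlen=1000)
    eval_log = []      # evaluation results end up here and nowhere else
    num_updates = 0    # stands in for gradient steps
    policy = lambda obs: -0.5 * obs  # fixed toy policy; a real one would be learned

    for _ in range(num_epochs):
        # Evaluation: deterministic rollouts, used only for logging.
        eval_transitions = collect_steps(policy, eval_steps, explore=False)
        eval_log.append(sum(r for _, _, r in eval_transitions) / eval_steps)

        # Exploration: noisy rollouts, the only data added to the replay buffer.
        replay_buffer.extend(collect_steps(policy, expl_steps, explore=True))

        # Training: sample batches from the replay buffer only.
        for _ in range(train_steps):
            batch = random.sample(list(replay_buffer), batch_size)
            num_updates += 1  # placeholder for an actual update on `batch`

    return replay_buffer, eval_log, num_updates
```

With the default arguments, the buffer ends up holding only the 3 × 20 exploration transitions, while the 3 × 5 evaluation transitions appear only as one average return per epoch in `eval_log`. Dropping the evaluation block would leave the training data and the number of updates unchanged in this sketch.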