question about train and development mode

kengz / SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

https://slm-lab.gitbook.io/slm-lab/

MIT License

1.25k stars, 264 forks

question about train and development mode #376

Closed. qazwsx74269 closed this issue 5 years ago

qazwsx74269 commented 5 years ago

What on earth is the difference between train mode and dev mode? According to your API docs, dev mode is just train mode with shorter episodes. But from what I have learned, dev mode doesn't involve updating the model parameters. I am quite confused about this and hope someone can help me.

kengz commented 5 years ago

Hi, dev mode is simply train mode plus rendering of the environment and a check that the network parameters are updated. Because these two operations are expensive, it also runs slower, and it is meant only for development.
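To make the distinction concrete, here is a minimal sketch, not SLM-Lab's actual code. It assumes a toy training loop in which dev mode does the same parameter updates as train mode and only adds two extra steps: a render call and a check that the parameters changed. The names `run`, `train_step`, and the `mode` argument are hypothetical and exist only for illustration.

```python
import copy
import random


def train_step(params, lr=0.1):
    """Toy update (hypothetical): move each parameter by a nonzero random amount."""
    return [p - lr * random.uniform(0.5, 1.0) for p in params]


def run(mode, num_steps=3, seed=0):
    """Toy training loop; 'dev' adds rendering and an update check on top of 'train'."""
    assert mode in ("train", "dev")
    random.seed(seed)
    params = [0.0, 1.0, -1.0]
    log = []
    for step in range(num_steps):
        if mode == "dev":
            # Stand-in for rendering the environment (assumption: a real env would draw a frame).
            log.append(f"render step {step}")
        before = copy.copy(params)
        params = train_step(params)  # parameters are updated in BOTH modes
        if mode == "dev":
            # Extra check, only in dev mode: every parameter should have changed.
            assert all(b != a for b, a in zip(before, params)), "parameters did not update"
            log.append(f"update check passed at step {step}")
    return params, log


train_params, train_log = run("train")
dev_params, dev_log = run("dev")
print(train_params == dev_params)  # same learning in both modes
print(len(train_log), len(dev_log))
```

In this sketch both modes end with identical parameters; dev mode only does extra work per step, which is why it is slower.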

qazwsx74269 commented 5 years ago

So doesn't RL involve a validation procedure? Does it find the best policy during training, after which we just run that policy on the test dataset to evaluate its generalization ability? Is that right?

kengz commented 5 years ago

RL validation/eval is different from supervised learning. It is done online as the policy/network/agent iterates, so evaluation is run at checkpoints at regular intervals. See https://github.com/kengz/SLM-Lab/blob/master/slm_lab/experiment/control.py#L69-L82 Also note that in RL the training data is also the test data: the agent is evaluated in the same environment it trains in, with no separate held-out dataset. How to measure generalization is still an area of research.
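The idea of evaluating online at regular checkpoints can be sketched as follows. This is an illustrative toy, not the code at the link above: the environment, the hill-climbing "training" step, and the names `env_return`, `train_with_eval`, `max_frame`, and `eval_frequency` are all assumptions made for this example.

```python
import random


def env_return(action):
    """Toy environment (hypothetical): return is highest (0.0) when action == 1.0."""
    return -abs(1.0 - action)


def train_with_eval(max_frame=100, eval_frequency=20, seed=0):
    """Train a one-parameter policy and evaluate it every `eval_frequency` frames."""
    rng = random.Random(seed)
    policy = 0.0  # the single policy parameter
    eval_log = []
    for frame in range(1, max_frame + 1):
        # Training step: try a noisy action and keep it if the return improves.
        candidate = policy + rng.gauss(0, 0.1)
        if env_return(candidate) > env_return(policy):
            policy = candidate
        # Checkpoint: evaluate the current policy in the SAME environment,
        # without exploration noise and without updating the policy.
        if frame % eval_frequency == 0:
            eval_log.append((frame, env_return(policy)))
    return eval_log


for frame, score in train_with_eval():
    print(frame, round(score, 3))
```

Unlike a supervised train/validation/test split, there is no separate dataset here: the evaluation scores come from the same environment used for training, sampled while training is still in progress.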