eleurent / rl-agents

Implementations of Reinforcement Learning and Planning algorithms
MIT License
how to test the training data #40

Closed: Wasedarocket closed this issue 3 years ago

Wasedarocket commented 4 years ago

Hello, I recently started working on decision-making with reinforcement learning, and this project has been excellent for me. I have some questions about running it.

1. I trained an agent with this command: python experiments.py evaluate configs/HighwayEnv/env.json configs/HighwayEnv/agents/DQNAgent/ego_attention.json --train --episodes=4000 --name-from-config. After training completes successfully, where are the deep network parameters saved? In saved_models/latest.tar, or in the ego_attention_20200514-201107_29818 folder (and if so, in which file)?

2. I trained ego_attention.json, dqn.json and ddqn.json with env.json at the same time, each in a different terminal. After that I ran: python experiments.py evaluate configs/HighwayEnv/env.json configs/HighwayEnv/agents/DQNAgent/ego_attention.json --test --episodes=10 --recover-from=out/IntersectionEnv/DQNAgent/saved_models/latest.tar. The results are strange: the ego-vehicle crashes into other cars all the time, and its attention stays on a single car even after passing it. Are the weights of only one network saved after training, or are the weights of all the networks saved, either in different files or in the same file such as latest.tar?

Thank you very much.

eleurent commented 4 years ago

Hi @Wasedarocket, during training, several models are saved in the run directory (out/Env/DQNAgent/run_xxxxxx/yyyyyyy.tar):

In addition, whenever a model is saved (any of the above), a copy is written to out/Env/DQNAgent/saved_models/latest.tar. This is mainly useful for quickly testing the latest model from the latest run without having to specify a model path, by using experiments.py env.json agent.json --test. Of course, using '--recover-from' is preferable if you have many trained models.
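To choose a specific model to pass to '--recover-from', it can help to list the .tar files of a run directory. The following is only a sketch based on the directory layout described above: list_checkpoints and newest_checkpoint are hypothetical helper names, not part of rl-agents, and the path in the usage note is a placeholder.

```python
from pathlib import Path
from typing import List, Optional


def list_checkpoints(run_dir: str) -> List[Path]:
    """Return the .tar files found directly in run_dir, oldest first (by modification time)."""
    return sorted(Path(run_dir).glob("*.tar"), key=lambda p: p.stat().st_mtime)


def newest_checkpoint(run_dir: str) -> Optional[Path]:
    """Return the most recently modified .tar file in run_dir, or None if there is none."""
    checkpoints = list_checkpoints(run_dir)
    return checkpoints[-1] if checkpoints else None
```

For example, newest_checkpoint("out/Env/DQNAgent/run_xxxxxx") would return the path of the last model written in that run, which can then be given to '--recover-from' instead of relying on saved_models/latest.tar.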

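Importantly, the agent.json configuration that you use when testing a model must be the same as the one you used to train it, because only the model parameters are saved, not the architecture. Be careful about that if you train dqn.json in parallel with ego_attention.json, for instance: since a copy of every saved model goes to saved_models/latest.tar, that file may end up holding weights from a different agent than the one you are testing.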
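One way to guard against such a mismatch is to keep a copy of the agent configuration next to each trained model and compare it with the configuration used at test time. The sketch below assumes that workflow: configs_match is a hypothetical helper, not part of rl-agents, and it only compares the JSON content of the two files without inspecting the saved weights.

```python
import json
from pathlib import Path


def load_config(path: str) -> dict:
    """Parse a JSON agent configuration file."""
    return json.loads(Path(path).read_text())


def configs_match(train_config_path: str, test_config_path: str) -> bool:
    """Return True if both JSON files decode to the same content (key order is ignored)."""
    return load_config(train_config_path) == load_config(test_config_path)
```

If configs_match returns False for the configuration kept from training and the one passed with --test, the model was probably trained with a different architecture and should not be loaded with that configuration.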

Wasedarocket commented 4 years ago

Thank you for your reply. I will try to train again. Thank you