Ethan0207 closed this issue 2 months ago
You mean the size of it? One square is 1x1 meters, so you could calculate based on that.
OK, thank you for your reply.
Hello, I have another problem. In this project, what is the meaning of step, episode or epoch?
Hi, thank you for your reply. Maybe I haven't understood the meaning of an episode. Part 3 says that it is a collection of subsequent steps until one of the termination conditions is reached. But when I train the agent, an episode seems to be updated only about every 20 minutes, even though the agent clearly reaches a termination condition much sooner. Could you explain this?
Hi,
That would not be an episode. An episode is exactly that: a collection of steps until either a crash, reaching a goal, or reaching the maximum episode step number. What you are observing is the end of an epoch and the start of the evaluation cycle. One epoch is a bunch of episodes between evaluations.
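To make the nesting of steps, episodes, and epochs concrete, here is a minimal sketch. It is not the repo's training loop: `run_episode`, `run_epoch`, the termination probabilities, and the values `steps_per_epoch=5000` and `max_steps=500` are all hypothetical, and ending an epoch once a step budget is used up is an assumption made for illustration.

```python
import random


def run_episode(rng, max_steps):
    """One episode: take steps until a crash, a goal, or the step limit."""
    for step in range(1, max_steps + 1):
        roll = rng.random()  # stand-in for one environment step
        if roll < 0.02:
            return step, "crash"
        if roll < 0.04:
            return step, "goal"
    return max_steps, "max_steps"


def run_epoch(rng, steps_per_epoch, max_steps):
    """One epoch: run whole episodes until the step budget is used up.

    An evaluation cycle would start after this function returns.
    """
    steps_done = 0
    episodes = []
    while steps_done < steps_per_epoch:
        n_steps, reason = run_episode(rng, max_steps)
        steps_done += n_steps
        episodes.append((n_steps, reason))
    return episodes


rng = random.Random(0)
episodes = run_epoch(rng, steps_per_epoch=5000, max_steps=500)
total_steps = sum(n for n, _ in episodes)
print(f"epoch contained {len(episodes)} episodes and {total_steps} steps")
```

With these assumed numbers, one epoch always contains many episodes, which is why the epoch counter advances far more slowly than individual episodes end.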
Hi,
Thank you very much, I see what you mean. Taking this line as an example, does it mean that this is the 62nd epoch and the average reward is -82.113001? Also, what does the "0.900000" mean, and how do we know how many episodes it has gone through?
That is the average reward during the 10 evaluation runs after epoch 62. 0.9 is the collision rate. This would be quite easy to tell from the code: https://github.com/reiniscimurs/DRL-robot-navigation/blob/main/TD3/train_velodyne_td3.py#L32-L37
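As a rough illustration of how those two numbers relate to the evaluation runs, here is a small sketch. It is not the code behind the link: `summarize_evaluation` and the per-run rewards and collision flags below are made up for the example.

```python
def summarize_evaluation(episode_rewards, collided):
    """Return (average reward, collision rate) over the evaluation runs."""
    n = len(episode_rewards)
    if n == 0 or n != len(collided):
        raise ValueError("need one reward and one collision flag per run")
    avg_reward = sum(episode_rewards) / n
    collision_rate = sum(collided) / n  # True counts as 1, False as 0
    return avg_reward, collision_rate


# Hypothetical results of 10 evaluation runs: 9 end in a collision, 1 reaches the goal.
rewards = [-100.0] * 9 + [79.0]
collided = [True] * 9 + [False]

avg_reward, collision_rate = summarize_evaluation(rewards, collided)
print(f"runs: {len(rewards)}, average reward: {avg_reward:f}, collision rate: {collision_rate:f}")
```

A collision rate of 0.9 therefore means that 9 of the 10 evaluation runs ended in a collision.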
I would suggest fully familiarizing yourself with the code. This will help you understand what is in this repo.
Thank you very much. I understand now.
Hello, it's me again. I have a question about the project: what is the area of the training environment?

![1 18](https://github.com/reiniscimurs/DRL-robot-navigation/assets/138771150/655840c0-c8e5-42aa-b8bb-b0aafd678b16)