Arena-Rosnav / arena-rosnav

MIT License

Training question (training for a long time with no effect) #201

Open zzzzzziyi opened 1 month ago

zzzzzziyi commented 1 month ago

Hello author! Your project is great and I'm very interested in it. However, when I try to train the model, the success rate after two days of training on a workstation is still only 0% to 2%, always below 4%. I would like to know if this is normal. All the commands I use are listed below. I hope you can help me. Thank you so much.

My environment: installed according to the installation instructions.

Terminal 1: (arena-rosnav-py3.8) root@avocado:~/arena_ws/src/arena/arena-rosnav# python training/scripts/train_agent.py

Terminal 2: (arena-rosnav-py3.8) root@avocado:~/arena_ws# roslaunch arena_bringup start_training.launch model:=jackal num_envs:=1 map_folder_name:=map_empty

Terminal 3: (arena-rosnav-py3.8) root@avocado:~/arena_ws/src/arena/arena-rosnav# roslaunch arena_bringup visualization_training.launch ns:=sim_1

Training recording: Is the robot's irregular movement normal? Screencast from 06-08-2024 19:23:41.webm

Training screenshot: Screenshot from 2024-08-06 19-45-51

tuananhroman commented 1 month ago

Thank you very much for your interest! Glad to hear you could set up the project and are already testing it out. First of all, it is necessary to mention that many factors influence how well an agent learns. From what I can see, you should definitely reduce the batch_size (a value between 512 and 2048 is sufficient). Furthermore, it is recommended to utilize the parallelization capabilities of the simulation (set num_envs to the number of CPUs).
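As a rough illustration of the two suggestions above, here is a minimal Python sketch that clamps a batch size into the 512 to 2048 range and sets the number of environments to the CPU count. The function name `suggest_training_settings` and the returned dictionary are hypothetical and not part of arena-rosnav; only the names batch_size and n_envs / num_envs come from this thread, and how they are actually consumed by the training scripts is not shown here.

```python
import os

# Range suggested in the reply above; everything else in this sketch is hypothetical.
MIN_BATCH_SIZE = 512
MAX_BATCH_SIZE = 2048


def suggest_training_settings(current_batch_size, cpu_count=None):
    """Return a dict with a clamped batch_size and n_envs set to the CPU count.

    This helper is an illustration only; it does not read or write
    training_config.yaml and is not part of arena-rosnav.
    """
    if cpu_count is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        cpu_count = os.cpu_count() or 1
    # Clamp the batch size into the suggested [512, 2048] range.
    batch_size = max(MIN_BATCH_SIZE, min(current_batch_size, MAX_BATCH_SIZE))
    return {"batch_size": batch_size, "n_envs": cpu_count}


if __name__ == "__main__":
    # Example: an oversized batch size on whatever machine runs this script.
    print(suggest_training_settings(current_batch_size=15000))
```

The resulting values would still have to be entered by hand, for example in training_config.yaml and as the num_envs argument of the launch command.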

zzzzzziyi commented 1 month ago

Hi author, thank you for your prompt response! Once again, during the process of setting up and testing your project, I've been amazed by your work!

Regarding your two suggestions, I have tried both by changing batch_size to 512 and n_envs to 8 in training_config.yaml. However, after 24 hours of training the outcome is still poor: the success rate remains around 0% to 2%. In addition, there is always a small agent rotating while the position of the robot agent remains unchanged. Is this normal? Lastly, is this training time sufficient?

Attached are the screen recording and screenshots of the outcome. Screencast from 13-08-2024 15:20:04.webm Screenshot from 2024-08-13 15-23-07 Screenshot from 2024-08-13 15-23-45

I would appreciate your reply. Thank you in advance!