reiniscimurs / DRL-robot-navigation

Deep Reinforcement Learning for mobile robot navigation in the ROS Gazebo simulator. Using a Twin Delayed Deep Deterministic Policy Gradient (TD3) neural network, a robot learns to navigate to a random goal point in a simulated environment while avoiding obstacles.
MIT License

some questions about sim2real #119

Closed nlnlnl1 closed 5 months ago

nlnlnl1 commented 7 months ago

Hi. When I test the weights trained in the virtual environment on a real car, the car moves forward for a distance and then starts spinning in circles. It can avoid obstacles but cannot reach the goal point. I can confirm that the real car's input is aligned with the input in the virtual environment, but the real test environment is very different from the virtual one. I'm wondering whether the spatial structure of the real environment needs to be similar to that of the virtual environment, or whether you have any other Sim2real experience to share. Thank you!

reiniscimurs commented 7 months ago

Hi,

The approach does not learn the spatial structure, so in real deployment the environment does not need to resemble the training environment. However, real deployment can produce states that are out of distribution for the model, i.e. scenes it never encountered during training. Consider a situation where there are no obstacles at all: this never occurs during training, but it can easily happen in real life, so it is an out-of-distribution state the model was not trained on. You should also check the maximum reading of your real sensor and whether that can cause issues.
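One practical way to handle the sensor-range point above is to cap real lidar readings at the maximum range the simulated sensor reported during training, so the network never sees values outside its training distribution. A minimal sketch, assuming the simulated sensor was capped at 10 m (use whatever limit your training setup actually had):

```python
import numpy as np

def preprocess_scan(ranges, sim_max_range=10.0):
    """Clip real lidar readings to the max range seen in training.

    sim_max_range is an assumption here -- substitute the cap your
    simulated sensor used during training.
    """
    scan = np.asarray(ranges, dtype=np.float32)
    # Real sensors often report inf/NaN for no-return beams;
    # map those to the cap instead of passing them to the network
    scan[~np.isfinite(scan)] = sim_max_range
    return np.clip(scan, 0.0, sim_max_range)
```

For example, `preprocess_scan([float("inf"), 25.0, 3.0])` yields `[10.0, 10.0, 3.0]`, matching what the simulated sensor would have produced in the same scene.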

nlnlnl1 commented 7 months ago

Okay, thank you for your answer. May I ask whether you tested in a real environment on a map similar to the virtual one? Also, I haven't had much time to train the model yet, but during training all metrics have plateaued and the model tests well in the virtual environment. I'm not sure whether extending the training time would help in this situation.

reiniscimurs commented 6 months ago

Hi,

You can see the environments here: https://www.youtube.com/watch?v=MhuhsSdzZFk&ab_channel=ReinisCimurs. They were very different from the training environment at test time. Most likely there is some discrepancy between the real sensors/robot/environment/transforms/etc. and those in the simulation. More training will probably not help. Make sure the data is presented to the model the same way as in training, and that goals and other values fall in the same ranges as in training.
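The "same range as in training" point applies in particular to the goal representation: if the policy was trained on the distance to the goal plus the goal angle relative to the robot's heading, the real robot must feed it the same quantities in the same units. A hedged sketch of that computation (function name and frame conventions are illustrative, not the repo's exact code):

```python
import math

def goal_polar(robot_x, robot_y, robot_yaw, goal_x, goal_y):
    """Distance and heading-relative angle to the goal, angle wrapped to [-pi, pi].

    Units and frames must match training: if distance was in meters and
    the angle in radians relative to the robot's heading there, feed the
    same here, with the same TF conventions.
    """
    dx, dy = goal_x - robot_x, goal_y - robot_y
    distance = math.hypot(dx, dy)
    # Bearing to the goal in the world frame, then relative to the robot heading
    theta = math.atan2(dy, dx) - robot_yaw
    theta = math.atan2(math.sin(theta), math.cos(theta))  # wrap to [-pi, pi]
    return distance, theta
```

If, say, the odometry yaw on the real robot is offset or the goal is given in a different frame than during training, the angle input is systematically wrong, and a robot that avoids obstacles but spins instead of reaching the goal is a plausible symptom.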