Closed by sudo-homebrew 5 months ago
I have never trained a full model using the Waffle Pi, but in theory it should not make a big difference. What type of model are you training (DQN, DDPG, or TD3)? I would suggest starting with DQN, as it is the simplest algorithm. Also use stage 1 and a simple reward function. Tweaking the most important hyperparameters, such as the learning rate and batch size, could also make a difference. Unfortunately, this sort of issue is really hard to debug. Sometimes it simply takes a long time before the robot learns to stop rotating.
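To make "a simple reward function" concrete, here is a minimal sketch of what one could look like. It is not the reward function used in this repository: the function name, the arguments, and all of the constants are hypothetical and only illustrate the idea of rewarding progress toward the goal while mildly penalizing turning.

```python
def simple_reward(prev_goal_dist, goal_dist, angular_vel, reached_goal, collided,
                  progress_scale=10.0, spin_penalty=0.1,
                  goal_bonus=100.0, collision_penalty=-100.0):
    """Hypothetical reward: all names and constants here are assumptions.

    prev_goal_dist / goal_dist: distance to the goal before and after the step (m)
    angular_vel: commanded angular velocity for this step (rad/s)
    """
    # Terminal events dominate everything else.
    if reached_goal:
        return goal_bonus
    if collided:
        return collision_penalty

    # Positive when the robot moved closer to the goal, negative otherwise.
    progress = prev_goal_dist - goal_dist

    # Turning costs a little, so spinning in place is never "free".
    return progress_scale * progress - spin_penalty * abs(angular_vel)
```

With a shape like this, a robot that rotates in place makes no progress and receives a small negative reward on every step, which at least gives the agent a signal to move away from that behaviour.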
It doesn't seem like it is going to stop rotating, even after training for more than 20k episodes on stage 1. It rotates in place from the very beginning through more than 20k episodes, even though I only changed the model to Waffle Pi and updated model-specific values such as the maximum linear and angular speeds and the goal and collision thresholds. I tried the DDPG and TD3 algorithms with the same reward function you provided (for the Burger model), but neither worked. I wonder what strategy you used to arrive at the current reward function and hyperparameters for the Burger model. If you have useful experience or strategies, please share them with us.
The examples I have included here show all of the hyperparameters and other configurations (stage, reward function) I have used for training successful models. You can find most of the information in the _hyperparams_20220808-030634.txt file and in the names of the files. The DDPG example was trained on stage 9, for example.
I wonder if you have ever tried training with the Waffle Pi model instead of the Burger. I am trying to train with the Waffle Pi, but even after changing all of the specifications in the setting.py and util.py files, training runs without appearing to learn anything (i.e. the robot rotates in place or keeps crashing into the walls and dynamic obstacles). I have also modified the reward function based on several other reward strategies, but nothing changes. If you have any experience with the Waffle Pi model, or any idea how to solve this problem, I would appreciate it if you could share it with me.
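One thing worth checking when switching models is that the policy output is scaled to the new robot's velocity limits. The sketch below is only an illustration and is not taken from setting.py or util.py: `RobotLimits` and `scale_action` are hypothetical names, and it assumes the policy emits a (linear, angular) pair in [-1, 1] with the linear part mapped to forward-only motion. The limits are the published TurtleBot3 maximums (Burger: 0.22 m/s and 2.84 rad/s; Waffle Pi: 0.26 m/s and 1.82 rad/s).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotLimits:
    max_linear: float   # m/s
    max_angular: float  # rad/s


# Published TurtleBot3 velocity limits.
BURGER = RobotLimits(max_linear=0.22, max_angular=2.84)
WAFFLE_PI = RobotLimits(max_linear=0.26, max_angular=1.82)


def scale_action(action, limits):
    """Map a normalized (linear, angular) action to robot velocities.

    Assumption: both components are in [-1, 1]; the linear component is
    remapped to [0, max_linear] so the robot only drives forward.
    """
    lin, ang = action
    # Clamp first so out-of-range policy outputs cannot exceed the limits.
    lin = max(-1.0, min(1.0, lin))
    ang = max(-1.0, min(1.0, ang))
    linear = (lin + 1.0) / 2.0 * limits.max_linear
    angular = ang * limits.max_angular
    return linear, angular
```

For example, `scale_action((1.0, -1.0), WAFFLE_PI)` returns `(0.26, -1.82)`. If the environment still applies the Burger's angular limit of 2.84 rad/s to a Waffle Pi, the commanded turn rate exceeds what the robot supports, which could be one contributor to poor behaviour, although it is unlikely to be the only one.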