Closed ll7 closed 1 month ago
Currently, the robot produces many small, frequently changing actions. This is unrealistic and wastes energy. A new reward term should be designed to penalize frequent changes in the selected action.
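One possible shape for such a reward term, as a minimal sketch only: penalize the squared difference between the current and the previous action. The function name `action_change_penalty`, the `weight` value, and the two-element action layout (linear and angular velocity) are assumptions for illustration, not taken from the existing code base.

```python
def action_change_penalty(action, prev_action, weight=0.1):
    """Return a non-positive reward term that grows in magnitude
    with the squared change between two consecutive actions.

    `weight` is a hypothetical tuning parameter; its value here is a guess.
    """
    if len(action) != len(prev_action):
        raise ValueError("action and prev_action must have the same length")
    # Sum of squared per-dimension differences between consecutive actions.
    squared_change = sum((a - p) ** 2 for a, p in zip(action, prev_action))
    return -weight * squared_change


# Assumed action layout: [linear velocity, angular velocity].
prev_action = [0.5, 0.0]
steady = action_change_penalty([0.5, 0.0], prev_action)  # action unchanged
jerky = action_change_penalty([0.5, 1.0], prev_action)   # angular velocity jumps
```

This term would be added to the existing step reward. If `weight` is too large, it may outweigh the collision and goal terms, so it probably needs tuning against them.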
A new model has been trained, but its performance is not optimal: I don't know why the collision rates haven't improved.