SudeepDasari / data4robotics

MIT License

chunk h=100 #8

Closed LanrenzzzZ closed 2 days ago

LanrenzzzZ commented 1 week ago

In the paper, training the DiT-Block Policy to predict sequences of H = 100 actions serves as regularization during training and allows temporal ensembling to be used for stability at runtime. However, following the parameters described in the paper, my experiments in the aloha environment of lerobot seem to yield suboptimal results, and if temporal ensembling is not employed there is severe jitter in action execution. Have you experienced similar issues during your training? Thank you!

SudeepDasari commented 2 days ago

It's a bit hard to comment, since so many of these things vary from task to task. But I've definitely noticed jitter before that goes away with temporal ensembling. If you do employ ensembling, I would strongly recommend tuning its alpha parameter.
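
For readers landing here, a minimal sketch of the ACT-style temporal ensembling under discussion, assuming the policy returns an (H, action_dim) chunk at every control step. The `TemporalEnsembler` class and the `policy.predict` call in the usage comment are illustrative names, not this repo's API; the exp(-alpha * i) weighting, with i = 0 indexing the oldest live prediction, follows the convention from the ACT paper.

```python
import numpy as np
from collections import deque


class TemporalEnsembler:
    """Illustrative ACT-style temporal ensembling over overlapping action chunks.

    Each control step the policy predicts H future actions. The action actually
    executed is a weighted average of every live chunk's prediction for the
    current timestep, with weights w_i = exp(-alpha * i) where i = 0 is the
    oldest live prediction. A smaller alpha spreads weight more evenly across
    predictions, so new observations are incorporated faster.
    """

    def __init__(self, horizon: int, alpha: float = 0.01):
        self.alpha = alpha
        # Only the last `horizon` chunks can still cover the current step.
        self.live = deque(maxlen=horizon)

    def step(self, new_chunk: np.ndarray) -> np.ndarray:
        """new_chunk: (horizon, action_dim) prediction made this control step."""
        self.live.append(new_chunk)
        n = len(self.live)
        # The chunk at deque index idx was predicted (n - 1 - idx) steps ago,
        # so its guess for the *current* timestep sits at index (n - 1 - idx).
        preds = np.stack([chunk[n - 1 - idx] for idx, chunk in enumerate(self.live)])
        # Exponential weights: deque index 0 is the oldest live prediction.
        weights = np.exp(-self.alpha * np.arange(n))
        weights /= weights.sum()
        return weights @ preds  # (action_dim,) weighted-average action


# Hypothetical usage, one call per environment step:
#   ensembler = TemporalEnsembler(horizon=100, alpha=0.01)
#   action = ensembler.step(policy.predict(obs))
```

Note that alpha is exactly the knob worth sweeping here: too large and the executed action is dominated by stale predictions (sluggish but smooth), too small and it tracks the newest chunk closely (responsive but jittery). A small value like 0.01 is a common starting point.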

Also, if your val metrics are super high, it may mean you need to tune the training hyperparameters (e.g., ac_chunk, lr schedule, pre-trained parameters) a bit. Another idea is to collect more training data, or to improve your demo dataset's quality (e.g., more diversity in the demos).