octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
https://octo-models.github.io/
MIT License

Add Robomimic Policy Training Support #57

Closed ashwin-balakrishna96 closed 6 months ago

ashwin-balakrishna96 commented 7 months ago

There are two main changes here:

1) Support for specifying which trajectory keys to normalize; defaults to the original action and proprio keys.
2) When creating an interleaved dataset that mixes multiple individual datasets, the dataset statistics used for normalization are combined across all datasets.
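Pooling per-dataset normalization statistics as in (2) amounts to a weighted combination of per-dataset means and variances. A minimal sketch, assuming a hypothetical stats schema with `num_transitions`, `mean`, and `std` keys (not necessarily Octo's actual format):

```python
import numpy as np

def combine_dataset_statistics(stats_list):
    """Pool per-dataset normalization stats into one set of combined stats.

    Each entry is assumed to look like
    {"num_transitions": int, "mean": array, "std": array} (hypothetical schema).
    """
    counts = np.array([s["num_transitions"] for s in stats_list], dtype=np.float64)
    means = np.stack([np.asarray(s["mean"], dtype=np.float64) for s in stats_list])
    stds = np.stack([np.asarray(s["std"], dtype=np.float64) for s in stats_list])

    weights = counts / counts.sum()  # per-dataset weight by transition count
    pooled_mean = (weights[:, None] * means).sum(axis=0)
    # E[x^2] per dataset = var + mean^2; pool those, then subtract pooled mean^2
    pooled_second_moment = (weights[:, None] * (stds**2 + means**2)).sum(axis=0)
    pooled_std = np.sqrt(pooled_second_moment - pooled_mean**2)

    return {
        "num_transitions": int(counts.sum()),
        "mean": pooled_mean,
        "std": pooled_std,
    }
```

For example, pooling one dataset with mean 1, std 1 and another of equal size with mean 5, std 1 yields a combined mean of 3 and std of sqrt(5), matching the statistics of the concatenated data.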

kpertsch commented 6 months ago

By default I don't think we want to normalize interleaved datasets by their combined statistics -- I think this is a special case for DROID, since that mixture is quite homogeneous. Instead, I would suggest adding an argument to make_interleaved_dataset that allows overwriting the dataset_statistics and just gets passed through to make_rlds_dataset -- that seems more general.
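The suggested pass-through could look roughly like this; the function bodies below are hypothetical stubs for illustration, not the actual octo implementation:

```python
def make_rlds_dataset(name, dataset_statistics=None):
    """Stub standing in for the real per-dataset constructor.

    The real version would build an RLDS dataset and, if dataset_statistics
    is None, compute that dataset's own normalization statistics.
    """
    return {"name": name, "statistics": dataset_statistics}

def make_interleaved_dataset(names, dataset_statistics=None):
    """The proposed change: accept an optional dataset_statistics override
    and forward it unchanged to each per-dataset constructor, rather than
    computing combined statistics inside this function."""
    return [
        make_rlds_dataset(name, dataset_statistics=dataset_statistics)
        for name in names
    ]
```

With this shape, the default (`dataset_statistics=None`) leaves each dataset normalized by its own statistics, while a caller like DROID can pass precomputed combined statistics explicitly.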

ashwin-balakrishna96 commented 6 months ago

Yep makes sense, I can fix this later today.

ashwin-balakrishna96 commented 6 months ago

@kpertsch this is fixed now, let me know if there is anything else needed to merge. The default behavior should leave all normalization unchanged from what it was originally.

kpertsch commented 6 months ago

Moved most of the changes here to https://github.com/octo-models/octo/pull/62 (couldn't figure out how to modify this PR) -- closing this PR.