octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
https://octo-models.github.io/
MIT License

About memory growth #121

Closed. Toradus closed this issue 3 months ago.

Toradus commented 3 months ago

Crossposting this issue here, since it probably makes more sense in this repository:

There are several issues where people ask why memory keeps growing when using the dataloader: https://github.com/openvla/openvla/issues/4 and https://github.com/octo-models/octo/issues/16

I would like to reopen the question with one assumption that maybe someone can verify: the restructuring of trajectories is done with TF's symbolic tensors. Since we access the samples of a trajectory randomly (random sharding of the TFDS dataset before accessing it), the data of a trajectory is not loaded sequentially. When loading data with a history window or future_window_size, the previous/next samples are also loaded, because the trajectory transform executes when the sample is accessed.
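To make the windowing concrete, here is a minimal pure-Python sketch of which trajectory indices a single sample touches. The function name `window_indices` is hypothetical, and clamping at the trajectory edges is an assumption for illustration; the actual transform in the repository may pad or mask differently.

```python
def window_indices(traj_len, window_size, future_window_size=0):
    """For each timestep t, list the trajectory indices a windowed sample reads.

    Hypothetical helper for illustration only, not the repository's code.
    """
    windows = []
    for t in range(traj_len):
        start = t - (window_size - 1)   # oldest history step
        stop = t + future_window_size   # newest future step
        # Assumption: out-of-range steps are clamped to the first/last frame.
        windows.append([min(max(i, 0), traj_len - 1) for i in range(start, stop + 1)])
    return windows


windows = window_indices(traj_len=5, window_size=2, future_window_size=1)
for t, idx in enumerate(windows):
    print(t, idx)
```

Under these assumptions every interior sample reads three frames, so neighbouring samples overlap in the frames they need, which is what raises the question of whether those frames are kept around.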

Does TF cache these previous/next samples and reuse them once the corresponding sample is loaded? Could that be why the memory is growing?

I've noticed that the memory keeps growing up to a specific point and then fluctuates slightly, which is annoying when loading many different cameras. If caching is the cause, is there a way to disable it? And would the speed drop, even if we use prefetching?
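For comparison, one mechanism that would produce a "grow, then plateau" pattern is a bounded shuffle buffer, such as the one behind `tf.data.Dataset.shuffle(buffer_size)`. This is only an assumption about the cause, not something confirmed in this thread. The sketch below is a pure-Python simulation with a hypothetical helper `simulate_occupancy`, and it counts resident items rather than measuring real memory.

```python
import random


def simulate_occupancy(num_items, buffer_size, seed=0):
    """Track how many items a bounded shuffle buffer holds after each read.

    Hypothetical simulation for illustration; it does not call TensorFlow.
    """
    rng = random.Random(seed)
    buffer, occupancy = [], []
    for item in range(num_items):
        buffer.append(item)
        if len(buffer) > buffer_size:
            # Buffer is full: emit one random element so the size stays bounded.
            buffer.pop(rng.randrange(len(buffer)))
        occupancy.append(len(buffer))
    return occupancy


occupancy = simulate_occupancy(num_items=10, buffer_size=4)
print(occupancy)  # [1, 2, 3, 4, 4, 4, 4, 4, 4, 4]
```

If something like this is what happens in the dataloader, the plateau would scale with the buffer size times the size of one sample, so more cameras per sample would mean a higher plateau.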

Toradus commented 3 months ago

The question was answered in the other repository, so I am closing this issue.