DaniAffCH closed this issue 7 months ago
@DaniAffCH How big is your shared memory? It might be too small, see this issue: https://github.com/lukashermann/hulc/issues/8#issuecomment-1410095454
Does it work when you don't use the shared memory dataloader (by setting datamodule/datasets=vision_lang)?
Did you try running the dataset task_D_D, which is ~4 times smaller than task_ABCD_D?
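To answer the shared memory question, here is a minimal sketch for checking how much space is available. It assumes a Linux machine with shared memory mounted at /dev/shm; the helper names `shm_usage_gib` and `fits_in_shm` are made up for illustration and are not part of the repository.

```python
import os
import shutil

GIB = 1024 ** 3


def shm_usage_gib(path="/dev/shm"):
    """Return total, used and free space of the given mount in GiB."""
    usage = shutil.disk_usage(path)
    return {
        "total": usage.total / GIB,
        "used": usage.used / GIB,
        "free": usage.free / GIB,
    }


def fits_in_shm(required_bytes, path="/dev/shm"):
    """Check whether `required_bytes` would fit into the free space at `path`."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # /dev/shm is an assumption; it only exists on Linux-like systems.
    if os.path.isdir("/dev/shm"):
        stats = shm_usage_gib()
        print(f"/dev/shm: {stats['free']:.1f} GiB free of {stats['total']:.1f} GiB")
    else:
        print("/dev/shm not found on this system")
```

If the free space is smaller than what the dataset needs once loaded, that would be consistent with a crash right after the shared memory loading step. In a Docker container the shared memory size is typically set with the `--shm-size` option of `docker run`.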
You are right, everything works without using the shm. Thanks for your support!
You're welcome. Just note that not using the shm dataloader will slow down training by a factor of 1.3 or 1.4.
If I run the training on calvin_debug_dataset, everything works fine, but if I use the real dataset task_ABCD_D the training crashes after completing the initial shared memory loading. This is the stack trace: