Open Xinheng-He opened 1 week ago

Hi developers:

Hydra-lightning is a really cool tool and I like it! However, my batches contain graphs of highly varying sizes, which sometimes causes OOM (out-of-memory) errors on the GPU. Previously I would manually skip such batches, but in hydra-lightning this seems hard to do. Could skipping be added in a future version, or how can I skip a batch when it triggers an OOM error?

Xinheng

Comment (Xinheng-He): I made it work by adding a check in `training_step` like this. However, when I run the code on multiple GPUs, training hangs as soon as `training_step` returns `None` (no matter whether the batch is clean or not), so I think this trick only works for single-GPU training. I hope it helps others.
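For reference, a minimal sketch of the pattern discussed above: catch the CUDA OOM error inside `training_step` and return `None`, which PyTorch Lightning interprets as "skip this batch". The helper and the fake forward pass below are hypothetical names for illustration (no GPU or Lightning install needed to run them); in real code you would call `torch.cuda.empty_cache()` before returning.

```python
def training_step_with_oom_guard(forward_fn, batch):
    """Run forward_fn(batch); if it raises a CUDA OOM error, skip the batch.

    Note: returning None only works for single-GPU training. Under DDP the
    other ranks block waiting for a gradient sync that never happens, which
    matches the hang reported in the comment above.
    """
    try:
        return forward_fn(batch)
    except RuntimeError as err:
        # CUDA OOM surfaces as a RuntimeError whose message contains
        # "out of memory" (newer torch also exposes torch.cuda.OutOfMemoryError,
        # a RuntimeError subclass, so this check covers both).
        if "out of memory" in str(err):
            # In real code: torch.cuda.empty_cache() here to release cached blocks.
            return None  # Lightning skips the optimizer step for this batch.
        raise  # Any other RuntimeError is a genuine bug; re-raise it.


# --- Illustration with a fake forward pass (placeholder, no GPU needed) ---
def fake_forward(batch):
    if batch["size"] > 1000:  # pretend oversized graphs exhaust GPU memory
        raise RuntimeError("CUDA out of memory. Tried to allocate ...")
    return batch["size"] * 0.1  # stand-in for a loss value
```

Usage: `training_step_with_oom_guard(fake_forward, {"size": 2000})` returns `None` (batch skipped), while a small batch returns the loss as usual.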