knagrecha / hydra

Execution framework for multi-task model parallelism. Enables the training of arbitrarily large models with a single GPU, with linear speedups for multi-GPU multi-task execution.
Apache License 2.0
20 stars 3 forks

Merge saved_intermediates branch into main branch, with a flag to allow users to switch between execution modes. #3

Closed knagrecha closed 1 year ago

knagrecha commented 2 years ago

Just realized that saved_intermediates always restores to the original GPU. This explains our OOMs. Need to investigate further to find a solution.