Open · srogatch opened this issue 1 year ago
Hello, sorry for the delay. We do currently have Docker containers that you can use with Wandb to perform a distributed hyper-parameter sweep. IMO, multi-GPU training for a single model isn't much benefit: it is very hard to saturate even a single GPU unless you have huge batch sizes. The bottleneck generally comes from other things.
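For reference, a Wandb sweep like the one mentioned above is driven by a sweep configuration file; a minimal sketch follows, with hyperparameter names that are purely illustrative, not FF's actual options:

```yaml
# Illustrative W&B sweep config -- parameter names are placeholders.
method: bayes
metric:
  name: val_loss
  goal: minimize
parameters:
  learning_rate:
    min: 0.0001
    max: 0.01
  batch_size:
    values: [32, 64, 128]
```

Each container then runs `wandb agent <sweep-id>` to pull trials from the sweep, which is how the setup distributes work across machines without any multi-GPU support inside a single model.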
I have a batch size of 64, a history length of 1440, a lookahead of 480, and 2 million points in the time series, each consisting of 4 values. A single GPU is currently at 97-100% utilization, and judging from power consumption it is indeed fully saturated, so I could benefit from multiple GPUs.
Interesting, I've never really run into that problem before. Let me look into it. FF is built on top of PyTorch, of course, so hopefully it is something I could reasonably add quickly. Out of the box, as of now, we don't support it, since we mainly use model.to()
Yes, we need to add a DistributedDataParallel object, a multi-processing launch, getting the local rank of each process, and using that rank as the device parameter in model.to(). I planned to add this myself, but unfortunately I had to postpone this project because some higher-priority work came up.
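The steps above (a DistributedDataParallel wrapper, a multi-process launch, and the per-process local rank used as the model.to() device) can be sketched roughly as follows. This is a generic PyTorch sketch, not FF's actual API: the model and dataset are throwaway placeholders.

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP


def worker(local_rank: int, world_size: int) -> None:
    """One process per GPU; local_rank doubles as the CUDA device index."""
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    use_cuda = torch.cuda.is_available()
    backend = "nccl" if use_cuda else "gloo"  # gloo lets the sketch run on CPU
    dist.init_process_group(backend, rank=local_rank, world_size=world_size)

    device = torch.device(local_rank) if use_cuda else torch.device("cpu")
    model = torch.nn.Linear(4, 1)  # placeholder for the real model
    model.to(device)               # the local rank is the device parameter
    ddp_model = DDP(model, device_ids=[local_rank] if use_cuda else None)

    # DistributedSampler shards the dataset so each rank sees a distinct slice.
    dataset = torch.utils.data.TensorDataset(
        torch.randn(256, 4), torch.randn(256, 1)
    )
    sampler = torch.utils.data.distributed.DistributedSampler(dataset)
    loader = torch.utils.data.DataLoader(dataset, batch_size=64, sampler=sampler)

    opt = torch.optim.Adam(ddp_model.parameters())
    loss_fn = torch.nn.MSELoss()
    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards across epochs
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(ddp_model(x), y).backward()  # DDP all-reduces grads here
            opt.step()

    dist.destroy_process_group()


def launch() -> None:
    """Spawn one worker per visible GPU (or a single CPU process)."""
    nprocs = max(torch.cuda.device_count(), 1)
    mp.spawn(worker, args=(nprocs,), nprocs=nprocs)
```

`torchrun` with `env://` initialization is the more common launch path nowadays; `mp.spawn` is used here only to keep the sketch self-contained in one file.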
So far I haven't found a way to train on multiple GPUs within the same computer. If one exists, please describe how to do it.