Closed: sun-te closed this issue 4 years ago
In theory this is possible, but I have not implemented it so far. All the code that could help with multi-GPU use has been applied to train a single model in a model-parallel fashion. This is not strictly necessary for the model described in the paper, but it can be useful on GPUs that do not have much memory.
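For readers who want to experiment themselves: a minimal sketch of the model-parallel pattern described above, assuming PyTorch (the repo's framework is not confirmed here). The model's layers, the device-placement logic, and all names below are illustrative, not the repo's actual code. Each half of the network is pinned to its own device, and activations are moved between devices inside `forward`; the code falls back to CPU when fewer than two GPUs are present, so it runs anywhere.

```python
import torch
import torch.nn as nn

# Hypothetical devices: use two GPUs when available, otherwise fall back to CPU.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 1 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")


class SplitModel(nn.Module):
    """Toy model split across two devices (model parallelism)."""

    def __init__(self):
        super().__init__()
        # Each stage lives on its own device, halving per-device memory use.
        self.stage1 = nn.Linear(16, 32).to(dev0)
        self.stage2 = nn.Linear(32, 8).to(dev1)

    def forward(self, x):
        x = torch.relu(self.stage1(x.to(dev0)))
        # Move the intermediate activations to the second device.
        return self.stage2(x.to(dev1))


model = SplitModel()
out = model(torch.randn(4, 16))
print(tuple(out.shape))
```

Note this only spreads one model's memory footprint across devices; it does not speed up training the way data parallelism (e.g. `torch.nn.parallel.DistributedDataParallel`, which replicates the model and splits the batch) would.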
Is it possible to train on multiple GPUs? Thanks!