This PR has 3 goals, all affecting only users training on multiple GPUs:
Be able to resume training from a saved network. Right now, when a saved DataParallelTable (DPT) is loaded back, all of its modules end up on the default GPU and the mapping in gpuAssignments is wrong (see this comment by @Atcold)
Reduce the size of the saved network by stripping the per-GPU clones of the network before serialization
Be able to stop training on a machine with n GPUs and resume it on a machine with m GPUs
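To make the third goal concrete, here is a minimal sketch (in Python, for illustration only; the actual change lives in the Lua DPT code) of remapping a checkpoint's per-module GPU assignments from a machine with n GPUs onto one with m GPUs. The function name `remap_gpu_assignments` and the list-of-ids representation are hypothetical, not the real DataParallelTable API:

```python
# Hypothetical sketch: remap gpuAssignments saved on an n-GPU machine
# onto the m GPUs available on the current machine, round-robin.

def remap_gpu_assignments(gpu_assignments, num_gpus_now):
    """Map the saved per-module GPU ids onto the currently available
    GPUs, preserving which modules shared a GPU before."""
    old_ids = sorted(set(gpu_assignments))
    # Torch GPU ids are 1-based, hence the +1.
    new_id = {old: old_ids.index(old) % num_gpus_now + 1
              for old in old_ids}
    return [new_id[g] for g in gpu_assignments]

# Checkpoint saved on a 4-GPU machine, restored on a 2-GPU machine:
print(remap_gpu_assignments([1, 2, 3, 4], 2))  # → [1, 2, 1, 2]
```

Any policy that spreads the old assignments over the new GPU set would work; round-robin is just the simplest one to state.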