This PR has 3 goals, all affecting only users training on multiple GPUs:
Be able to resume training from a saved network. Right now, when a saved DataParallelTable (DPT) is loaded back, all of its modules end up on the default GPU and the mapping in gpuAssignments is wrong (see this comment by @Atcold)
Reduce the size of the saved network by stripping the per-GPU clones of the network before serialization
Be able to stop training on a machine with n GPUs and resume it on a machine with m GPUs
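To make the third goal concrete, here is a minimal sketch (in Python, for illustration only; the actual change lives in the Lua DPT code) of remapping a checkpoint's per-module GPU assignments from a machine with n GPUs onto one with m GPUs. The function name `remap_gpu_assignments` and the list-of-ids representation are hypothetical, not the real DataParallelTable API:

```python
# Hypothetical sketch: remap gpuAssignments saved on an n-GPU machine
# onto the m GPUs available on the current machine, round-robin.

def remap_gpu_assignments(gpu_assignments, num_gpus_now):
    """Map the saved per-module GPU ids onto the currently available
    GPUs, preserving which modules shared a GPU before."""
    old_ids = sorted(set(gpu_assignments))
    # Torch GPU ids are 1-based, hence the +1.
    new_id = {old: old_ids.index(old) % num_gpus_now + 1
              for old in old_ids}
    return [new_id[g] for g in gpu_assignments]

# Checkpoint saved on a 4-GPU machine, restored on a 2-GPU machine:
print(remap_gpu_assignments([1, 2, 3, 4], 2))  # → [1, 2, 1, 2]
```

Any policy that spreads the old assignments over the new GPU set would work; round-robin is just the simplest one to state.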