Open jsanket123 opened 3 years ago
I have been trying to use the Bayesian linear regression example provided by the BLiTZ authors and to parallelize the model by wrapping it in torch.nn.DataParallel. However, the code appears to use only one GPU rather than multiple GPUs. Below is the code from the bayesian_regression_boston.py example with the model wrapped in DataParallel.

Below I provide a portion of the output. I print the input dimensions before calling loss.module.sample_elbo, and they are batch_size x no_variables, as expected. However, the model's forward method should print the dimensions of the smaller batch slice taken by each GPU, so it should have printed eight lines of 'In Model: input size torch.Size([2, 13])'. Apparently, the data is being placed on only one GPU.

Could you please let me know what needs to be done for this to work on multiple GPUs? FYI: I used the methods suggested here to check whether multiple GPUs are actually being used to process the input batch.

That is still not on our roadmap, and I think the probabilistic layers do not implement all the interfaces needed to comply with Torch's DataParallel. If you want to work on that, it would be much appreciated.

Hey, thanks for getting back to me. I was able to implement my model in a parallelized way. It requires much more restructuring than the usual single-GPU setup. This is my current work, so I will not be releasing it until I have gone through the review process for my paper.
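For anyone reproducing the check described above, here is a minimal sketch of the diagnostic. It uses a hypothetical toy model (a plain nn.Linear stand-in, not the BLiTZ Bayesian regressor) that prints the size of the batch slice it receives in forward; when DataParallel is active and more than one GPU is visible, dim 0 of the input is split across replicas, so several smaller sizes are printed. Note that custom methods such as sample_elbo are not forwarded by the DataParallel wrapper, which is why the original code reaches them through .module.

```python
import torch
import torch.nn as nn

class ToyRegressor(nn.Module):
    """Hypothetical stand-in for the Bayesian regressor (13 Boston features)."""
    def __init__(self, in_features=13, out_features=1):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        # Each replica prints the slice of the batch it received; with N GPUs
        # and batch 16 you would expect sizes like [16 / N, 13] here.
        print("In Model: input size", x.size())
        return self.fc(x)

model = ToyRegressor()
if torch.cuda.device_count() > 1:
    # DataParallel scatters dim 0 of the input across all visible GPUs.
    model = nn.DataParallel(model)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

batch = torch.randn(16, 13, device=device)
out = model(batch)
print("Outside: input size", batch.size(), "output size", out.size())
```

On a single device (or CPU) the forward print shows the full [16, 13] batch once; only with multiple visible GPUs does it appear once per replica with the smaller per-GPU sizes.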