isl-org / StableViewSynthesis

MIT License

Multi GPU training #9

Closed FrankGoTo closed 3 years ago

FrankGoTo commented 3 years ago

Thanks for your great work. I'm trying to retrain the network, and training works on a single GPU with batch size 1. To speed up training, I tried adding "torch.nn.DataParallel" and increasing the batch size.
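
The change was roughly along these lines. This is only a minimal sketch: `TinyNet` and `wrap_for_multi_gpu` are hypothetical stand-ins, not the actual StableViewSynthesis network or training code. It falls back to a single device when fewer than two GPUs are visible.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model; NOT the StableViewSynthesis network."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


def wrap_for_multi_gpu(model):
    """Wrap the model in DataParallel only when more than one GPU is visible."""
    if torch.cuda.device_count() > 1:
        # DataParallel replicates the module on each GPU and splits every
        # tensor argument of forward() along dimension 0 (the batch dimension).
        model = nn.DataParallel(model)
    return model


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = wrap_for_multi_gpu(TinyNet()).to(device)

# Batch size increased from 1 to 4 (an arbitrary value for illustration).
batch = torch.randn(4, 3, 16, 16, device=device)
out = net(batch)
```

Since DataParallel splits every tensor input along dimension 0, any input that is not laid out per sample along that dimension (for example a flattened index list) would also be split. Whether that is what triggers the assertion here is only a guess.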

However, it raises the following error: "RuntimeError: tgtidx.size(0) == 1 * nelems INTERNAL ASSERT FAILED at "generated/list_to_map_cuda.cpp":10, please report a bug to PyTorch. nelems of tgtidx does not match"

Has anyone run into the same problem? Looking forward to your reply. Thank you!