shunsukesaito / SCANimate

This repository contains the code for the paper "SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks"
https://scanimate.is.tue.mpg.de/

GPU #10

Closed HDYYZDN closed 2 years ago

HDYYZDN commented 2 years ago

I would like to ask how to set up multi-GPU parallel processing. In the code I set:

    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ['CUDA_VISIBLE_DEVICES'] = "4,6";

At the beginning:

    cuda = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In the middle:

    if torch.cuda.device_count() > 1:
        print("Let's use", torch.cuda.device_count(), "GPUs!")
        model = nn.DataParallel(model);
        model = model.module
        model.to(cuda);
    else:
        model.to(cuda);

But it has no effect!

JinlongYANG commented 2 years ago

Unfortunately, this released version of the code was not written with multi-GPU training in mind. Making it work with DataParallel (DP) would probably require many small changes, and DistributedDataParallel (DDP) would need even more. We currently don't have the bandwidth to look into it. By the way, the code provided above won't parallelize anything as written: model = model.module immediately unwraps the nn.DataParallel wrapper, so the model ends up on a single device, cuda:0. Note that with CUDA_VISIBLE_DEVICES set to "4,6", PyTorch renumbers the visible devices, so cuda:0 and cuda:1 refer to physical GPUs 4 and 6. Please refer to the PyTorch data-parallel tutorial for detailed guidance.
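
For reference, a minimal sketch of the generic DataParallel pattern follows. It is not a patch for this repository, which would still need the other changes mentioned above. The helper name to_available_devices and the nn.Linear stand-in network are hypothetical and used only for illustration. The environment variables are set before torch is imported, because CUDA_VISIBLE_DEVICES has to be in place before CUDA is initialized.

```python
import os

# Set these before CUDA is initialized; doing it before importing torch is safest.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "4,6"  # physical GPUs 4 and 6, as in the question

import torch
import torch.nn as nn


def to_available_devices(model: nn.Module):
    """Move the model to the visible device(s); wrap it in DataParallel if there are several."""
    # Visible GPUs are renumbered from 0, so cuda:0 is the first visible GPU.
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    if torch.cuda.device_count() > 1:
        print("Using", torch.cuda.device_count(), "GPUs")
        # Keep the wrapper. Taking `.module` here would return the original
        # model and undo the parallelism.
        model = nn.DataParallel(model)
    return model.to(device), device


# Hypothetical stand-in network, not the SCANimate model.
net, device = to_available_devices(nn.Linear(3, 1))
batch = torch.randn(8, 3, device=device)
out = net(batch)  # with DataParallel, the batch is split across GPUs along dim 0
print(out.shape)
```

When the model is wrapped, the underlying network is still reachable as net.module, which is typically what one uses when saving a checkpoint (for example net.module.state_dict()).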

If you want to speed up training, there is another thing you can do apart from multi-GPU processing, and it may give you about a 2x speedup. If you run the code on high-resolution scans, you'll find that the mesh sampling code (line 139 and line 213 in lib/ext_trimesh.py) takes considerable time. As an alternative, you can sample the mesh surface offline and save the samples to disk. Then, during training, simply load the pre-sampled points instead of sampling on the fly.
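
A rough sketch of the offline pre-sampling idea, using only NumPy, is shown below. It does not reproduce the sampling code in lib/ext_trimesh.py; the function names (sample_surface, presample_to_file, load_presampled), the .npz layout, and the toy two-triangle mesh are all assumptions made for illustration. The sampler picks triangles in proportion to their area and then draws uniform barycentric coordinates inside each chosen triangle.

```python
import os
import tempfile

import numpy as np


def sample_surface(vertices, faces, count, rng):
    """Draw `count` points uniformly from the surface of a triangle mesh."""
    tri = vertices[faces]                      # (F, 3, 3) triangle corners
    e1 = tri[:, 1] - tri[:, 0]
    e2 = tri[:, 2] - tri[:, 0]
    areas = 0.5 * np.linalg.norm(np.cross(e1, e2), axis=1)
    # Larger triangles are picked proportionally more often.
    face_idx = rng.choice(len(faces), size=count, p=areas / areas.sum())
    # Uniform point inside each triangle: reflect (u, v) back when u + v > 1.
    u, v = rng.random(count), rng.random(count)
    flip = u + v > 1.0
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
    points = tri[face_idx, 0] + u[:, None] * e1[face_idx] + v[:, None] * e2[face_idx]
    return points, face_idx


def presample_to_file(vertices, faces, count, path, seed=0):
    """Run once, offline: sample the surface and store the result on disk."""
    rng = np.random.default_rng(seed)
    points, face_idx = sample_surface(vertices, faces, count, rng)
    np.savez_compressed(path, points=points.astype(np.float32), face_idx=face_idx)


def load_presampled(path):
    """Run during training: load the stored samples instead of resampling."""
    data = np.load(path)
    return data["points"], data["face_idx"]


# Toy mesh: a unit square in the z = 0 plane, split into two triangles.
vertices = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float64)
faces = np.array([[0, 1, 2], [0, 2, 3]])

cache_path = os.path.join(tempfile.mkdtemp(), "scan_samples.npz")
presample_to_file(vertices, faces, 1000, cache_path)
points, face_idx = load_presampled(cache_path)
print(points.shape)
```

One trade-off to keep in mind: a fixed set of pre-sampled points is reused in every epoch. Storing more points than a training iteration needs and drawing a random subset at load time is one way to keep some variation.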