PaddlePaddle / PaddleGAN

PaddlePaddle GAN library, including lots of interesting applications like First-Order motion transfer, Wav2Lip, picture repair, image editing, photo2cartoon, image style transfer, GPEN, and so on.
Apache License 2.0
7.9k stars 1.24k forks

AssertionError: This model does not have any parameters to train, and does not need to use DataParallel #762

Open woodtosoil opened 1 year ago

woodtosoil commented 1 year ago

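Hello, I am training FOMM from PaddleGAN. Single-GPU training runs fine, but multi-GPU parallel training fails with the error below. Could you please help me find out what is going wrong?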

woodtosoil commented 1 year ago

W0306 09:00:02.521772 1088 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 10.1
W0306 09:00:02.530787 1088 gpu_resources.cc:91] device: 0, cuDNN Version: 7.6.
[03/06 09:00:03] ppgan INFO: Found /home/dujw/.cache/ppgan/vgg19.pdparams
just train kp_detector, fix generator
[03/06 09:00:07] ppgan INFO: Found /home/dujw/.cache/ppgan/vgg19.pdparams
[03/06 09:00:12] ppgan INFO: Found /home/dujw/.cache/ppgan/vox-cpk.pdparams
load pretrained generator...
I0306 09:00:15.229094 1088 nccl_context.cc:83] init nccl context nranks: 4 local rank: 0 gpu id: 0 ring id: 0
Traceback (most recent call last):
  File "tools/main.py", line 56, in <module>
    main(args, cfg)
  File "tools/main.py", line 32, in main
    trainer = Trainer(cfg)
  File "/data2/djw/PaddleGAN/ppgan/engine/trainer.py", line 140, in __init__
    self.distributed_data_parallel()
  File "/data2/djw/PaddleGAN/ppgan/engine/trainer.py", line 196, in distributed_data_parallel
    net, find_unused_parameters=find_unused_parameters)
  File "/data/djw/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/dygraph/parallel.py", line 629, in __init__
    self.init_reducer()
  File "/data/djw/anaconda3/envs/paddle/lib/python3.7/site-packages/paddle/fluid/dygraph/parallel.py", line 653, in init_reducer
    "This model does not have any parameters to train, and " \
AssertionError: This model does not have any parameters to train, and does not need to use DataParallel
LAUNCH INFO 2023-03-06 09:00:16,759 Exit code 1
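One possible reading of the log above, not confirmed anywhere in this thread: the line "just train kp_detector, fix generator" suggests the generator is frozen, and the assertion fires when a network with no trainable parameters is wrapped for data-parallel training. The sketch below illustrates the idea of skipping such networks. The helper names `has_trainable_params` and `nets_to_wrap` are hypothetical, the network names in the demo are illustrative, and `FakeParam` / `FakeNet` are stand-ins so the snippet runs without Paddle. It assumes each parameter exposes a boolean `trainable` attribute and each network has a `parameters()` method, as Paddle layers do.

```python
def has_trainable_params(net):
    """Return True if at least one parameter of `net` is marked trainable."""
    return any(getattr(p, "trainable", False) for p in net.parameters())


def nets_to_wrap(nets):
    """Return the names of sub-networks that still have something to train.

    `nets` is assumed to be a dict mapping a name to a network object.
    """
    return [name for name, net in nets.items() if has_trainable_params(net)]


# Stand-in objects (hypothetical) so the sketch runs without Paddle installed.
class FakeParam:
    def __init__(self, trainable):
        self.trainable = trainable


class FakeNet:
    def __init__(self, flags):
        self._params = [FakeParam(flag) for flag in flags]

    def parameters(self):
        return self._params


# Illustrative setup: the keypoint detector trains, the generator is frozen.
nets = {
    "kp_detector": FakeNet([True, True]),
    "generator": FakeNet([False, False]),
}

print(nets_to_wrap(nets))  # ['kp_detector']
```

With real Paddle networks, only the names returned by `nets_to_wrap` would be passed to `paddle.DataParallel`, and the frozen ones would be left unwrapped. This has not been tested against PaddleGAN's trainer, so treat it as a starting point for debugging and not as a verified fix.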

LokeZhou commented 8 months ago

Has this issue been resolved?