Strange outputs when running dcgan example

When I ran dcgan.py from the examples directory (with batch size autoscaling turned off), the outputs looked very strange and training did not appear to converge.

But when I remove the following two lines:

netD = adl.AdaptiveDataParallel(netD, optimizerD, scheduleD, name="netD")
netG = adl.AdaptiveDataParallel(netG, optimizerG, scheduleG, name="netG")

The results look better.

Could you please help me solve this? Could it be caused by the warning related to zero_grad?

petuum / adaptdl