NVIDIA / vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

Bug in calculating n_gpus causing training to crash #95

Open tharindu-mathew opened 5 years ago

tharindu-mathew commented 5 years ago

I'm running the pose script (./scripts/pose/train_256.sh). Training crashes because n_gpus is calculated incorrectly.

The particular line at fault is:

./models/vid2vid_model_G.py:
self.n_gpus = self.opt.n_gpus_gen // self.opt.batchSize # number of gpus for running generator

When batchSize is greater than n_gpus_gen, this integer division becomes 0, which does not seem like correct behavior. I wonder what the correct fix for this is?

Traceback (most recent call last):
  File "train.py", line 329, in <module>
    train()
  File "train.py", line 36, in train
    modelG, modelD, flowNet = create_model(opt)
  File "/scratch2/mathewc/vid2vid/models/models.py", line 19, in create_model
    modelG.initialize(opt)
  File "/scratch2/mathewc/vid2vid/models/vid2vid_model_G.py", line 61, in initialize
    self.n_frames_per_gpu = min(self.opt.max_frames_per_gpu, self.opt.n_frames_total // self.n_gpus) # number of frames in each GPU
ZeroDivisionError: integer division or modulo by zero
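
For reference, the failing arithmetic can be reproduced outside the repo, and clamping the result to at least 1 avoids the crash. This is only a sketch of a possible workaround, not a confirmed fix from the maintainers, and the option values below are made up:

# Standalone sketch of the failing arithmetic and one possible guard.
# The option values are hypothetical, not taken from train_256.sh.
n_gpus_gen = 2   # value of opt.n_gpus_gen
batchSize = 4    # value of opt.batchSize

n_gpus = n_gpus_gen // batchSize              # current code: 2 // 4 == 0
# n_frames_total // n_gpus then raises ZeroDivisionError, as in the traceback above

n_gpus = max(1, n_gpus_gen // batchSize)      # guarded version never drops below 1
print(n_gpus)                                 # prints 1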

AIprogrammer commented 4 years ago

Hi, I am also confused by the code below:

n_gpus = opt.n_gpus_gen if opt.batchSize == 1 else 1 # number of gpus used for generator for each batch

I don't know why n_gpus should be 1 when batchSize > 1.
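
As a side note, the quoted expression collapses to a single GPU whenever batchSize > 1. One reading is that with batchSize > 1 the batch itself is already split across GPUs, so each sample's generator only gets one GPU, but that is an inference from the code rather than a confirmed explanation. The snippet below just evaluates the expression for a few illustrative option values; n_gpus_for_generator is a wrapper made up for the demo:

# Evaluating the quoted expression for a few illustrative option values.
def n_gpus_for_generator(n_gpus_gen, batchSize):
    return n_gpus_gen if batchSize == 1 else 1

for n_gpus_gen, batchSize in [(4, 1), (4, 2), (8, 4)]:
    print(n_gpus_gen, batchSize, '->', n_gpus_for_generator(n_gpus_gen, batchSize))
# 4 1 -> 4
# 4 2 -> 1
# 8 4 -> 1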