sergeytulyakov / mocogan

MoCoGAN: Decomposing Motion and Content for Video Generation

Problem with batch size and video length #7

Open yifanjiang19 opened 6 years ago

yifanjiang19 commented 6 years ago

Hello! I have some questions. When I set image_batch, video_batch, and video_length all equal to 16, it works well. But when I set them to 48, I get this error:

Traceback (most recent call last):
  File "train.py", line 133, in <module>
    trainer.train(generator, image_discriminator, video_discriminator)
  File "mocogan/src/trainers.py", line 273, in train
    opt_generator)
  File "mocogan/src/trainers.py", line 212, in train_generator
    l_generator += self.category_criterion(fake_categorical.squeeze(), generated_categories)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/loss.py", line 601, in forward
    self.ignore_index, self.reduce)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1140, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1053, in nll_loss
    raise ValueError('Expected 2 or 4 dimensions (got {})'.format(dim))
ValueError: Expected 2 or 4 dimensions (got 3)
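For reference, here is a minimal standalone sketch of the shape mismatch (not repo code; the exact message differs across PyTorch versions): nn.CrossEntropyLoss expects 2-D logits of shape (batch, classes) with 1-D class targets, and fails when the logits carry an extra trailing dimension, which is what the stack trace above shows.

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
targets = torch.randint(0, 102, (32,))   # one class label per clip

# 2-D logits (batch, classes): the shape the loss expects here -- this works.
criterion(torch.randn(32, 102), targets)

# 3-D logits (batch, classes, time): what the discriminator produces for
# longer clips (17 is just an illustrative leftover temporal size). Old
# PyTorch raises "Expected 2 or 4 dimensions (got 3)"; newer versions raise
# a target-size mismatch instead.
criterion(torch.randn(32, 102, 17), targets)
```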

Are the batch size and the video length related? Should I set them to the same value? Thanks~

sergeytulyakov commented 6 years ago

Thanks for reporting, I'll have a look into that

bragilee commented 5 years ago

@yueruchen

Hi, I had a similar problem before. I think we need to check the arguments passed to the data loader (data.py) and also to the model (model.py) if necessary. Some arguments are pre-set manually. Hope this helps you. :)

Cpruce commented 5 years ago

It seems the video length is hardcoded to 16: https://github.com/sergeytulyakov/mocogan/blob/937a99de73a3385fe0c0d1cde65dfa8a2c4177ec/src/train.py#L108
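If that's the cause, a hedged guess at a fix (untested; the flag name is assumed from the options mentioned above, not verified) would be to plumb the CLI value through instead of the literal 16:

```python
# src/train.py -- sketch only; variable and flag names assumed, not verified
video_length = int(args['--video_length'])   # was: video_length = 16
```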

Though changing the video length still results in an error like: `ValueError: Expected target size (32, 17), got torch.Size([32])`

Cpruce commented 5 years ago

When increasing the video length (32 in this case), the difference between the new video length and 16, plus one, shows up as an extra trailing dimension (32 - 16 + 1 = 17); see the arithmetic sketch after the shapes below.

The symptom occurs at https://github.com/sergeytulyakov/mocogan/blob/937a99de73a3385fe0c0d1cde65dfa8a2c4177ec/src/trainers.py#L211

video_length==16 (works): ('fake_categorical', (32, 102)) ('generated_categories', (32,))

video_length==32 (does not work): ('fake_categorical', (32, 102, 17)) ('generated_categories', (32,))
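For what it's worth, the 17 is consistent with plain Conv3d shape arithmetic, assuming the VideoDiscriminator stacks five Conv3d layers that each use temporal kernel 4, stride 1, and no temporal padding (which is what models.py at the linked commit appears to do). Each such layer shrinks the time axis by 3, so 16 frames collapse to exactly 1, while longer clips leave a remainder:

```python
# Temporal output size of one Conv3d: (T + 2*pad - kernel) // stride + 1
def temporal_out(t, kernel=4, stride=1, pad=0):
    return (t + 2 * pad - kernel) // stride + 1

for t0 in (16, 32, 48):
    t = t0
    for _ in range(5):          # five Conv3d layers acting on the time axis
        t = temporal_out(t)
    print(t0, '->', t)          # 16 -> 1, 32 -> 17, 48 -> 33
```

That would also explain the original report with video_length 48: the squeezed discriminator output would be (batch, classes, 33), i.e. 3-D, which is exactly what the ValueError complains about.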

Update:

I've narrowed the issue down to the video_discriminator

[screenshot: debug output showing the video discriminator's output shapes]

The generated_categories seem correct; it's the output shape of the video discriminator that has an extra dimension which should not be there.

Update 2:

I got the architecture to train with video length 32 by increasing the stride in the VideoDiscriminator's last Conv3d layer. However, I'm almost positive this is not the correct approach, and I run out of memory.

[screenshot: modified VideoDiscriminator with an increased stride in the last Conv3d layer]
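For comparison, here's a hypothetical, untested alternative to widening the stride: pool whatever temporal and spatial extent is left down to a fixed size before the final Conv3d, so the head stops assuming exactly 16 input frames. The channel widths below are assumptions based on a default ndf of 64, not verified against the repo:

```python
import torch
import torch.nn as nn

ndf, n_classes = 64, 102                      # assumed defaults, not verified
head = nn.Sequential(
    nn.AdaptiveAvgPool3d((4, 4, 4)),          # any leftover (T, H, W) -> (4, 4, 4)
    nn.Conv3d(ndf * 8, n_classes, 4, 1, 0),   # now always yields a single output step
)

x = torch.randn(32, ndf * 8, 20, 4, 4)        # e.g. features from a 32-frame clip
print(head(x).shape)                          # torch.Size([32, 102, 1, 1, 1])
```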

@sergeytulyakov did you test with any other video lengths? Either way, how do you recommend handling different video lengths?

HeegerGao commented 3 years ago

I've got the same problem. Is there any progress on this issue?