yifanjiang19 opened 6 years ago
Thanks for reporting, I'll have a look into that
@yueruchen
Hi, I had a similar problem before. I think we need to check the arguments passed to the data loader (data.py) and also to the model (model.py) if necessary. Some arguments are pre-set manually. Hope it helps you. :)
It seems the video length is hardcoded to 16: https://github.com/sergeytulyakov/mocogan/blob/937a99de73a3385fe0c0d1cde65dfa8a2c4177ec/src/train.py#L108
Changing the video length still results in an error, though:
ValueError: Expected target size (32, 17), got torch.Size([32])
When increasing the video length (to 32 in this case), the difference between the new video length and 16, plus one, shows up as an extra trailing dimension.
Symptom occurs at https://github.com/sergeytulyakov/mocogan/blob/937a99de73a3385fe0c0d1cde65dfa8a2c4177ec/src/trainers.py#L211
video_length==16 (works): ('fake_categorical', (32, 102)) ('generated_categories', (32,))
video_length==32 (does not work): ('fake_categorical', (32, 102, 17)) ('generated_categories', (32,))
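For reference, the mismatch is easy to reproduce outside the repo with the shapes from those prints (the exception type and wording vary across PyTorch versions):

```python
import torch
import torch.nn.functional as F

# Shapes copied from the prints above (video_length == 32 case).
logits = torch.randn(32, 102, 17)      # (N, C, extra) instead of (N, C)
target = torch.randint(0, 102, (32,))  # (N,)

# With a 3-D input, cross_entropy treats dim 2 as an extra per-step axis
# and expects an (N, 17) target, hence
# "Expected target size (32, 17), got torch.Size([32])".
try:
    F.cross_entropy(logits, target)
except Exception as e:
    print(type(e).__name__, e)
```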
Update:
I've narrowed the issue down to the video_discriminator.
The generated_categories look correct; it's the video discriminator's output that gains an extra dimension that shouldn't be there.
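If I'm reading models.py right, the temporal axis explains it: the four strided Conv3d blocks use kernel 4 with stride (1, 2, 2) and padding (0, 1, 1), i.e. temporal stride 1 and temporal padding 0, and the final Conv3d is kernel 4, stride 1, no padding, so every layer shrinks the temporal axis by 3. Treat those layer params as my reading of the code, but the arithmetic reproduces exactly what we see:

```python
def conv3d_out(depth, kernel=4, stride=1, padding=0):
    # Standard Conv3d size formula, applied to the temporal axis only.
    return (depth + 2 * padding - kernel) // stride + 1

for video_length in (16, 32):
    d = video_length
    for _ in range(4):   # four strided blocks: stride (1, 2, 2) -> temporal stride 1
        d = conv3d_out(d)
    d = conv3d_out(d)    # final Conv3d: kernel 4, stride 1, padding 0
    print(video_length, '->', d)

# 16 -> 1   (the singleton dim disappears after squeeze())
# 32 -> 17  (matches the (32, 102, 17) shape above: (32 - 16) + 1)
```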
Update 2:
I got the architecture to train with video length 32 by increasing the stride in the VideoDiscriminator's last Conv3d layer. However, I'm almost positive this is not the correct approach, and I run out of memory.
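A less invasive workaround might be to collapse the extra temporal axis before the categorical loss instead of touching the strides. Rough sketch (collapse_temporal is a hypothetical helper, not something in the repo, and I've only checked the shapes):

```python
import torch

def collapse_temporal(logits):
    # Average out any extra temporal axis so the category logits are (N, C).
    # Hypothetical helper; mocogan's trainers.py does not contain it.
    logits = logits.squeeze()
    if logits.dim() == 3:          # (N, C, extra) when video_length != 16
        logits = logits.mean(dim=2)
    return logits

# Shape check against the shapes reported above:
print(collapse_temporal(torch.randn(32, 102, 17)).shape)  # torch.Size([32, 102])
print(collapse_temporal(torch.randn(32, 102, 1)).shape)   # torch.Size([32, 102])
```

train_generator would then call self.category_criterion(collapse_temporal(fake_categorical), generated_categories) instead of squeezing inline. Whether mean-pooling the per-step category logits is the right semantics is exactly my question below.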
@sergeytulyakov did you test with any other video lengths? Either way, how do you recommend handling different video lengths?
I have the same problem. Is there any new progress on this issue?
Hello! I have some questions. When I set image_batch, video_batch, and video_length all to 16, it works well. But when I set them all to 48, I get:
Traceback (most recent call last):
  File "train.py", line 133, in <module>
    trainer.train(generator, image_discriminator, video_discriminator)
  File "mocogan/src/trainers.py", line 273, in train
    opt_generator)
  File "mocogan/src/trainers.py", line 212, in train_generator
    l_generator += self.category_criterion(fake_categorical.squeeze(), generated_categories)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/loss.py", line 601, in forward
    self.ignore_index, self.reduce)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1140, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, size_average, ignore_index, reduce)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 1053, in nll_loss
    raise ValueError('Expected 2 or 4 dimensions (got {})'.format(dim))
ValueError: Expected 2 or 4 dimensions (got 3)
Do the batch size and video length have to be related? Should I set them to the same value? Thanks~
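For what it's worth, this looks like the same extra-dimension problem discussed above rather than a batch-size/video-length coupling: by the arithmetic earlier in the thread, video_length = 48 would leave (48 - 16) + 1 = 33 extra temporal steps, and the PyTorch 0.3-era nll_loss in this traceback only accepts 2-D (N, C) or 4-D (N, C, H, W) inputs, so a 3-D input fails outright. A minimal sketch of that failure (the shapes are my assumption from that arithmetic; on newer PyTorch the message differs but the mismatch is the same):

```python
import torch
import torch.nn.functional as F

# Assumed shapes for video_batch = 48, video_length = 48: the categorical
# head would emit (48, 102, 33) by the temporal-size arithmetic above.
logits = torch.randn(48, 102, 33)
target = torch.randint(0, 102, (48,))

try:
    F.cross_entropy(logits, target)
except Exception as e:
    # PyTorch 0.3:    "ValueError: Expected 2 or 4 dimensions (got 3)"
    # newer versions: "Expected target size (48, 33), got torch.Size([48])"
    print(type(e).__name__, e)
```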