HHTseng / video-classification

Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101

Issue in Conv3D/UCF101_3DCNN.py #7

Closed amihanpour closed 5 years ago

amihanpour commented 5 years ago

Thanks for your wonderful code. I am trying to reproduce your work but ran into a problem. I did not change your code except where necessary. Could you please help me fix it? Thank you so much. The problem is shown below:

RuntimeError                              Traceback (most recent call last)
in ()
      2 for epoch in range(epochs):
      3     # train, test model
----> 4     train_losses, train_scores = train(log_interval, cnn3d, device, train_loader, optimizer, epoch)
      5     epoch_test_loss, epoch_test_score = validation(cnn3d, device, optimizer, valid_loader)
      6

in train(log_interval, model, device, train_loader, optimizer, epoch)
     13
     14         optimizer.zero_grad()
---> 15         output = model(X)  # output size = (batch, number of classes)
     16
     17         loss = F.cross_entropy(output, y)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

~\Documents\Python Scripts\video-classification-master\Conv3D\functions.py in forward(self, x_3d)
    202         # FC 1 and 2
    203         x = x.view(x.size(0), -1)
--> 204         x = F.relu(self.fc1(x))
    205         x = F.relu(self.fc2(x))
    206         x = F.dropout(x, p=self.drop_p, training=self.training)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    487             result = self._slow_forward(*input, **kwargs)
    488         else:
--> 489             result = self.forward(*input, **kwargs)
    490         for hook in self._forward_hooks.values():
    491             hook_result = hook(self, input, result)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
     65     @weak_script_method
     66     def forward(self, input):
---> 67         return F.linear(input, self.weight, self.bias)
     68
     69     def extra_repr(self):

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\functional.py in linear(input, weight, bias)
   1350     if input.dim() == 2 and bias is not None:
   1351         # fused op is marginally faster
-> 1352         ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t())
   1353     else:
   1354         output = input.matmul(weight.t())

RuntimeError: size mismatch, m1: [1 x 47040], m2: [1249920 x 256] at c:\a\w\1\s\tmp_conda_3.6_091443\conda\conda-bld\pytorch_1544087948354\work\aten\src\thc\generic/THCTensorMathBlas.cu:266
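
Note for readers who hit the same size mismatch: the two numbers in the message are the flattened feature count the network actually produced (47040) and the input size fc1 was built for (1249920). The sketch below shows how such a count can be derived with the standard Conv3d output-size formula. The architecture values (two Conv3d layers with kernels 5 and 3, stride 2, no padding, 48 output channels) and the input sizes are assumptions that are not stated in this thread; they were chosen because they reproduce the numbers in the traceback. The 64 x 64 input is only one hypothetical configuration that yields 47040.

```python
def conv3d_out_shape(size, kernel, stride, padding=(0, 0, 0)):
    """Output (T, H, W) of one Conv3d layer with dilation 1, using
    floor((n + 2*p - (k - 1) - 1) / s) + 1 per dimension."""
    return tuple(
        (n + 2 * p - (k - 1) - 1) // s + 1
        for n, k, s, p in zip(size, kernel, stride, padding)
    )


def flatten_size(t, h, w, out_channels=48):
    # ASSUMED architecture (not stated in the thread): two Conv3d layers,
    # kernels 5 and 3, stride 2, no padding, 48 channels after the second.
    shape = conv3d_out_shape((t, h, w), (5, 5, 5), (2, 2, 2))
    shape = conv3d_out_shape(shape, (3, 3, 3), (2, 2, 2))
    return out_channels * shape[0] * shape[1] * shape[2]


# ASSUMED default input: 28 frames of 256 x 342 pixels.
expected = flatten_size(28, 256, 342)
# HYPOTHETICAL smaller input: 28 frames of 64 x 64 pixels.
actual = flatten_size(28, 64, 64)
print(expected, actual)
```

If the data pipeline and the model constructor disagree on frame count or image size, the two values differ and fc1 raises this error, which is consistent with the fix reported in the follow-up comment (an accidental code change).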

HHTseng commented 5 years ago

Sure. May I ask what changes you made, so that I can try to reproduce the error? And which version of PyTorch are you using? Thanks!

amihanpour commented 5 years ago

That error is resolved; I had mistakenly changed part of the code. But I found another one. I am using PyTorch 1.0.0 on Windows 10.

To make the code run faster, I made the following changes:

k = 2  # number of target category
epochs = 1
batch_size = 3
action_names = ['ApplyEyeMakeup', 'BodyWeightSquats']
transform = transforms.Compose([transforms.Resize([img_x, img_y]),
                                transforms.ToTensor(),
                                transforms.Normalize(mean=[0.485], std=[0.229])])

But when I run the code, the following error appears:

RuntimeError                              Traceback (most recent call last)
in ()
      3
      4 cnn3d = CNN3D(t_dim=len(selected_frames), img_x=img_x, img_y=img_y,
----> 5               drop_p=dropout, fc_hidden1=fc_hidden1, fc_hidden2=fc_hidden2, num_classes=k).to(device)
      6
      7

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in to(self, *args, **kwargs)
    379             return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
    380
--> 381         return self._apply(convert)
    382
    383     def register_backward_hook(self, hook):

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _apply(self, fn)
    185     def _apply(self, fn):
    186         for module in self.children():
--> 187             module._apply(fn)
    188
    189         for param in self._parameters.values():

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in _apply(self, fn)
    191                 # Tensors stored in modules are graph leaves, and we don't
    192                 # want to create copy nodes, so we have to unpack the data.
--> 193                 param.data = fn(param.data)
    194                 if param._grad is not None:
    195                     param._grad.data = fn(param._grad.data)

C:\ProgramData\Anaconda3\lib\site-packages\torch\nn\modules\module.py in convert(t)
    377
    378         def convert(t):
--> 379             return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
    380
    381         return self._apply(convert)

RuntimeError: CUDA out of memory. Tried to allocate 1.19 GiB (GPU 0; 4.00 GiB total capacity; 2.40 GiB already allocated; 552.35 MiB free; 429.50 KiB cached)

HHTseng commented 5 years ago

I have two concerns. (1) I see batch_size=3. I have noticed that when the batch size is very small and the model contains a batch normalization layer, training often goes wrong (possibly a PyTorch issue). My suggestion: temporarily remove the batch normalization layers if you want to keep a small batch size. (2) From my observation, Conv3D is more demanding on GPU resources than CRNN/ResNetCRNN. And from your last error message:

RuntimeError: CUDA out of memory. Tried to allocate 1.19 GiB (GPU 0; 4.00 GiB total capacity; 2.40 GiB already allocated; 552.35 MiB free; 429.50 KiB cached)

it seems that your GPU memory is not sufficient for this configuration. You may want to reduce the number of model parameters or the input size, e.g. change the following line to use fewer frames:

begin_frame, end_frame, skip_frame = 1, 29, 1

Please let me know whether you are able to resolve the error. Thanks!
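
As a rough check of where the 1.19 GiB request comes from, the sketch below computes the size of a single fully connected weight matrix. It assumes that fc1 still had the shape [1249920 x 256] reported in the first traceback when the second error occurred, and that weights are stored as 32-bit floats; neither is confirmed in this thread. The helper name `linear_weight_bytes` is made up for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def linear_weight_bytes(in_features, out_features, bytes_per_value=4):
    """Memory needed for the weight matrix of one fully connected layer."""
    return in_features * out_features * bytes_per_value


# Shape taken from the first traceback (m2: [1249920 x 256]); float32 assumed.
fc1_bytes = linear_weight_bytes(1249920, 256)
print(f"fc1 weights: {fc1_bytes / GIB:.2f} GiB")

# Free memory reported in the out-of-memory message.
free_bytes = 552.35 * MIB
print("fits in free memory:", fc1_bytes <= free_bytes)
```

Under these assumptions the fc1 weights alone match the failed allocation, so shrinking the flattened input (fewer frames or smaller images) or fc_hidden1 attacks the largest single tensor.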

amihanpour commented 5 years ago

I made the following changes, and the 'CUDA out of memory' error was fixed:

fc_hidden1, fc_hidden2 = 128, 128
k = 101
batch_size = 30
begin_frame, end_frame, skip_frame = 1, 10, 1

Thank you so much.
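
To see why these settings help, the sketch below estimates the fc1 weight size before and after the change. It is only an estimate under assumptions not stated in this thread: frames are selected as range(begin_frame, end_frame, skip_frame), images are 256 x 342, the network has two unpadded stride-2 Conv3d layers with kernels 5 and 3 and 48 output channels, and weights are float32. It counts fc1 weights only, not activations, so the effect of batch_size is not covered. The function names are hypothetical.

```python
def conv_out(n, kernel, stride):
    """Output length of an unpadded convolution along one dimension."""
    return (n - kernel) // stride + 1


def fc1_weight_mib(begin_frame, end_frame, skip_frame, fc_hidden1,
                   img_x=256, img_y=342, channels=48):
    # ASSUMED frame selection: range(begin_frame, end_frame, skip_frame).
    frames = len(range(begin_frame, end_frame, skip_frame))
    dims = [frames, img_x, img_y]
    # ASSUMED layers: (kernel, stride) = (5, 2) followed by (3, 2).
    for kernel, stride in [(5, 2), (3, 2)]:
        dims = [conv_out(n, kernel, stride) for n in dims]
    in_features = channels * dims[0] * dims[1] * dims[2]
    return in_features * fc_hidden1 * 4 / 1024 ** 2  # float32 weights, in MiB


before = fc1_weight_mib(1, 29, 1, fc_hidden1=256)
after = fc1_weight_mib(1, 10, 1, fc_hidden1=128)
print(f"before: {before:.0f} MiB, after: {after:.0f} MiB")
```

With these assumptions, using fewer frames shrinks the temporal dimension after the convolutions and halving fc_hidden1 halves the matrix again, so the fc1 weights drop to about a tenth of their original size.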

amihanpour commented 5 years ago

Do you have an article on video classification that is based on this code?

HHTseng commented 5 years ago

Good to hear the problem is solved! Actually, I don't have a paper of my own for this code; I wrote it out of personal interest. But this article should be close, although I have not checked it in detail.