Tushar-N / pytorch-resnet3d

I3D Nonlocal ResNets in Pytorch

Is it possible to test the network with one video? #5

Closed: hasirk closed this issue 5 years ago

Tushar-N commented 5 years ago

Yes, you can call the three functions from the __getitem__() method of the dataset class, which turn a list of files containing the video's frames into a tensor for the network. They are:

# frame_list is the list of the video's frame files: ['frame1.jpg', 'frame2.jpg', ...]
frames = self.sample(frame_list)      # sample a fixed-length clip of frames from the list
frames = self.clip_transform(frames)  # (T, 3, 224, 224)
frames = frames.permute(1, 0, 2, 3)   # (3, T, 224, 224)

Then just pass this to the network: forward_single(self, x) in resnet.py expects a tensor of this shape.
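For a single video outside the dataset class, an end-to-end version would look roughly like this. It's an untested sketch: the i3_res50(num_classes) constructor and the frames/ directory are assumptions for illustration, and plain torchvision transforms with ImageNet statistics stand in for the dataset's own sample() / clip_transform():

import glob

import torch
from PIL import Image
from torchvision import transforms

from models.resnet import i3_res50   # assumed constructor name; adjust to how you build the net

# Stand-ins for self.sample() and self.clip_transform() from kinetics.py:
# uniformly sample 32 frames, then resize, center-crop, and normalize each one.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

frame_list = sorted(glob.glob('frames/*.jpg'))   # hypothetical directory of extracted frames
idx = torch.linspace(0, len(frame_list) - 1, steps=32).long()
frames = [transform(Image.open(frame_list[int(i)]).convert('RGB')) for i in idx]

frames = torch.stack(frames)           # (T, 3, 224, 224)
frames = frames.permute(1, 0, 2, 3)    # (3, T, 224, 224)
frames = frames.unsqueeze(0)           # (1, 3, T, 224, 224): conv1 is an nn.Conv3d, so a batch dim is needed

net = i3_res50(num_classes=400)
net.eval()
with torch.no_grad():
    logits = net.forward_single(frames)   # (1, num_classes), assuming forward_single returns class logits

Note the unsqueeze(0): nn.Conv3d operates on 5-dimensional (N, C, T, H, W) tensors, so the clip needs a batch dimension before it reaches conv1.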

hasirk commented 5 years ago

@Tushar-N Thank you for helping out.

When I pass the input as you suggested, I get the error below.

File "eval.py", line 98, in test() File "eval.py", line 30, in test testFrames = net.forward_single(frames) File "/home/hk/Documents/pytorch-resnet3d/models/resnet.py", line 173, in forward_single x = self.conv1(x) File "/home/hk/anaconda3/envs/pytorch-build/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in call result = self.forward(*input, **kwargs) File "/home/hk/anaconda3/envs/pytorch-build/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 476, in forward self.padding, self.dilation, self.groups) RuntimeError: Expected 5-dimensional input for 5-dimensional weight 64 3 5 7, but got 4-dimensional input of size [3, 32, 256, 256] instead

However, following your suggestion, I made the changes below so that frame_list could be read. In eval.py I removed the root:

testset = kinetics.Kinetics(root='', split='val', clip_len=32)

In kinetics.py I removed / commented out the loader:

self.loader = lambda fl: Image.open('%s%s'%(self.root, fl)).convert('RGB')
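In case it helps anyone who lands here: the RuntimeError above says conv1 (an nn.Conv3d) received a 4-dimensional (3, 32, 256, 256) tensor, while it expects a 5-dimensional (N, C, T, H, W) input, so the clip is missing its batch dimension. A minimal sketch of the likely fix, using the variable names from eval.py above:

frames = frames.unsqueeze(0)              # (3, T, H, W) -> (1, 3, T, H, W): add the batch dimension
testFrames = net.forward_single(frames)

The 256x256 spatial size in the error also suggests clip_transform's 224 center crop wasn't applied, which is worth checking as well.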