SydCaption / SAAT

MIT License

How to use the 2d and 3d features extracted by ./misc code in SAAT? #7

Closed AcodeC closed 3 years ago

AcodeC commented 4 years ago

I notice that in your paper you use the original features provided by CST, with the C3D features stored in .h5 format and mean pooling applied to get a video-level feature. But the code in your misc folder saves features in .npy format. So how can the 2D and 3D features extracted by the ./misc code be used in SAAT? I wonder whether much code needs to be changed to use the new features, and how to use them.

SydCaption commented 4 years ago

That's right. I first save the features in .npy files and then generate the .h5 file. You can also save the features directly into an .h5 file, or, based on the code in ./misc, do something like:

import os
import h5py
import numpy as np

f = h5py.File('msvd_train_c3d_mp1.h5', 'w')
for i in idx:
    # load the per-frame .npy features saved by ./misc for this video
    feat = np.load(os.path.join(feat_path, dir_name, vid_name + '.npy'))
    # mean-pool over frames to get one video-level feature per video id
    f.create_dataset(str(i), data=feat.mean(0), dtype='f8')
f.close()
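A minimal round-trip sketch of the same idea, to confirm the layout SAAT expects (one dataset per video id, holding a mean-pooled feature). The file name `demo_c3d_mp1.h5`, the key `'0'`, and the 26×2048 feature shape are illustrative assumptions, not part of the SAAT code:

```python
import h5py
import numpy as np

# e.g. 26 frames of 2048-d C3D features for one video (random stand-in data)
frame_feats = np.random.rand(26, 2048)

# write the mean-pooled video-level feature under the video's index as key
with h5py.File('demo_c3d_mp1.h5', 'w') as f:
    f.create_dataset('0', data=frame_feats.mean(0), dtype='f8')

# read it back the way a data loader would
with h5py.File('demo_c3d_mp1.h5', 'r') as f:
    feat = f['0'][...]

print(feat.shape)  # (2048,)
```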
AcodeC commented 4 years ago

> That's right. I first save the features in .npy files and then generate the .h5 file. You can also save the features directly into an .h5 file, or, based on the code in ./misc, do something like:
>
> f = h5py.File('msvd_train_c3d_mp1.h5', 'w')
> for i in idx:
>     feat = np.load(os.path.join(feat_path, dir_name, vid_name + '.npy'))
>     f.create_dataset(str(i), data=feat.mean(0), dtype='f8')
> f.close()

Traceback (most recent call last):
  File "extract_feats_motion.py", line 90, in <module>
    extract_feats(opt.file_path, opt.dataset_name, model, namelist[opt.start_idx:opt.end_idx], opt.frame_per_video, opt.batch_size, save_path)
  File "extract_feats_motion.py", line 49, in extract_feats
    out = net(curr_batch.transpose(1,2).cuda())
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/acodec/media/Part Two/SAAT-master/3D-ResNets-PyTorch/models/resnext.py", line 159, in forward
    x = self.conv1(x)
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/acodec/anaconda2/envs/python3_pytorch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 480, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (6 x 118 x 118). Kernel size: (7 x 7 x 7). Kernel size can't be greater than actual input size


I met an error, but I do not know which part of the code is wrong: RuntimeError: Calculated padded input size per channel: (6 x 118 x 118). Kernel size: (7 x 7 x 7). Kernel size can't be greater than actual input size.
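The error says the temporal dimension of the clip reaching conv1 is 6, which is smaller than the 7x7x7 kernel, i.e. too few frames were sampled for that video. A minimal sketch of one common workaround, padding short clips by repeating the last frame before feeding them to the 3D CNN (the `pad_clip` helper and the `min_t=16` default are my assumptions, not the repo's fix):

```python
import numpy as np

def pad_clip(clip, min_t=16):
    """Repeat the last frame until the clip has at least `min_t` frames.

    clip: array of shape (C, T, H, W). The 7x7x7 conv1 kernel in
    3D-ResNeXt needs T >= 7, so shorter clips trigger the RuntimeError
    above; padding to the usual 16-frame clip length avoids it.
    """
    c, t, h, w = clip.shape
    if t < min_t:
        # duplicate the last frame (min_t - t) times along the time axis
        pad = np.repeat(clip[:, -1:], min_t - t, axis=1)
        clip = np.concatenate([clip, pad], axis=1)
    return clip
```

Another option is to check how many frames were actually extracted for the failing video; videos shorter than the sampling stride can end up with only a handful of frames.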

BBBoundary commented 4 years ago

Hello, we also met this problem when extracting 3D features: RuntimeError: Calculated padded input size per channel: (6 x 118 x 118). Kernel size: (7 x 7 x 7). Kernel size can't be greater than actual input size. Have you solved this problem?

ydwl-lynn commented 3 years ago

> I notice that in your paper you use the original features provided by CST, with the C3D features stored in .h5 format and mean pooling applied to get a video-level feature. But the code in your misc folder saves features in .npy format. So how can the 2D and 3D features extracted by the ./misc code be used in SAAT? I wonder whether much code needs to be changed to use the new features, and how to use them.

I also want to try extracting the features myself. Have you solved this problem?