SydCaption / SAAT

MIT License

Is it right to use the feature extraction code for 3D motion (3D CNN)? #8

Closed: dcahn12 closed this issue 4 years ago

dcahn12 commented 4 years ago

Hi, I have a question about extracting 3D motion features for a custom dataset. With the default settings in the code (/misc/extract_feats_motion.py), the output feature should have dimension 400 (n_classes). However, when I checked the 3D motion features you linked, the dimension was 4096 for msrvtt and 2048 for yt2t, which is quite different. Could you explain this in more detail?

SydCaption commented 4 years ago

The msrvtt features are available pre-extracted, without the extraction code or pretrained model, so we use them directly for training and testing. For yt2t, we extracted c3d features with the pretrained models from 3D-ResNets-PyTorch, taking the output of the layer before the final fc layer. You can use other c3d features as well, e.g. I3D; we didn't try different c3d features to choose the best one. @dcahn12
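
To illustrate the difference between the 400-dimensional class scores and features taken from the layer before the final fc layer, here is a minimal PyTorch sketch. It is not the code from this repository or from 3D-ResNets-PyTorch: `ToyVideoNet` is a hypothetical stand-in for a 3D ResNet, and the sketch assumes the classifier is exposed as an attribute named `fc`, which should be checked against the model actually used. The feature size of 2048 is chosen only to match the yt2t dimension mentioned above.

```python
import torch
import torch.nn as nn


class ToyVideoNet(nn.Module):
    """Hypothetical stand-in for a 3D ResNet: conv trunk, global pooling, classifier."""

    def __init__(self, feat_dim=2048, n_classes=400):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv3d(3, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.avgpool = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(feat_dim, n_classes)  # final classification layer

    def forward(self, x):
        x = self.avgpool(self.trunk(x)).flatten(1)  # (batch, feat_dim)
        return self.fc(x)


model = ToyVideoNet().eval()
clip = torch.randn(1, 3, 8, 32, 32)  # (batch, channels, frames, height, width)

with torch.no_grad():
    logits = model(clip)      # class scores, one per class (n_classes)
    model.fc = nn.Identity()  # bypass the classifier, keep the pooled features
    features = model(clip)    # output of the layer before the final fc layer

print(logits.shape, features.shape)
```

With the classifier in place the output has one entry per class, which is why the default script yields 400 values; once `fc` is replaced by `nn.Identity()`, the same forward pass returns the pooled feature vector instead.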

dcahn12 commented 4 years ago

Thanks a lot!