pre-extracted video embedding from joint space

antoine77340 / S3D_HowTo100M

S3D Text-Video model trained on HowTo100M using MIL-NCE

Apache License 2.0


pre-extracted video embedding from joint space #8

Closed YueYANG1996 closed 2 years ago

YueYANG1996 commented 3 years ago

Hi Antoine,

I am impressed by your excellent work, which is very helpful to my research!

I would like to know whether you have extracted the joint-space features (512-d) for all clips in HowTo100M, and if so, whether I can download them directly.

I have already downloaded the S3D features for all clips. Is there any way to convert these feature vectors to the joint space?

Thank you very much!

Yue

antoine77340 commented 3 years ago

Hi,

Do you have the 1024-dimensional features? If yes, that would be easy: converting them to the 512-dimensional vector space only requires the last fully connected layer (1024 -> 512). You can just apply the self.fc module (https://github.com/antoine77340/S3D_HowTo100M/blob/master/s3dg.py#L294), which will do the job.

YueYANG1996 commented 3 years ago

No, I don't. Where can I get the 1024 features? Thanks!

antoine77340 commented 3 years ago

I meant the 1024-dimensional representation. Is that what you were referring to when you wrote: "I have already downloaded the S3D features for all clips, is there any way to convert these feature vectors to the joint space?"

Actually, our HowTo100M server is down, so the link to the features is broken for the moment.

YueYANG1996 commented 3 years ago

Okay, got it. I downloaded the S3D features last year, and I just checked the shape of each file, which is (seconds, 1024). So I just need to average-pool this 2D array over the time axis and then apply self.fc to get the joint-space vector. Is that correct?

antoine77340 commented 3 years ago

Yes that's correct!
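
For readers following along, the recipe confirmed above (average-pool the (seconds, 1024) features over time, then apply the final 1024 -> 512 linear layer) might look roughly like the sketch below. This is only an illustration under stated assumptions: the helper name `to_joint_space` is hypothetical, the layer here is a randomly initialized stand-in for the model's `self.fc`, and the checkpoint file name and key names in the commented-out lines are assumptions that should be checked against the actual weights file.

```python
import torch
import torch.nn as nn


def to_joint_space(clip_features: torch.Tensor, fc: nn.Linear) -> torch.Tensor:
    """Average-pool (seconds, 1024) features over time, then project with fc.

    `to_joint_space` is a hypothetical helper name, not part of the repo.
    """
    if clip_features.dim() != 2 or clip_features.shape[1] != fc.in_features:
        raise ValueError(
            f"expected shape (seconds, {fc.in_features}), "
            f"got {tuple(clip_features.shape)}"
        )
    pooled = clip_features.mean(dim=0)  # (1024,): average over the time axis
    with torch.no_grad():
        return fc(pooled)  # (512,): joint-space embedding


# Stand-in for the model's final layer. Random weights are used here only so
# the sketch runs on its own; real embeddings need the trained weights.
fc = nn.Linear(1024, 512)

# Assumption: the released checkpoint stores this layer under "fc.weight" and
# "fc.bias"; verify the file name and keys before relying on this.
# state = torch.load("s3d_howto100m.pth", map_location="cpu")
# fc.load_state_dict({"weight": state["fc.weight"], "bias": state["fc.bias"]})

# Dummy input standing in for one downloaded feature file of 37 seconds.
features = torch.randn(37, 1024)
embedding = to_joint_space(features, fc)
print(tuple(embedding.shape))
```

Because the layer is affine, pooling before the projection gives the same result as projecting every second and then averaging, so the order is mostly a matter of cost.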

YueYANG1996 commented 3 years ago

Thanks, that's very helpful!

YueYANG1996 commented 3 years ago

Hi Antoine,

Thank you for your patience!

How can I compute the S3D features from raw videos (the same as the ones I downloaded from your website, with shape (seconds, 1024))? Is this repo the correct set of scripts (2D model) for extracting the features?

Yue

YueYANG1996 commented 3 years ago

Figured out how to extract the features (using mixed_5c).

One technical question: when reading an entire video, the frame rate should be 10 FPS, right?

Thanks!

Yue