UARK-AICV / VLCAP

[ICIP 2022] VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
https://ieeexplore.ieee.org/document/9897766

clip image feature #12

Closed: gdg452 closed this issue 1 year ago

gdg452 commented 1 year ago

Thank you so much for the fantastic code! Could you please share the center-frame features of each snippet, as extracted by the CLIP image encoder, if it is convenient for you?

Kashu7100 commented 1 year ago

@gdg452 We have published the code for our new work: https://github.com/UARK-AICV/VLTinT The repo has more details about how to extract features. I hope this is helpful to you.
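
For readers who want to try this themselves, here is a minimal sketch of the frame-selection step the question describes: picking the center frame of each snippet and passing it to an image encoder. It is not the authors' pipeline (see the VLTinT repo for that). The function names, the snippet length of 16 frames in the usage example, and the `encode_image` callable are all assumptions made for illustration; in practice `encode_image` would wrap the CLIP image encoder.

```python
from typing import Any, Callable, List, Sequence


def center_frame_indices(num_frames: int, snippet_len: int) -> List[int]:
    """Return the index of the middle frame of each consecutive snippet.

    The video is split into non-overlapping snippets of `snippet_len`
    frames (an assumed layout); the last snippet may be shorter.
    """
    if num_frames < 0 or snippet_len <= 0:
        raise ValueError("num_frames must be >= 0 and snippet_len must be > 0")
    indices = []
    for start in range(0, num_frames, snippet_len):
        end = min(start + snippet_len, num_frames)  # clip the final snippet
        indices.append(start + (end - start) // 2)
    return indices


def extract_center_features(
    frames: Sequence[Any],
    snippet_len: int,
    encode_image: Callable[[Any], Any],
) -> List[Any]:
    """Encode only the center frame of each snippet.

    `encode_image` is a hypothetical callable standing in for an image
    encoder such as CLIP's; it receives one frame and returns one feature.
    """
    return [
        encode_image(frames[i])
        for i in center_frame_indices(len(frames), snippet_len)
    ]


if __name__ == "__main__":
    # 40 frames with an assumed snippet length of 16 frames.
    print(center_frame_indices(40, 16))
```

Passing the encoder in as a callable keeps the selection logic independent of any particular CLIP implementation, so it can be checked without downloading model weights.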