inference - Githubissues

simon-ging / coot-videotext

COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning

Apache License 2.0

288 stars, 55 forks

Open LilyTheBear opened 1 year ago

LilyTheBear commented 1 year ago

Hi, do you have sample inference code to load the model, pre-process video and text, and get the similarity score?

Thanks!

simon-ging commented 1 year ago

Hi, no, sorry. Please use the commands and instructions from the README. Pull requests for such code are welcome.
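
Until such a script exists, the last step of the question (the similarity score) can be sketched independently of the repository. The following is a minimal sketch, assuming you have already obtained one video embedding and several text embeddings of equal dimension by some other means. It does not load the COOT model or pre-process any data, and `cosine_similarity` and `rank_texts` are hypothetical helper names, not functions from coot-videotext. Cosine similarity is assumed here as the scoring function.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def rank_texts(
    video_emb: Sequence[float], text_embs: Sequence[Sequence[float]]
) -> Tuple[List[int], List[float]]:
    """Score every text embedding against one video embedding.

    Returns the text indices sorted from best to worst match,
    together with the raw scores in the original order.
    """
    scores = [cosine_similarity(video_emb, t) for t in text_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Placeholder vectors for illustration only; real embeddings would come
# from the trained model, which this sketch does not cover.
video_emb = [1.0, 0.0]
text_embs = [[0.0, 1.0], [2.0, 0.0]]
order, scores = rank_texts(video_emb, text_embs)
print(order, scores)
```

Loading the checkpoint and extracting features still has to follow the README instructions mentioned above.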