sudo-Boris / mr-Blip

Official Implementation of "The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval"
BSD 3-Clause "New" or "Revised" License
33 stars, 0 forks

inference #4

Closed by 1028267022 1 month ago

1028267022 commented 1 month ago

Can I run inference on other videos and text queries?

sudo-Boris commented 1 month ago

Yes, that is possible, but you would need to format your data correctly. A simple inference feature that takes just a video and a query has not been implemented yet. Also be aware of which model you use, since there might be a domain gap between its training data and the sample you want to run inference on.
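
As a rough illustration of what "formatting the data" could involve, the sketch below builds a small JSON annotation file for a custom video and query. The field names (`qid`, `video`, `query`, `duration`) and the helper `make_entry` are hypothetical and are not confirmed by this thread; check the annotation files shipped with the repository's datasets for the actual schema before using anything like this.

```python
import json


def make_entry(video, query, duration, qid=0):
    """Build one annotation record. All field names here are hypothetical."""
    if duration <= 0:
        raise ValueError("duration must be positive (seconds)")
    if not query.strip():
        raise ValueError("query must not be empty")
    return {
        "qid": qid,                   # assumed: unique id for the query
        "video": video,               # assumed: video file name or id
        "query": query.strip(),       # natural-language moment description
        "duration": float(duration),  # assumed: video length in seconds
    }


# Example: one custom video paired with one text query.
entries = [make_entry("my_clip.mp4", "a person opens the door", 42.5)]
serialized = json.dumps(entries, indent=2)
```

The resulting `serialized` string could be written to a file and pointed to from a dataset config, assuming the model you pick was trained on a similar domain.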

1028267022 commented 3 weeks ago

Where can I find the feature extraction code?