boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
https://boheumd.github.io/MA-LMM/
MIT License

Use demo with finetuned checkpoint #10

Closed mvsoom closed 3 months ago

mvsoom commented 5 months ago

Is it possible to use a finetuned checkpoint in the demo, specifically for ActivityNet-QA?

Right now, loading finetuned models is only wired up in the train and eval scripts, which makes it hard to experiment with. I'm impressed by the zero-shot capabilities exhibited in the demo and would like to interact with a finetuned model to see how well it understands the time position embeddings (questions like: what happened in the last N frames or T seconds?).

Can I adapt the Blip2VicunaInstruct_MALMM class to accept one of the finetuned checkpoints in saved_models.tar from the README?

boheumd commented 5 months ago

Hello, I have updated the demo.ipynb. The model loads the default config from lavis/configs/models/blip2/blip2_instruct_vicuna7b.yaml. If you want to load a finetuned checkpoint, first set load_finetuned=True and specify the finetuned checkpoint path in that yaml config, then reload the model.
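
For reference, a minimal sketch of what that reload could look like, assuming the stock LAVIS `load_model_and_preprocess` entry point used for BLIP-2 models; the registered model name for the MA-LMM variant and the checkpoint path below are assumptions, not the repo's exact values:

```python
# Minimal sketch, assuming the standard LAVIS loader; the registered model
# name and the checkpoint path are placeholders, not confirmed by the repo.
import torch
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# First edit lavis/configs/models/blip2/blip2_instruct_vicuna7b.yaml so the
# model section contains:
#   load_finetuned: True
#   finetuned: "/path/to/saved_models/activitynet_qa.pth"  # hypothetical path
#
# Then reload the model; with load_finetuned set, the finetuned weights are
# loaded on top of (or instead of) the default pretrained ones.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct",  # assumed registry name for the MA-LMM model
    model_type="vicuna7b",
    is_eval=True,
    device=device,
)
```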