Hi! We have extended MiniGPT-4 to video question answering in our project Ask-Anything. Without additional instruction fine-tuning, the current results are not satisfactory.
In another attempt, we simply encode the video as captions and feed them to ChatGPT, which gives better results.
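To illustrate the caption-based approach, here is a rough sketch of how per-frame captions might be flattened into a text prompt. It is not the Ask-Anything code: `FrameCaption`, `sample_times`, and `build_prompt` are hypothetical names, the prompt wording is an assumption, and the captioning model and the ChatGPT call are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameCaption:
    """Caption for one sampled frame (hypothetical structure, for illustration)."""
    time_s: float
    text: str


def sample_times(duration_s: float, num_frames: int) -> List[float]:
    """Return evenly spaced timestamps, one at the center of each segment."""
    if num_frames <= 0 or duration_s <= 0:
        return []
    step = duration_s / num_frames
    return [round(step * (i + 0.5), 2) for i in range(num_frames)]


def build_prompt(captions: List[FrameCaption], question: str) -> str:
    """Flatten timestamped captions into a single text prompt for a chat model."""
    ordered = sorted(captions, key=lambda c: c.time_s)
    lines = [f"[{c.time_s:.1f}s] {c.text}" for c in ordered]
    # The instruction wording below is an assumption, not the project's prompt.
    return (
        "The following are captions of frames sampled from a video.\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}\nAnswer using only the captions above."
    )


if __name__ == "__main__":
    # Captions would normally come from an image captioning model.
    captions = [
        FrameCaption(5.0, "a dog catches a ball"),
        FrameCaption(1.0, "a person throws a ball"),
    ]
    print(sample_times(10, 4))
    print(build_prompt(captions, "What happens in the video?"))
```

The resulting string can then be sent as the user message to any text-only chat model, so the language model never sees the frames themselves.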
We are now trying to build a true video chatbot using the techniques behind MiniGPT-4 and LLaVA. We hope everyone will try our demo and report any problems they find; we will do our best to fix them in our future chatbot.