showlab / VLog

Transform Video as a Document with ChatGPT, CLIP, BLIP2, GRIT, Whisper, LangChain.
MIT License
528 stars, 26 forks

The events where crime is occurring are not described by the model #11

Open aliayub40995 opened 1 month ago

aliayub40995 commented 1 month ago

In the buy_watermelon video, the most important event was the customer stabbing the seller, but the model did not describe it. I have tried some other crime videos as well, and the critical events were not highlighted in those either. Please suggest what can be done in such cases.