YueFan1014 / VideoAgent

This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)
Apache License 2.0

getting this error: #2

Closed: monjha closed this issue 1 week ago

monjha commented 2 months ago

I get this error when running the default videos: visual_question_answering(("how many boats are there in the video?", 0)) is not a valid tool, try one of [caption_retrieval, segment_localization, visual_question_answering, object_memory_querying].

YueFan1014 commented 2 months ago

Hi, it seems that the LLM is not calling the tool in the correct format. Please try another LLM with stronger instruction-following ability (such as GPT-4-turbo). You can also rerun the question several times, since the output format can vary between runs.
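
To illustrate the format problem: in the error above, the whole string visual_question_answering(("how many boats are there in the video?", 0)) was taken as the tool name, so it matched none of the four listed tools. Below is a minimal sketch of how such an output could be split back into a tool name and an input, assuming the agent expects these as two separate fields. The helper split_tool_call and the VALID_TOOLS list are hypothetical and not part of the VideoAgent code; only the tool names are copied from the error message.

```python
import re

# Tool names copied from the error message in this issue.
VALID_TOOLS = [
    "caption_retrieval",
    "segment_localization",
    "visual_question_answering",
    "object_memory_querying",
]


def split_tool_call(action):
    """Hypothetical helper: split 'tool(args)' into (tool, args).

    Returns (action, None) unchanged when the text is already a bare
    tool name or does not look like a call to one of VALID_TOOLS.
    """
    action = action.strip()
    if action in VALID_TOOLS:
        return action, None
    # Match "<name>(<anything>)" spanning the whole string.
    match = re.fullmatch(r"(\w+)\((.*)\)", action, flags=re.DOTALL)
    if match and match.group(1) in VALID_TOOLS:
        return match.group(1), match.group(2)
    return action, None


# The malformed action reported in this issue.
bad_action = 'visual_question_answering(("how many boats are there in the video?", 0))'
tool, tool_input = split_tool_call(bad_action)
print(tool)
print(tool_input)
```

A post-processing step like this only works around the symptom. Using an LLM that follows the expected output format, as suggested above, avoids the malformed call in the first place.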