mlpc-ucsd / BLIVA

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
https://arxiv.org/abs/2308.09936
BSD 3-Clause "New" or "Revised" License

Evaluation code doesn't exist #24

Open mingtouyizu opened 4 months ago

mingtouyizu commented 4 months ago

Could you please provide a configuration file for the model evaluation and the related code? I have found builders such as flickr, textvqa, and nocaps, as well as the corresponding evaluation code. Although I have tried to write the evaluation code the way LAVIS does, it doesn't work. Thanks.

mingtouyizu commented 4 months ago

The issue is as follows: the checkpoint lacks the Q-Former, LLM-projection, and vision-projection weights (see the attached screenshot).
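One way to confirm which weight groups are actually missing is to inspect the checkpoint's state dict directly. This is only a sketch: the checkpoint filename and the key prefixes (`Qformer`, `llm_proj`, `vision_proj`) are assumptions and may not match the released BLIVA weights.

```python
import torch

# Hypothetical check: load the released checkpoint and count the keys that
# belong to each submodule. Filename and prefixes are illustrative only.
ckpt = torch.load("bliva_vicuna7b.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)

for prefix in ("Qformer", "llm_proj", "vision_proj"):
    n = sum(k.startswith(prefix) for k in state_dict)
    print(f"{prefix}: {n} keys found")
```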

jiinhui commented 4 months ago

> Could you please provide a configuration file for the model evaluation and the related code? I have found builders such as flickr, textvqa, and nocaps, as well as the corresponding evaluation code. Although I have tried to write the evaluation code the way LAVIS does, it doesn't work. Thanks.

The task's `evaluation` method is not implemented; it is just a stub:

```python
def evaluation(self, model, data_loader, cuda_enabled=True):
    pass
```
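For reference, a minimal sketch of how a LAVIS-style `evaluation` override could look. It assumes the task exposes a `valid_step(model=..., samples=...)` method and that each batch is a dict of tensors, which may not match BLIVA's actual task classes.

```python
import torch

def evaluation(self, model, data_loader, cuda_enabled=True):
    # Sketch of a LAVIS-style loop: iterate the loader, move tensors to the
    # GPU when requested, and collect per-sample outputs from valid_step.
    results = []
    for samples in data_loader:
        if cuda_enabled:
            samples = {k: (v.cuda() if torch.is_tensor(v) else v)
                       for k, v in samples.items()}
        results.extend(self.valid_step(model=model, samples=samples))
    return results
```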

gordonhu608 commented 4 months ago

Sorry, we did not incorporate the evaluation code into our codebase; it would be a bit messy to include here, and all of the evaluation code is under the LAVIS repo.

mingtouyizu commented 4 months ago

Thank you. I found the corresponding test code in the LAVIS repository and evaluated on NoCaps and Flickr30k, but the results are not ideal.
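As a quick sanity check before digging into the evaluation setup, the LAVIS dataset loader can confirm that the evaluation splits load as expected. The builder names below ("nocaps", and by analogy "flickr30k") are assumptions based on the builders mentioned above and need to match the names registered in this codebase.

```python
# Assumed usage of LAVIS's load_dataset helper; builder names are
# illustrative and should match those registered in the repo.
from lavis.datasets.builders import load_dataset

nocaps = load_dataset("nocaps")
print(nocaps.keys())        # available splits, e.g. val/test
print(len(nocaps["val"]))   # number of evaluation samples
```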