mlpc-ucsd / BLIVA

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
https://arxiv.org/abs/2308.09936
BSD 3-Clause "New" or "Revised" License

reduction="none" problems #8

Closed Zhudongsheng75 closed 11 months ago

Zhudongsheng75 commented 11 months ago

```python
outputs = self.llm_model(
    inputs_embeds=inputs_embeds,
    attention_mask=attention_mask,
    return_dict=True,
    labels=this_targets,
    reduction="none",
)
```

I noticed that this code in bliva_vicuna7b.py (located in `_predict_class`) may have a problem. My transformers version is 4.28.0, and the `LlamaForCausalLM` class has no `reduction` parameter in its `forward`. As I understand it, `_predict_class` is used to evaluate VQA-style tasks such as VisDial. However, if `reduction="none"` fails, there is no per-candidate loss, so MRR cannot be calculated.
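For context on why `reduction="none"` matters here: stock Hugging Face `LlamaForCausalLM` averages the cross-entropy loss over all tokens in the batch, which destroys the per-candidate losses that answer ranking (and hence MRR) needs. A minimal sketch of recovering per-example losses from the model's logits without modifying the model class (the helper name `per_example_loss` is mine, not from the BLIVA codebase):

```python
import torch
import torch.nn.functional as F

def per_example_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Per-example language-modeling loss, equivalent in effect to a
    LlamaForCausalLM forward with reduction="none" averaged per sample.

    logits: (batch, seq_len, vocab_size)
    labels: (batch, seq_len), with -100 marking positions to ignore.
    """
    # Shift so that tokens at position t predict the token at t+1,
    # matching the causal-LM loss convention in transformers.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]

    # Token-level losses; ignored positions (-100) contribute 0.
    token_loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
        ignore_index=-100,
    ).view(shift_labels.size())

    # Average over the non-ignored tokens of each example separately.
    mask = (shift_labels != -100).float()
    return (token_loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
```

With one loss per candidate answer, the candidates can be ranked (e.g. by `losses.argsort()`) and MRR computed, without needing a patched model class.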

Zhudongsheng75 commented 11 months ago

Sorry, I found the answer: you rewrote `LlamaForCausalLM` yourselves, and that version accepts the `reduction` parameter.