opendatalab / HA-DPO

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
https://opendatalab.github.io/HA-DPO
Apache License 2.0

Does only MiniGPT-4 use the λL_aux loss? #5

Closed crazy-wq closed 7 months ago

crazy-wq commented 8 months ago

Does only MiniGPT-4 use the λL_aux loss in the source code?

JulioZhao97 commented 7 months ago

Yes. In our experiments, we found that MiniGPT-4, even after fine-tuning with style-consistent data, still exhibits a degree of degeneration. We therefore introduced an auxiliary task to further alleviate this degeneration. For more powerful models, such as LLaVA and InstructBLIP, preference fine-tuning does not harm the model's generation quality, so no auxiliary task was employed. Details can be found in Sec. 5.2 of the paper, where λ is set to 0 for LLaVA and InstructBLIP.
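To illustrate the structure of the objective described above, here is a minimal, dependency-free sketch of a combined loss of the form L = L_DPO + λ·L_aux. The function name `hadpo_loss`, the argument names, and the scalar (per-example) formulation are illustrative assumptions, not the repository's actual API; the real implementation operates on batched log-probabilities from the policy and reference models.

```python
import math

def _sigmoid(x: float) -> float:
    """Numerically plain logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def hadpo_loss(logp_chosen: float, logp_rejected: float,
               ref_logp_chosen: float, ref_logp_rejected: float,
               aux_nll: float, beta: float = 0.1, lam: float = 0.0) -> float:
    """Sketch of a DPO loss plus a lambda-weighted auxiliary LM loss.

    Per the thread: lam > 0 only for MiniGPT-4 (to counter degeneration);
    lam = 0 for LLaVA / InstructBLIP, reducing this to plain DPO.
    All arguments are hypothetical per-example scalars:
      logp_*      - sequence log-probs under the policy being trained
      ref_logp_*  - sequence log-probs under the frozen reference model
      aux_nll     - auxiliary language-modeling NLL on ground-truth data
    """
    # DPO margin: how much more the policy prefers the chosen response
    # over the rejected one, relative to the reference model.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    dpo_loss = -math.log(_sigmoid(margin))
    # Auxiliary term vanishes when lam == 0.
    return dpo_loss + lam * aux_nll
```

With `lam=0.0` the auxiliary term drops out entirely, matching the LLaVA/InstructBLIP setting; a positive `lam` mixes in the auxiliary NLL, as used for MiniGPT-4.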