opendatalab / HA-DPO

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization
https://opendatalab.github.io/HA-DPO
Apache License 2.0

Question about the code. #6

Closed: tbbbk closed this issue 6 months ago

tbbbk commented 6 months ago

Here: why does the variable A need to be multiplied by two?

Best wishes.

tbbbk commented 6 months ago

Typo: A -> desc_data_dict

JulioZhao97 commented 6 months ago

This ratio controls how strongly each of the two data types is optimized. In our experiments with LLaVA, the model achieved the best hallucination elimination on both image description and question answering when the ratio was set to 2:1. You can try other values to observe the effect in different cases.
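
For readers following along, here is a minimal sketch of how multiplying a list of samples by two can produce a 2:1 weighting between two data types. It is not the repository's code: the function `mix_with_repeat`, the sample dictionaries, and the assumption that `desc_data_dict` behaves like a Python list of samples are all hypothetical and only illustrate the idea.

```python
import random


def mix_with_repeat(desc_samples, qa_samples, desc_repeat=2, seed=0):
    """Hypothetical helper: repeat description samples, then mix with QA samples.

    Multiplying a Python list by an integer concatenates that many copies,
    so each description sample appears `desc_repeat` times per epoch while
    each QA sample appears once.
    """
    mixed = list(desc_samples) * desc_repeat + list(qa_samples)
    random.Random(seed).shuffle(mixed)  # shuffle so the two types are interleaved
    return mixed


# Toy data standing in for the two preference-data types (assumed structure).
desc_samples = [{"type": "desc", "id": i} for i in range(3)]
qa_samples = [{"type": "qa", "id": i} for i in range(3)]

mixed = mix_with_repeat(desc_samples, qa_samples, desc_repeat=2)
n_desc = sum(s["type"] == "desc" for s in mixed)
n_qa = sum(s["type"] == "qa" for s in mixed)
print(n_desc, n_qa)
```

Changing `desc_repeat` in this sketch changes how often description samples are seen relative to QA samples, which is one way to experiment with other ratios.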

tbbbk commented 6 months ago

I apologize, I may have overlooked it. Could you please tell me which part of the paper mentions this ratio?

JulioZhao97 commented 6 months ago

We didn't elaborate on this in the current version of the paper, since this hyperparameter can be left for the user to decide and doesn't have much influence on performance. We will add it in the next version of the paper.