What are EFUF's advantages over DPO?

starreeze / efuf

The official repo for the EMNLP 2024 (main) paper "EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models"

GNU General Public License v3.0


What are EFUF's advantages over DPO? #5

Closed: waltonfuture closed this issue 3 hours ago

waltonfuture commented 4 hours ago

Both EFUF and DPO learn from positive and negative data. What advantages does EFUF have over DPO? Thanks.

starreeze commented 3 hours ago

The major difference is that DPO requires paired positive and negative data for fine-tuning.

By paired, I mean a positive and a negative response to the same question. EFUF does not need paired responses; it only needs a positive or a negative response to any question. This makes data collection easier and more efficient.
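
To make the paired versus unpaired distinction concrete, here is a minimal Python sketch. The record types `PairedExample` and `UnpairedExample` and the helper `to_pairs` are hypothetical names chosen for illustration; they are not taken from the EFUF code or from any DPO library. The paired record mirrors the common prompt/chosen/rejected layout used for preference data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PairedExample:
    """DPO-style record: two responses to the SAME prompt."""
    prompt: str
    chosen: str
    rejected: str


@dataclass
class UnpairedExample:
    """Hypothetical unpaired record: one response plus a label."""
    prompt: str
    response: str
    is_positive: bool


def to_pairs(samples):
    """Build DPO-style pairs from labeled samples.

    Only prompts that have at least one positive AND one negative
    response can contribute a pair; everything else is dropped.
    """
    by_prompt = defaultdict(lambda: {True: [], False: []})
    for s in samples:
        by_prompt[s.prompt][s.is_positive].append(s.response)

    pairs = []
    for prompt, groups in by_prompt.items():
        for pos in groups[True]:
            for neg in groups[False]:
                pairs.append(PairedExample(prompt, pos, neg))
    return pairs


# Illustrative data only (made up for this sketch).
samples = [
    UnpairedExample("Describe image A.", "A dog on a sofa.", True),
    UnpairedExample("Describe image A.", "A dog and a cat on a sofa.", False),
    UnpairedExample("Describe image B.", "A red car.", True),
    UnpairedExample("Describe image C.", "Two people holding umbrellas.", False),
]

pairs = to_pairs(samples)
print(f"{len(samples)} unpaired samples, {len(pairs)} usable pair(s)")
```

In this toy set, an unpaired method can use all four samples, while a paired method can use only the two responses to image A.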

In addition, EFUF uses a fine-grained approach, which contributes to better performance.
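
As a rough illustration of what "fine-grained" could mean here, the sketch below applies a different loss sign to individual tokens instead of to a whole response. This is an assumption made for illustration, not the loss from the paper or the repo: the function `fine_grained_loss`, the label scheme (+1, -1, 0), and the `neg_weight` value are all hypothetical.

```python
def fine_grained_loss(token_logprobs, token_labels, neg_weight=0.5):
    """Toy token-level objective (hypothetical, for illustration only).

    token_logprobs: log-probability the model assigns to each token.
    token_labels:   +1 = reinforce this token, -1 = unlearn it, 0 = ignore.

    Positive tokens contribute their mean negative log-likelihood.
    Negative tokens contribute the same quantity with a flipped sign,
    so minimizing the total lowers their likelihood.
    """
    if len(token_logprobs) != len(token_labels):
        raise ValueError("logprobs and labels must have the same length")

    pos = [-lp for lp, y in zip(token_logprobs, token_labels) if y > 0]
    neg = [-lp for lp, y in zip(token_logprobs, token_labels) if y < 0]

    pos_loss = sum(pos) / len(pos) if pos else 0.0
    neg_loss = sum(neg) / len(neg) if neg else 0.0
    return pos_loss - neg_weight * neg_loss


# Made-up numbers: two tokens to reinforce, one to unlearn, one ignored.
loss = fine_grained_loss([-0.5, -1.0, -2.0, -0.5], [1, 1, -1, 0])
print(loss)
```

The point of the sketch is only that supervision attaches to parts of a response, so one response can carry both positive and negative signal.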

waltonfuture commented 3 hours ago

Thank you, I got it. Have you tried EFUF in text-only LLM settings, not just MLLMs? It would be interesting if EFUF offered more benefits than DPO for LLMs.

starreeze commented 3 hours ago

Not yet. CLIP is a natural signal for multimodal hallucinations, but it is quite difficult to find a reliable external source for detecting hallucinations in an LLM, since those are more closely tied to knowledge. As you know, things get really complicated once knowledge is involved.
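
A minimal sketch of how an image-text similarity score could serve as that external signal is shown below. It assumes the scores have already been computed by some image-text model such as CLIP; the function `label_segments`, the two thresholds, and the score values are hypothetical and chosen only for illustration.

```python
def label_segments(scores, low=0.2, high=0.3):
    """Label text segments by their similarity to the image.

    scores: dict mapping a text segment to a precomputed image-text
            similarity (assumed here to come from a model like CLIP).
    low/high: hypothetical thresholds. Segments at or above `high` are
              treated as grounded, those at or below `low` as likely
              hallucinated, and the uncertain middle band is skipped.
    """
    labels = {}
    for text, score in scores.items():
        if score >= high:
            labels[text] = "positive"
        elif score <= low:
            labels[text] = "negative"
        else:
            labels[text] = "ignored"
    return labels


# Made-up similarity scores for objects mentioned in one caption.
scores = {"a dog": 0.34, "a sofa": 0.31, "a cat": 0.12, "a lamp": 0.25}
labels = label_segments(scores)
for text, label in labels.items():
    print(f"{text}: {label}")
```

For text-only LLMs there is no image to compare against, so there is no obvious counterpart to the `scores` input, which is the difficulty described above.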


waltonfuture commented 3 hours ago

Thank you for your response.