facebookresearch / ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
https://parl.ai
MIT License

Reducing hallucinations in Blenderbot2 (400M) #3988

Closed sjscotti closed 2 years ago

sjscotti commented 2 years ago

Hi! I’ve been experimenting with fine-tuning BB2 (400M) for a domain-specific application, and I find that it is “hallucinating” by combining training info from different dialogues I provided in the training data. The bot’s responses have a nice conversational quality, but they are often factually wrong. Are there settings I can adjust to sacrifice some of the fluidity of the bot’s responses in exchange for less hallucination? Thanks!

klshuster commented 2 years ago

Could you clarify what you mean by combining "training info from different dialogues"? Are these part of the knowledge source?

One thing you can try is changing the model to use RAG-Sequence; we have found that RAG-Sequence tends to hallucinate less at the expense of some conversational fluidity (see the results from this project). You can do so by setting --model projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent --rag-model-type sequence.
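
For example, a training command might look like this sketch (the task name and model file path are placeholders for your own setup; the two flags are the ones above):

```bash
# Sketch: switch BB2 to the RAG-Sequence model type.
# --task and --model-file values are placeholders, not values from this thread.
parlai train_model \
  --model projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent \
  --rag-model-type sequence \
  --task my_domain_teacher \
  --model-file /path/to/finetuned/model
```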

Another thing you can try is reducing the number of documents to condition on during inference time.
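
For example (a sketch; the model file path and the value 3 are just illustrative):

```bash
# Sketch: condition on fewer retrieved documents during inference.
parlai interactive \
  --model-file /path/to/finetuned/model \
  --n-docs 3
```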

github-actions[bot] commented 2 years ago

This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.

renzs6 commented 2 years ago

Hi @klshuster, how can I limit the number of documents to condition on during inference, particularly in a config-based setup?

klshuster commented 2 years ago

You can set the --n-docs parameter; in a config setup, it would be n_docs (and you would want to make sure you set it both in the top-level opt and in the override options).
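
In a saved .opt file, for example, that might look like the following sketch (only the relevant keys shown; 5 is an illustrative value):

```json
{
  "n_docs": 5,
  "override": {
    "n_docs": 5
  }
}
```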

sjscotti commented 2 years ago

Hi @klshuster! I recently switched from fine-tuning the 400M BB2 model to the 3B BB2 model on my own corpus, and this issue has popped up again. Following the BB2-3B model card, I had been using --model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent --rag-model-type token and --dpr-model-file zoo:hallucination/bart_rag_token/model. I tried to follow the suggestion you made above (i.e., --model projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent --rag-model-type sequence), and I am finding that projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent uses significantly more GPU memory during training than projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent. I also noticed that there appear to be two options for dpr-model-file: zoo:hallucination/bart_rag_token/model and zoo:hallucination/bart_rag_sequence/model.

One wrinkle in all this is that my training was crashing when the internet search stopped responding, so when running parlai train_model I have added --knowledge-access-method memory_only --search-server none. With this background, I have a few questions.

  1. Am I OK fine-tuning BB2-3B without an internet search? If so, is --knowledge-access-method memory_only --search-server none the correct way to do it?
  2. Can I fine-tune using the model card suggestions and just use --model projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent --rag-model-type sequence when I run an evaluation (i.e., when I run parlai eval_model or parlai interactive)?
  3. Should I be using the dpr-model-file zoo:hallucination/bart_rag_sequence/model instead of zoo:hallucination/bart_rag_token/model for the best reduction in hallucination?

Thanks in advance for your help!

klshuster commented 2 years ago

  1. Specifying --knowledge-access-method memory_only --search-server none will indeed bypass searching. That is fine if that is your intention.
  2. Yes, that should be fine (see the sketch after this list).
  3. Either model file works, as both are trained to retrieve based on conversational context.
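
Putting the pieces together, a minimal sketch of the full workflow (the task name and model file paths are placeholders; all other flags come from this thread):

```bash
# Fine-tune without internet search, using memory-only knowledge access.
# --task and --model-file values are placeholders for your own setup.
parlai train_model \
  --model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
  --rag-model-type token \
  --knowledge-access-method memory_only \
  --search-server none \
  --task my_domain_teacher \
  --model-file /path/to/finetuned/model

# Evaluate with the RAG-Sequence variant to reduce hallucination.
parlai eval_model \
  --model projects.blenderbot2.agents.blenderbot2:BlenderBot2RagAgent \
  --rag-model-type sequence \
  --task my_domain_teacher \
  --model-file /path/to/finetuned/model
```
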
sjscotti commented 2 years ago

@klshuster Thanks!