When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating factually incorrect responses, as the model is trained to generate facts that are not grounded in its pre-existing knowledge. In this work, we study the impact of such exposure to new knowledge on the capability of the fine-tuned model to utilize its pre-existing knowledge. To this end, we design a controlled setup, focused on closed-book QA, where we vary the proportion of the fine-tuning examples that introduce new knowledge. We demonstrate that large language models struggle to acquire new factual knowledge through fine-tuning, as fine-tuning examples that introduce new knowledge are learned significantly slower than those consistent with the model's knowledge. However, we also find that as the examples with new knowledge are eventually learned, they linearly increase the model's tendency to hallucinate. Taken together, our results highlight the risk in introducing new factual knowledge through fine-tuning, and support the view that large language models mostly acquire factual knowledge through pre-training, whereas fine-tuning teaches them to use it more efficiently.