There is a belief that learning to compress well will lead to intelligence. Recently, language modeling has been shown to be equivalent to compression, which offers a compelling rationale for the success of large language models (LLMs): the development of more advanced language models is essentially enhancing compression, which facilitates intelligence. Despite such appealing discussions, little empirical evidence exists for the interplay between compression and intelligence. In this work, we examine their relationship in the context of LLMs, treating LLMs as data compressors. Given the abstract concept of "intelligence", we adopt average downstream benchmark scores as a surrogate, specifically targeting intelligence related to knowledge and commonsense, coding, and mathematical reasoning. Across 12 benchmarks, our study brings together 30 public LLMs that originate from diverse organizations. Remarkably, we find that LLMs' intelligence -- reflected by average benchmark scores -- correlates almost linearly with their ability to compress external text corpora. These results provide concrete evidence supporting the belief that superior compression indicates greater intelligence. Furthermore, our findings suggest that compression efficiency, as an unsupervised metric derived from raw text corpora, serves as a reliable evaluation measure that is linearly associated with model capabilities. We open-source our compression datasets as well as our data collection pipelines so that future researchers can assess compression properly.
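As a rough illustration of the compression metric the abstract refers to, the sketch below estimates a causal LLM's bits-per-character (BPC) on raw text from its token-level negative log-likelihood: an arithmetic coder driven by the model's next-token distribution compresses text to approximately this rate. This is a minimal sketch assuming the Hugging Face transformers API, not the authors' released pipeline; the model name "gpt2" and the sample string are placeholders for illustration.

```python
# A minimal sketch, assuming the Hugging Face "transformers" API, of scoring
# an LLM's compression ability as bits-per-character (BPC) on raw text.
# Rationale: an arithmetic coder driven by the model's next-token
# distribution compresses text to about its negative log-likelihood, so
#     BPC = -(1/C) * sum_i log2 p(x_i | x_<i),   C = number of characters.
# The model name and sample text below are placeholders, not from the paper.

import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def bits_per_character(model, tokenizer, text: str) -> float:
    """Estimate the model's lossless-compression rate on `text` in bits/char."""
    input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        # With labels=input_ids, HF causal LMs return the mean cross-entropy
        # (in nats) over the seq_len - 1 predicted tokens.
        mean_nats = model(input_ids, labels=input_ids).loss.item()
    total_bits = mean_nats * (input_ids.shape[1] - 1) / math.log(2)
    # For a corpus longer than the context window, chunk the text with a
    # sliding window and sum the bits; this sketch assumes a single pass.
    return total_bits / len(text)

if __name__ == "__main__":
    name = "gpt2"  # stand-in; the paper evaluates 30 public LLMs
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name).eval()
    sample = "Language modeling and lossless compression are two views of the same objective."
    print(f"BPC: {bits_per_character(model, tok, sample):.3f}")
```

Lower BPC means better compression; the paper's finding is that this unsupervised quantity tracks average benchmark scores almost linearly across models.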