AkihikoWatanabe commented 4 hours ago

URL

https://arxiv.org/abs/2410.19750
Authors
- Yuxiao Li
- Eric J. Michaud
- David D. Baek
- Joshua Engels
- Xiaoqing Sun
- Max Tegmark
Abstract
- Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We find that this concept universe has interesting structure at three levels: 1) The "atomic" small-scale structure contains "crystals" whose faces are parallelograms or trapezoids, generalizing well-known examples such as (man-woman-king-queen). We find that the quality of such parallelograms and associated function vectors improves greatly when projecting out global distractor directions such as word length, which is efficiently done with linear discriminant analysis. 2) The "brain" intermediate-scale structure has significant spatial modularity; for example, math and code features form a "lobe" akin to functional lobes seen in neural fMRI images. We quantify the spatial locality of these lobes with multiple metrics and find that clusters of co-occurring features, at coarse enough scale, also cluster together spatially far more than one would expect if feature geometry were random. 3) The "galaxy" scale large-scale structure of the feature point cloud is not isotropic, but instead has a power law of eigenvalues with steepest slope in middle layers. We also quantify how the clustering entropy depends on the layer.
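Translation (by gpt-4o-mini)
Sparse autoencoders have recently produced dictionaries of high-dimensional vectors corresponding to the universe of concepts represented by large language models. We found that this concept universe has interesting structure at three levels. 1) The "atomic" small-scale structure contains "crystals" with parallelogram or trapezoid faces, generalizing well-known examples such as (man-woman-king-queen). The quality of such parallelograms and the associated function vectors improves greatly when global distractor directions such as word length are projected out, which is done efficiently with linear discriminant analysis. 2) The "brain" intermediate-scale structure has significant spatial modularity. For example, math and code features form a "lobe" resembling the functional lobes seen in neural fMRI images. We quantified the spatial locality of these lobes with multiple metrics and found that clusters of co-occurring features, at a coarse enough scale, cluster together spatially far more than would be expected if feature geometry were random. 3) The "galaxy" large-scale structure of the feature point cloud is not isotropic; instead it has a power law of eigenvalues with the steepest slope in the middle layers. We also quantified how the clustering entropy depends on the layer.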
Summary (by gpt-4o-mini)
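Sparse autoencoders produce dictionaries of high-dimensional vectors, and the paper finds three interesting levels of structure in the resulting concept universe. 1) At small scale, there are "crystals" with parallelogram or trapezoid faces, whose quality improves once distractor directions such as word length are removed. 2) At intermediate scale, math and code features form a "lobe"; spatial locality is quantified, and features are shown to cluster together more than expected. 3) At large scale, the feature point cloud is not isotropic but has a power law of eigenvalues, and the dependence of clustering entropy on the layer is quantified.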
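
Note to self on point 3: the "not isotropic, power law of eigenvalues" claim can be illustrated with a rough sketch. This is not the authors' code. It uses synthetic Gaussian point clouds instead of real SAE features, and the function name `eigenvalue_power_law_slope` is made up for illustration. It fits a line to log(eigenvalue) versus log(rank) of the covariance matrix; an isotropic cloud should give a slope near 0, while a cloud whose variances decay as 1/rank should give a slope near -1.

```python
import numpy as np

def eigenvalue_power_law_slope(points):
    """Slope of log(eigenvalue) vs. log(rank) for the covariance of a point cloud.

    Hypothetical helper for illustration; not taken from the paper.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / (len(points) - 1)
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # eigvalsh is ascending; reverse to descending
    eigvals = eigvals[eigvals > 1e-12]        # drop numerically zero directions
    ranks = np.arange(1, len(eigvals) + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigvals), 1)
    return slope

rng = np.random.default_rng(0)
n, d = 20000, 32

# Isotropic cloud: every direction has unit variance.
iso = rng.standard_normal((n, d))

# Anisotropic cloud: variance along axis k decays as k^-1 (std as k^-0.5).
stds = np.arange(1, d + 1) ** -0.5
aniso = rng.standard_normal((n, d)) * stds

slope_iso = eigenvalue_power_law_slope(iso)
slope_aniso = eigenvalue_power_law_slope(aniso)
print(f"isotropic slope:   {slope_iso:.2f}")
print(f"anisotropic slope: {slope_aniso:.2f}")
```

To try this on real data, `points` would be a matrix of SAE decoder vectors for one layer (one row per feature), and the slope could then be compared across layers.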

AkihikoWatanabe commented 4 hours ago

Reference: https://ledge.ai/articles/llm_conceptual_structure_sae

AkihikoWatanabe commented 3 hours ago

Perplexity

The Geometry of Concepts: Sparse Autoencoder Feature Structure, Yuxiao Li+, arXiv'24 #1522