chaos-moon / paper_daily

One paper a day, keep laziness away.
MIT License

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence #22

Open yaoyz96 opened 1 year ago

yaoyz96 commented 1 year ago

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence

On the Principles of Parsimony and Self-Consistency for the Emergence of Intelligence [paper]

Motivation

Studies in neuroscience suggest that the brain's world model is highly structured, both anatomically and functionally (e.g., in its use of subspace coding). Such a structured model is believed to be the key to the brain's efficiency and effectiveness in perceiving, predicting, and making intelligent decisions.

Currently, such expensive brute-force end-to-end training of black-box models has resulted in ever-growing model size and high data/computation cost, and is accompanied by many caveats in practice: the lack of richness in final learned representations due to neural collapse; lack of stability in training due to mode collapse; lack of adaptiveness and susceptibility to catastrophic forgetting; and lack of robustness to deformations or adversarial attacks. In short, current training practice drives ever-growing model capacity and data/compute demands, and brings four problems: 1) final representations lack richness because of neural collapse; 2) training lacks stability because of mode collapse; 3) models lack adaptiveness and are prone to catastrophic forgetting; 4) models lack robustness.

A principle long learned in control theory is that such open-loop systems cannot automatically correct errors in prediction, and are unadaptive to changes in the environment. As we will argue in this paper, a similar lesson can be drawn here: once discriminative and generative models are combined to form a complete closed-loop system, learning can become autonomous (without exterior supervision), more efficient, stable, and adaptive. In short: an open-loop system cannot correct its own prediction errors or adapt to a changing environment; closing the loop between a discriminative model and a generative model makes learning autonomous (no external supervision), more efficient, more stable, and more adaptive.
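To make the closed-loop idea concrete, here is a minimal, schematic sketch (my own illustration, not the paper's actual formulation): a discriminative encoder f and a generative decoder g are chained, and the consistency error is measured by re-encoding the decoded prediction, so the feedback signal needs no external labels. The network sizes, the data stand-in, and the plain MSE consistency loss are all illustrative assumptions.

```python
# Schematic closed-loop sketch (illustrative; not the paper's actual objective).
import torch
import torch.nn as nn

d_x, d_z = 784, 32                      # hypothetical data / feature dimensions
f = nn.Sequential(nn.Linear(d_x, 256), nn.ReLU(), nn.Linear(256, d_z))  # discriminative encoder
g = nn.Sequential(nn.Linear(d_z, 256), nn.ReLU(), nn.Linear(256, d_x))  # generative decoder

opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

x = torch.randn(64, d_x)                # stand-in for a batch of observations
for _ in range(10):
    z = f(x)                            # perceive: x -> z
    x_hat = g(z)                        # regenerate: z -> x_hat
    z_hat = f(x_hat)                    # close the loop: re-encode the prediction
    loss = ((z - z_hat) ** 2).mean()    # self-consistency error, measured in feature space
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that naively minimizing this error jointly over f and g can collapse to trivial solutions; the paper instead casts the closed loop as a minimax game played over a rate-reduction measure, which is what keeps the learned representation non-degenerate.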

This paper aims to offer our overall position and perspective rather than to justify every claim technically. Nevertheless, we will provide references to related work where readers can find convincing theoretical and compelling empirical evidence. (The paper is a position piece; the theoretical and empirical evidence is delegated to the cited references.)

Principles

The paper proposes two basic principles, 1) Parsimony and 2) Self-consistency, to address two fundamental questions about intelligence: what to learn from observed data, and how to learn it.

Parsimony

Entities should not be multiplied unnecessarily. —— William of Ockham

The objective of learning for an intelligent system is to identify low-dimensional structures in observations of the external world and reorganize them in the most compact and structured way.

Taking the modeling of visual data as an example, the goal of parsimony is to find a transform $f$ that maps the observations to a compact, structured internal representation.

In other words, we try to transform real-world data that may lie on a family of low-dimensional submanifolds in a high-dimensional space onto a family of independent low-dimensional linear subspaces. Such a model is called a linear discriminative representation (LDR).
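As a toy illustration (mine, not the paper's) of that target structure, the snippet below builds features for two classes that each lie on their own low-dimensional linear subspace, with the two subspaces chosen to be orthogonal; all dimensions are arbitrary.

```python
# Toy example of an LDR-like structure: class features on independent linear subspaces.
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 16, 3, 200                      # ambient dim, subspace dim, samples per class

# Random orthonormal bases for two mutually orthogonal r-dimensional subspaces.
Q, _ = np.linalg.qr(rng.standard_normal((d, 2 * r)))
U1, U2 = Q[:, :r], Q[:, r:]

Z1 = U1 @ rng.standard_normal((r, n))     # class-1 features live in span(U1)
Z2 = U2 @ rng.standard_normal((r, n))     # class-2 features live in span(U2)

# Each class occupies only r directions of the d-dimensional feature space ...
print(np.linalg.matrix_rank(Z1), np.linalg.matrix_rank(Z2))   # -> 3 3
# ... and the two class subspaces are maximally incoherent (zero overlap).
print(np.round(np.linalg.norm(U1.T @ U2), 6))                 # -> 0.0
```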

yaoyz96 commented 1 year ago

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction

Learning Diverse and Discriminative Representations via the Principle of Maximal Coding Rate Reduction, NeurIPS 2020. [paper]

The context and motivation of this paper are written very well, clearly explaining why current models adapt so poorly. Starting from the manifold hypothesis, the paper designs a new learning framework.

Context & Motivation

The paper identifies two limitations of current model learning:

1) It aims only to predict the labels y even if they might be mislabeled. Empirical studies show that deep networks, used as a "black box," can even fit random labels.
2) With such an end-to-end data fitting, despite plenty of empirical efforts in trying to interpret the so-learned features, it is not clear to what extent the intermediate features learned by the network capture the intrinsic structures of the data that make meaningful classification possible in the first place.

These issues mean that the features learned by the model usually lack interpretability and come with no guarantees of generalizability, robustness, or transferability. The goal of this paper is to reformulate the learning objective so that the labels $y$ serve only as side information to help the model learn more robust features.

The precise geometric and statistical properties of the learned features are also often obscured, which leads to the lack of interpretability and subsequent performance guarantees (e.g., generalizability, transferability, and robustness, etc.) in deep learning. Therefore, the goal of this paper is to address such limitations of current learning frameworks by reformulating the objective towards learning explicitly meaningful representations for the data $x$.
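The concrete objective the paper proposes for this reformulation is the maximal coding rate reduction (MCR^2): expand the coding rate of all features together while compressing the coding rate of each class separately. Below is a small NumPy sketch of that objective as I read it from the paper; the epsilon value and shapes are illustrative, and the exact scaling conventions should be double-checked against the paper.

```python
# Sketch of the MCR^2 (maximal coding rate reduction) objective.
import numpy as np

def coding_rate(Z, eps=0.5):
    """R(Z, eps) = 1/2 logdet(I + d/(n eps^2) Z Z^T), for features Z of shape (d, n)."""
    d, n = Z.shape
    return 0.5 * np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * Z @ Z.T)[1]

def rate_reduction(Z, labels, eps=0.5):
    """Delta R = R(Z) - sum_j (n_j / n) * R(Z_j): the quantity MCR^2 maximizes."""
    n = Z.shape[1]
    total = coding_rate(Z, eps)
    compressed = sum(
        (np.sum(labels == j) / n) * coding_rate(Z[:, labels == j], eps)
        for j in np.unique(labels)
    )
    return total - compressed

# Usage: with features Z of shape (d, n) and integer labels of shape (n,),
#   score = rate_reduction(Z, labels)   # larger is better under MCR^2
```

Maximizing this quantity pushes features of different classes to span large, mutually incoherent subspaces (large overall rate) while keeping each class compact (small within-class rate), which is exactly the LDR structure described in the first comment.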

The paper also notes the disadvantage of existing learning approaches when the data are multi-modal:

When the data contain complicated multi-modal structures, naive heuristics or inaccurate metrics may fail to capture all internal subclass structures or to explicitly discriminate among them for classification or clustering purposes.