Dynamic Routing Between Capsules

msrks commented 7 years ago

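A paper by the great Hinton. Neurons are grouped into capsules, and each capsule carries meaning as a unit: the norm of its vector is the probability that an entity exists, and each element is a feature. This lets the network handle input variations more robustly than a CNN (details below). Activation and pooling are also replaced with capsule-level counterparts (sigmoid/relu -> squash, max pool -> dynamic routing). State of the art on MNIST.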

https://arxiv.org/abs/1710.09829

A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules. When multiple predictions agree, a higher level capsule becomes active. We show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits. To achieve these results we use an iterative routing-by-agreement mechanism: A lower-level capsule prefers to send its output to higher level capsules whose activity vectors have a big scalar product with the prediction coming from the lower-level capsule.

msrks commented 7 years ago

Overview

The network consists of Primary Caps, which extracts capsule-level features, and Digit Caps, which uses Dynamic Routing to derive higher-level features from those capsule features.
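To make the sizes of the two stages concrete, here is a small shape calculation. The layer hyperparameters (28x28 input, 9x9 kernels, strides 1 and 2, 32 channels of 8-D primary capsules, 10 digit capsules of 16-D) are taken from the paper's MNIST architecture, not from the notes above, and `conv_out` is a hypothetical helper name.

```python
def conv_out(size, kernel, stride):
    """Output width of an unpadded ("valid") convolution."""
    return (size - kernel) // stride + 1

# Values below are the paper's MNIST settings (an assumption of this sketch).
image_size = 28
conv1_size = conv_out(image_size, kernel=9, stride=1)    # first conv layer
primary_size = conv_out(conv1_size, kernel=9, stride=2)  # Primary Caps conv

primary_channels = 32   # capsule channels in Primary Caps
primary_dim = 8         # each primary capsule is an 8-D vector
num_primary_capsules = primary_size * primary_size * primary_channels

digit_capsules = 10     # one capsule per digit class
digit_dim = 16          # each digit capsule is a 16-D vector

print(conv1_size, primary_size, num_primary_capsules)
```

So Digit Caps routes from 1152 input capsules to 10 output capsules.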

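Primary Caps consists of

Conv + Activation. The convolution output is split into capsules, and the activation is applied per capsule. The activation on a capsule (rather than applying ReLU to each element) keeps the direction of the capsule vector and changes only its norm. So that the norm can be read as the probability of existence, the vector is squashed so that its norm falls in [0, 1).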

Activation: [image: screenshot, 2017-11-08 19.53.11]
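A minimal sketch of the squash activation, using the form given in the paper: v = (|s|^2 / (1 + |s|^2)) * (s / |s|). It is written in plain Python on a single vector for readability; the `eps` term and the helper `vec_norm` are additions of this sketch to avoid division by zero and to inspect the result.

```python
import math

def vec_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def squash(s, eps=1e-9):
    """Keep the direction of s, but map its norm into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    # |s|^2 / (1 + |s|^2) is the new length; dividing by |s| normalises s.
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in s]

short_vec = squash([0.1, 0.0])    # short vectors shrink toward zero length
long_vec = squash([30.0, 40.0])   # long vectors approach unit length

print(vec_norm(short_vec), vec_norm(long_vec))
```

Short vectors end up with a norm near 0 and long vectors with a norm just below 1, which is what allows the norm to be read as a probability.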

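Digit Caps consists of

Conv + N*(Weighted Sum of Vector + Routing + Weight Update).

Dynamic Routing is the capsule counterpart of pooling. In each of the N Routing + Weight Update rounds, every input capsule predicts its contribution to the feature direction of each output capsule. The weight update then increases the weight on inputs from capsules whose predicted contribution has a large inner product with the predicted output vector.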
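The routing loop described above can be sketched as follows. This is a plain-Python illustration of routing-by-agreement as described in the paper, not the authors' code: `u_hat[i][j]` is assumed to already hold the prediction vector from input capsule i for output capsule j (the learned transformation is left out), and the toy values are invented for the example.

```python
import math

def squash(s, eps=1e-9):
    """Keep the direction of s, but map its norm into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in s]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iterations=3):
    """Return (output vectors v, coupling coefficients c of the last round)."""
    num_in, num_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * num_out for _ in range(num_in)]  # routing logits, start at 0
    for _ in range(num_iterations):
        # Each input capsule distributes its output over the output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # Weighted sum of the predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement (inner product) between prediction and output raises b.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c

# Toy input: 3 input capsules, 2 output capsules, 2-D vectors.
# All inputs roughly agree on output 0; inputs 0 and 1 disagree on output 1.
u_hat = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[0.9, 0.1], [1.0, 0.0]],
]
v, c = dynamic_routing(u_hat, num_iterations=3)
print(c[0])
```

In this toy case the predictions for output 0 agree, so the coupling of input 0 shifts toward output 0, while the conflicting predictions for output 1 cancel out.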

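Auxiliary module

A module that reconstructs the input from the Digit Caps features (the norm of each capsule corresponds to the predicted label). During training, a term that reduces the Reconstruction Loss is added to the loss function and acts as regularization. After training, this module can be used to visualize that each element of a capsule has learned a meaningful feature.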
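A rough sketch of how the reconstruction term combines with the classification loss. The notes above do not spell out the loss, so the margin loss and the constants used here (m+ = 0.9, m- = 0.1, lambda = 0.5, reconstruction weight 0.0005) are taken from the paper; the function names and toy inputs are made up for illustration.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss over the norms of the digit capsules.

    lengths: norm of each digit capsule; target: index of the true class.
    """
    loss = 0.0
    for k, length in enumerate(lengths):
        if k == target:
            # The true class capsule should be long (norm >= m_plus).
            loss += max(0.0, m_plus - length) ** 2
        else:
            # Other capsules should be short (norm <= m_minus), down-weighted.
            loss += lam * max(0.0, length - m_minus) ** 2
    return loss

def total_loss(lengths, target, image, reconstruction, recon_weight=0.0005):
    """Margin loss plus a scaled-down sum-of-squares reconstruction loss."""
    recon = sum((x - y) ** 2 for x, y in zip(image, reconstruction))
    return margin_loss(lengths, target) + recon_weight * recon

# Toy values: two classes, a two-pixel "image".
print(margin_loss([0.5, 0.6], target=0))
print(total_loss([0.95, 0.05], 0, image=[1.0, 0.0], reconstruction=[0.0, 0.0]))
```

The small weight keeps the reconstruction term from dominating the classification loss, so it acts as a regularizer.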

msrks commented 7 years ago

MNIST score

The configuration with 3 routing iterations and the Reconstruction Loss performs best.

msrks commented 7 years ago

https://hackernoon.com/capsule-networks-are-shaking-up-ai-heres-how-to-use-them-c233a0971952

msrks commented 7 years ago

https://mosko.tokyo/post/on-capusels/
