marp: true
theme: gaia
size: 16:9
paginate: true
headingDivider: 2
header: Denoising Diffusion Probabilistic Models
footer: ©AMBL 2022

Denoising Diffusion Probabilistic Models

Paper information

Title	Denoising Diffusion Probabilistic Models
Published	2020/12/16
URL	https://arxiv.org/abs/2006.11239
GitHub	https://github.com/hojonathanho/diffusion

TL;DR

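A model that generates images from pure noise.

It produces higher-quality images than earlier image generation models (VAE, GAN, etc.).
Training involves two processes: adding noise to an image (forward process) and generating an image from noise (backward process).
Modeling both processes as Markov chains keeps the computation efficient.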

Motivation

Image generation models are an active research area.

VAE: Variational Autoencoders (2013)
GAN: Generative Adversarial Networks (2014)
DDPM: Denoising Diffusion Probabilistic Models (2020) ← this paper
+α: Diffusion Models Beat GANs on Image Synthesis (2021)

Image generation models

Generative Model Nvidia Developer Blog

Training setup

Training a DDPM requires the following three components.

Noise Scheduler
Neural Network (Backbone)
Timestep Encoding
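
As a rough illustration of two of these components, the sketch below builds a noise schedule and a timestep encoding with NumPy. It assumes the linear schedule ($\beta_1 = 10^{-4}$ to $\beta_T = 0.02$, $T = 1000$) and the Transformer-style sinusoidal encoding used in the paper; the function names and the embedding size `dim=32` are hypothetical choices made for this example.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Noise scheduler: linearly spaced variances beta_1..beta_T (assumed defaults)."""
    return np.linspace(beta_start, beta_end, num_steps)

def timestep_embedding(t, dim=32):
    """Sinusoidal encoding of integer timesteps: [batch] -> [batch, dim] (dim must be even)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = np.asarray(t, dtype=np.float64)[:, None] * freqs[None, :]
    # First half holds sines, second half holds cosines.
    return np.concatenate([np.sin(args), np.cos(args)], axis=1)

betas = linear_beta_schedule()
emb = timestep_embedding(np.array([0, 10, 999]))
print(betas.shape, emb.shape)  # (1000,) (3, 32)
```

The encoding is what lets a single network be shared across all steps: the backbone receives it alongside the noisy image so it knows how much noise to expect.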

Forward Process

$q(x_{1:T}|x_0)=\prod_{t=1}^{T}q(x_t|x_{t-1})$

where

$q(x_t|x_{t-1})=N(x_t; \sqrt{1-\beta_t}\,x_{t-1}, \beta_t I)$

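$\beta_t$: the noise variance at step $t$

→ $q(x_t|x_0)=N(x_t; \sqrt{\bar{\alpha}_t}\,x_0, (1-\bar{\alpha}_t)I)$, where $\alpha_t := 1-\beta_t$ and $\bar{\alpha}_t := \prod_{s=1}^{t}\alpha_s$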
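
The closed form for $q(x_t|x_0)$ means $x_t$ can be drawn in one shot instead of looping over $t$ noising steps. A minimal sketch, assuming a linear $\beta$ schedule and using a random array as a stand-in for an image (`q_sample` is a hypothetical name, and `t` is a 0-based index here):

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)   # assumed linear schedule
alphas_bar = np.cumprod(1.0 - betas)    # alpha_bar_t = prod_{s<=t} (1 - beta_s)

def q_sample(x0, t, rng):
    """Draw x_t ~ q(x_t | x_0) directly; returns the sample and the noise used."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))  # stand-in for an image scaled to [-1, 1]
xt, eps = q_sample(x0, t=999, rng=rng)
print(alphas_bar[999])  # close to 0, so x_T is almost pure noise
```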

Backward Process

$p_{\theta}(x_{0:T}):=p_{\theta}(x_{T})\prod_{t=1}^{T}p_{\theta}(x_{t-1}|x_{t})$

where

$p_{\theta}(x_{t-1}|x_{t})=N(x_{t-1}; \mu_{\theta}(x_t,t), \sigma^2_t I)$ for $1 < t \leq T$

$\mu_{\theta}$: the mean

Intuitively, $x_{t-1}\approx x_t-\epsilon$: each step removes a little noise.
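
One backward step can be sketched as follows. This assumes the paper's parameterization of the mean, $\mu_\theta(x_t,t)=\frac{1}{\sqrt{\alpha_t}}\left(x_t-\frac{\beta_t}{\sqrt{1-\bar{\alpha}_t}}\epsilon_\theta(x_t,t)\right)$, and the choice $\sigma_t^2=\beta_t$. `dummy_eps_model` is a hypothetical placeholder for the trained network, so the loop only demonstrates the mechanics and does not produce an image.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)   # assumed linear schedule
alphas = 1.0 - betas
alphas_bar = np.cumprod(alphas)

def p_sample_step(xt, t, eps_model, rng):
    """One backward step x_t -> x_{t-1}; t is a 0-based index."""
    eps_hat = eps_model(xt, t)
    mean = (xt - betas[t] / np.sqrt(1.0 - alphas_bar[t]) * eps_hat) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)

def dummy_eps_model(xt, t):
    """Hypothetical stand-in for the trained noise predictor."""
    return np.zeros_like(xt)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))        # start from pure noise x_T
for t in reversed(range(len(betas))):     # T-1, ..., 0
    x = p_sample_step(x, t, dummy_eps_model, rng)
print(x.shape)
```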

Method

[Figure: overview of the method]

Neural Network

[Figure: neural network]

The input and output have the same shape.

Objective function

After a lot of magic...

$\epsilon$: the true noise

$\epsilon_\theta$: the predicted noise
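
Assuming the objective meant here is the paper's simplified loss, $L_{simple}=E_{t,x_0,\epsilon}\left[\|\epsilon-\epsilon_\theta(\sqrt{\bar{\alpha}_t}\,x_0+\sqrt{1-\bar{\alpha}_t}\,\epsilon,\ t)\|^2\right]$, one training step's loss could be estimated roughly as below. `simple_loss` and `dummy_eps_model` are hypothetical names, and the squared error is averaged per element, which differs from the squared norm only by a constant factor.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)   # assumed linear schedule
alphas_bar = np.cumprod(1.0 - betas)

def simple_loss(x0, eps_model, rng):
    """Monte Carlo estimate of the simplified objective for a batch x0 of shape [B, ...]."""
    batch = x0.shape[0]
    t = rng.integers(0, len(betas), size=batch)    # one random step per sample
    eps = rng.standard_normal(x0.shape)            # true noise
    ab = alphas_bar[t].reshape(batch, *([1] * (x0.ndim - 1)))
    xt = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps
    eps_hat = eps_model(xt, t)                     # predicted noise
    return np.mean((eps - eps_hat) ** 2)

def dummy_eps_model(xt, t):
    """Hypothetical stand-in for the network; always predicts zero noise."""
    return np.zeros_like(xt)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(4, 3, 8, 8))
loss = simple_loss(x0, dummy_eps_model, rng)
print(loss)  # roughly 1 for the zero predictor, since the noise has unit variance
```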

Experimental results

DEMO

[Figure: demo]

Impressions

This was fun.

It applies a concept from physics to ML. => Langevin dynamics

References

The magic: https://lilianweng.github.io/posts/2021-07-11-diffusion-models/
Simple implementation: https://colab.research.google.com/drive/1sjy9odlSSy0RBVgMTgP7s99NXsqglsUL?usp=sharing
