Open waxz opened 7 years ago
Method description: Images in CelebA have 40 binary attributes. I thought it would be nice to be able to take an image of a face and modify it to make it look younger or change the hair color. Remember from part 1 that one of the promises of GANs is that you can perform operations in latent space that are reflected in feature space. (The CelebA dataset is labeled with 40 binary attributes; by operating on the latent variables we can generate images with different attributes.) In order to modify attributes, first I needed to find a z vector representing each attribute. So first I used E to compute the z vector for each image in the dataset. Then I calculated attribute vectors as follows: for example, to find the attribute vector for “young” I subtracted the average z vector of all images that don’t have the “young” attribute from the average z vector of all images that have it. I ended up with a 40×100 matrix Z_{attr} of characteristic z vectors, one for each of the 40 attributes in CelebA.
Computing the latent vector for each attribute: first compute the average latent vector of the encodings of all images that have the attribute (e.g. “young”), and the average latent vector of the encodings of all images that don’t; subtracting the two averages gives the latent vector for the “young” attribute.
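The attribute-vector computation above can be sketched as follows (a minimal numpy sketch with random stand-ins for the encoder outputs and CelebA labels; `attribute_vectors` is a hypothetical helper name, not from the original post):

```python
import numpy as np

def attribute_vectors(z, attrs):
    """Compute one characteristic z vector per binary attribute.

    z:     (n_images, z_dim) latent codes produced by the encoder E
    attrs: (n_images, n_attrs) binary attribute labels (0/1)

    For each attribute, subtract the mean z of images *without* the
    attribute from the mean z of images *with* it.
    """
    z_attr = np.zeros((attrs.shape[1], z.shape[1]))
    for a in range(attrs.shape[1]):
        has = attrs[:, a] == 1
        z_attr[a] = z[has].mean(axis=0) - z[~has].mean(axis=0)
    return z_attr

# Toy stand-in data: 64 "images", 100-d latents, 40 attributes (as in CelebA)
rng = np.random.default_rng(0)
z = rng.normal(size=(64, 100))
attrs = rng.integers(0, 2, size=(64, 40))
Z_attr = attribute_vectors(z, attrs)
print(Z_attr.shape)  # (40, 100)
```

To make a face look younger, one would then add a multiple of the “young” row of `Z_attr` to an image's latent code before decoding.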
Google Scholar search
Latent space analysis
For each paper: 1. its purpose and results; 2. its implementation method.
Semantically Decomposing the Latent Spaces of Generative Adversarial Networks https://arxiv.org/abs/1705.07904 https://github.com/chrisdonahue/sdgan
https://arxiv.org/abs/1709.02023 CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training https://github.com/mkocaoglu/CausalGAN
Compressed Sensing using Generative Models https://arxiv.org/abs/1703.03208
https://arxiv.org/abs/1710.11381 Semantic Interpolation in Implicit Models
https://arxiv.org/abs/1710.07035 Generative Adversarial Networks: An Overview
https://arxiv.org/abs/1707.05776 Optimizing the Latent Space of Generative Networks
PRECISE RECOVERY OF LATENT VECTORS FROM GENERATIVE ADVERSARIAL NETWORKS https://github.com/yxlao/pytorch-reverse-gan
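The core idea of recovering a latent vector from a generated image can be sketched as gradient descent on the reconstruction error ||G(z) − x||². This is a simplified numpy illustration, not the paper's exact procedure (which uses stochastic clipping) nor the pytorch-reverse-gan code; a toy linear "generator" G(z) = Wz stands in for a trained network so the gradient has a closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(784, 100))          # toy linear "generator": G(z) = W @ z
z_true = rng.normal(size=100)
x = W @ z_true                           # "image" produced from z_true

# Recover z by gradient descent on ||G(z_hat) - x||^2; for a real GAN
# you would backpropagate through the network instead.
z_hat = np.zeros(100)
lr = 1e-4
for _ in range(2000):
    grad = 2 * W.T @ (W @ z_hat - x)     # d/dz ||W z - x||^2
    z_hat -= lr * grad

print(np.linalg.norm(z_hat - z_true))    # should be close to 0
```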
Memory networks
https://arxiv.org/abs/1702.04648 GENERATIVE TEMPORAL MODELS WITH MEMORY
https://arxiv.org/abs/1710.07829 Superposed Episodic and Semantic Memory via Sparse Distributed Representation
video
MoCoGAN
https://arxiv.org/abs/1706.08033 DECOMPOSING MOTION AND CONTENT FOR NATURAL VIDEO SEQUENCE PREDICTION
https://arxiv.org/abs/1702.04125 One-Step Time-Dependent Future Video Frame Prediction with a Convolutional Encoder-Decoder Neural Network
https://arxiv.org/abs/1709.07592 Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
https://arxiv.org/abs/1703.02291 Triple Generative Adversarial Nets
https://arxiv.org/abs/1708.05980 Attentive Semantic Video Generation using Captions
paper: A Survey on Deep Video Prediction
Causal GAN TensorFlow https://github.com/mkocaoglu/CausalGAN
Notebook for inference mode
https://github.com/createamind/busyplan/blob/master/zhangwei/inference.ipynb
Summary of latent-space papers read:
Semantically Decomposing the Latent Spaces of Generative Adversarial Networks https://arxiv.org/abs/1705.07904
1) The paper splits the latent variable into two parts: Zi represents the object's identity, and Zo represents the observation (e.g. pose, lighting, hair).
2) Each sample consists of two real images of the same object (with different observations), labeled 1, while the Generator produces two fake images from latent codes (Zi, Zo1) and (Zi, Zo2), labeled 0. All of these images are fed to the discriminator, and the discriminator's and generator's parameters are updated iteratively.
3) Training only counts as successful when the two generated images show the same object with clearly different observations, and look as realistic as the real images.
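The SD-GAN latent sampling described in point 2) can be sketched as follows (a hedged sketch; the dimensions and variable names are illustrative assumptions, not taken from the paper's code):

```python
import numpy as np

# Share one identity code Zi across a pair of images; draw two
# independent observation codes Zo1, Zo2 for the varying factors.
rng = np.random.default_rng(0)
d_i, d_o = 50, 50                       # identity / observation dims (assumed)
zi = rng.normal(size=d_i)               # identity: same for both images
zo1 = rng.normal(size=d_o)              # observation for image 1
zo2 = rng.normal(size=d_o)              # observation for image 2
pair = np.stack([np.concatenate([zi, zo1]),
                 np.concatenate([zi, zo2])])  # two latent inputs for G
print(pair.shape)  # (2, 100)
```

The discriminator then sees this generated pair (label 0) alongside a pair of real photos of one identity (label 1).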
Compressed Sensing using Generative Models https://arxiv.org/abs/1703.03208 This paper is about using GANs for data processing: for example, given a compressed image, how to use a GAN to recover the original uncompressed image. It says nothing about latent-variable analysis.
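The paper's core recovery idea can be sketched as: given a few random measurements y = Ax, search the generator's latent space for a z minimizing ||A G(z) − y||². This is a hedged numpy toy, with a linear generator G(z) = Wz standing in for a trained GAN (an assumption for brevity), not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 784, 50, 20                      # signal dim, #measurements, latent dim
W = rng.normal(size=(n, k)) / np.sqrt(n)   # toy linear "generator"
A = rng.normal(size=(m, n)) / np.sqrt(m)   # random measurement matrix
z_true = rng.normal(size=k)
x_true = W @ z_true
y = A @ x_true                             # m << n compressed measurements

# Gradient descent on ||A G(z) - y||^2 over the latent z.
z = np.zeros(k)
M = A @ W                                  # composed linear map (toy case only)
for _ in range(10000):
    z -= 0.1 * 2 * M.T @ (M @ z - y)       # gradient of ||M z - y||^2

print(np.linalg.norm(W @ z - x_true))      # reconstruction error
```

Even though m = 50 measurements is far fewer than the n = 784 signal dimensions, the signal is recoverable because it lies on the k = 20 dimensional range of the generator.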
https://arxiv.org/abs/1710.11381 Semantic Interpolation in Implicit Models
This paper addresses how to interpolate when evaluating GAN models.
1) The standard approach draws latents from a normal distribution and interpolates linearly, so interpolated points can correspond to images of very low probability.
2) Spherical interpolation was therefore proposed, still with normally distributed latents; the authors argue that although the results look interesting, they may not be the semantic interpolations we actually want.
3) The paper finally proposes drawing latents from a gamma distribution while keeping linear interpolation, and presents many example images as evidence that this yields better interpolations.
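The two baseline schemes contrasted in points 1) and 2) can be sketched as follows (a generic illustration of linear vs. spherical interpolation between two latent codes, not code from the paper):

```python
import numpy as np

def lerp(z0, z1, t):
    """Linear interpolation. With a Gaussian prior, midpoints have a
    smaller norm than typical samples, i.e. they land in low-density
    regions of latent space."""
    return (1 - t) * z0 + t * z1

def slerp(z0, z1, t):
    """Spherical interpolation. Follows a great-circle arc, keeping the
    interpolant at a norm more typical of Gaussian samples."""
    cos_omega = np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
z0, z1 = rng.normal(size=100), rng.normal(size=100)
mid_lin = lerp(z0, z1, 0.5)
mid_sph = slerp(z0, z1, 0.5)
# High-dimensional Gaussian samples concentrate near norm sqrt(100) = 10;
# the linear midpoint typically falls well inside that shell.
print(np.linalg.norm(mid_lin), np.linalg.norm(mid_sph))
```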
https://arxiv.org/abs/1710.07035 Generative Adversarial Networks: An Overview — a survey of GANs, giving a high-level taxonomy and a discussion of applications.
https://arxiv.org/abs/1709.02023 CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training
1) Result: when the causal relationships between labels are clear, a prior causal graph can be used to extract a given attribute and apply it outside that attribute's original distribution: for example, generating women with mustaches.
2) It does not discuss how to extract Z.
beta vae: https://scholar.google.com/scholar?um=1&ie=UTF-8&lr&cites=9898751721018572733
https://arxiv.org/abs/1711.00464
https://arxiv.org/abs/1711.00583
https://arxiv.org/abs/1711.00848 "our approach does not introduce any extra conflict between disentanglement of the latents and the observed data likelihood, which is reflected in the overall quality of the generated samples that matches the VAE and is much better than β-VAE. This does not come at the cost of higher entanglement and our approach also outperforms β-VAE in disentangling the latents as measured by various quantitative metrics."
https://arxiv.org/abs/1707.08475
https://arxiv.org/abs/1611.01353 Information Dropout https://github.com/ucla-vision/information-dropout https://github.com/ganow/keras-information-dropout prove that we can promote the creation of disentangled representations simply by enforcing a factorized prior
DR-GAN http://cvlab.cse.msu.edu/pdfs/Tran_Yin_Liu_CVPR2017.pdf https://github.com/kayamin/DR-GAN
paper
1. Learnable Explicit Density for Continuous Latent Space and Variational Inference https://arxiv.org/pdf/1710.02248v1.pdf
2. Latent Space Oddity: on the Curvature of Deep Generative Models https://arxiv.org/pdf/1710.11379v1.pdf
3. GLSR-VAE: Geodesic Latent Space Regularization for Variational AutoEncoder Architectures https://arxiv.org/pdf/1707.04588v1.pdf
4. A CLASSIFICATION-BASED PERSPECTIVE ON GAN DISTRIBUTIONS https://arxiv.org/pdf/1711.00970v1.pdf
https://arxiv.org/abs/1706.00409 Fader Networks: Manipulating Images by Sliding Attributes
Idea: in a video of a road with many cars driving on it, use the moving cars to label the visual distinction between road and non-road.
https://devblogs.nvidia.com/parallelforall/photo-editing-generative-adversarial-networks-2/ Implementation based on NVIDIA DIGITS: https://github.com/gheinrich/DIGITS-GAN/blob/DIGITS-GAN-v0.1/examples/gan/README.md