greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer #899

Open evancofer opened 6 years ago

evancofer commented 6 years ago

Autoencoders provide a powerful framework for learning compressed representations by encoding all of the information needed to reconstruct a data point in a latent code. In some cases, autoencoders can “interpolate”: By decoding the convex combination of the latent codes for two datapoints, the autoencoder can produce an output which semantically mixes characteristics from the datapoints. In this paper, we propose a regularization procedure which encourages interpolated outputs to appear more realistic by fooling a critic network which has been trained to recover the mixing coefficient from interpolated data. We then develop a simple benchmark task where we can quantitatively measure the extent to which various autoencoders can interpolate and show that our regularizer dramatically improves interpolation in this setting. We also demonstrate empirically that our regularizer produces latent codes which are more effective on downstream tasks, suggesting a possible link between interpolation abilities and learning useful representations.

https://arxiv.org/abs/1807.07543v2
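
For readers skimming this thread, the interpolation described in the abstract can be sketched in a few lines. This is a toy illustration rather than the authors' implementation: the random linear `encode`, `decode`, and `critic` functions, the dimensions, and the squared-error form of both loss terms are assumptions standing in for trained networks and for the exact objectives defined in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes and random linear maps standing in for trained networks.
DATA_DIM, LATENT_DIM = 8, 3
W_enc = rng.normal(size=(DATA_DIM, LATENT_DIM))
W_dec = rng.normal(size=(LATENT_DIM, DATA_DIM))
w_critic = rng.normal(size=DATA_DIM)


def encode(x):
    """Map data points to latent codes (toy linear encoder)."""
    return x @ W_enc


def decode(z):
    """Map latent codes back to data space (toy linear decoder)."""
    return z @ W_dec


def critic(x):
    """Toy critic: one scalar guess of the mixing coefficient per data point."""
    return x @ w_critic


def interpolate(x1, x2, alpha):
    """Decode the convex combination alpha*z1 + (1 - alpha)*z2 of two latent codes."""
    alpha = np.asarray(alpha, dtype=float)[..., None]
    z_mix = alpha * encode(x1) + (1.0 - alpha) * encode(x2)
    return decode(z_mix)


def critic_loss(x1, x2, alpha):
    """Assumed critic objective: recover alpha from the interpolated output."""
    pred = critic(interpolate(x1, x2, alpha))
    return float(np.mean((pred - alpha) ** 2))


def regularizer(x1, x2, alpha):
    """Assumed autoencoder penalty: push the critic's guess for interpolants toward 0."""
    pred = critic(interpolate(x1, x2, alpha))
    return float(np.mean(pred ** 2))


# Usage: mix two random batches with one coefficient per pair.
x1 = rng.normal(size=(4, DATA_DIM))
x2 = rng.normal(size=(4, DATA_DIM))
alpha = rng.uniform(0.0, 0.5, size=4)
mixed = interpolate(x1, x2, alpha)
```

In an actual training loop the critic and the autoencoder would be updated in alternation, with the penalty added to the reconstruction loss under some weight. That loop is omitted here because the thread does not describe it.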

evancofer commented 6 years ago

This seems highly relevant to the section on latent space manipulation.

stephenra commented 6 years ago

Thanks @evancofer for bringing this up. In the broader context of interpretable semantic information as well as traversing the latent space representation, would it be informative to add papers addressing disentanglement?

Tagging @gwaygenomics given his experience and research in this area.

gwaybio commented 6 years ago

Thanks @stephenra, this is definitely near the top of my list! Once I have read it, I will add a summary here.

gwaybio commented 6 years ago

Ok, I put aside some time this afternoon to give this a quick read. The article was very well written. I will add my summary below:

Computational Methods

Biological Relevance

There are many biological and biomedical applications in which accurate interpolation between existing data points can provide immediate benefit. Similar benchmarking of models on interpolation tasks in biological domains is definitely of interest. The present paper does not reference any specific biomedical applications, but it may be worth mentioning in the latent space manipulation section as a directly applicable future direction.