RECOMMEND/VOTE Papers for the 2nd session

sxjscience commented 8 years ago

We can recommend some papers for further discussion under this issue. Include a link to the paper + the conference name and other related information (like the abstract, some basic descriptions, links to sample code or online demonstrations).

Please only include one topic per comment. For example, if you propose to discuss "paper X", which is heavily based on "paper Y", and you believe both have to be read together (possibly over multiple weeks), just create one comment for that. If you propose two unrelated papers, please create two comments.

The voting period ends on 2016/09/25. Feel free to recommend new papers.

Please vote using the "Thumbs up" emoji.

sxjscience commented 8 years ago

Continue reading the paper recommended in the previous session.

WaveNet: A Generative Model for Raw Audio (arXiv - submitted on 12 Sep 2016) This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show that it can be efficiently trained on data with tens of thousands of samples per second of audio. When applied to text-to-speech, it yields state-of-the-art performance, with human listeners rating it as significantly more natural sounding than the best parametric and concatenative systems for both English and Mandarin. A single WaveNet can capture the characteristics of many different speakers with equal fidelity, and can switch between them by conditioning on the speaker identity. When trained to model music, we find that it generates novel and often highly realistic musical fragments. We also show that it can be employed as a discriminative model, returning promising results for phoneme recognition.
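
The autoregressive factorization described in the abstract (each sample conditioned on all previous ones) can be sketched roughly as below. This is only an illustration, not the paper's model: `sequence_log_likelihood` and `uniform_predictor` are hypothetical names, the 256 quantization levels are an assumption, and the uniform predictor merely stands in for a trained network.

```python
import math

def sequence_log_likelihood(samples, predict):
    """Chain rule: log p(x) = sum over t of log p(x_t | x_1, ..., x_{t-1})."""
    total = 0.0
    for t, x_t in enumerate(samples):
        # The predictor only ever sees samples that came before position t.
        probs = predict(samples[:t])
        total += math.log(probs[x_t])
    return total

def uniform_predictor(history, num_levels=256):
    """Placeholder for a trained network (assumption): ignores the history."""
    return [1.0 / num_levels] * num_levels

# Three quantized audio samples, each an integer level in [0, 255].
log_lik = sequence_log_likelihood([0, 5, 255], uniform_predictor)
```

With the uniform placeholder, every sample contributes log(1/256); a trained model would instead assign higher probability to likely continuations.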

Also read

Conditional Image Generation with PixelCNN Decoders (NIPS 2016)

leezu commented 8 years ago

During our last discussion this paper was mentioned.

On the difficulty of training recurrent neural networks (Razvan Pascanu, Tomas Mikolov, Yoshua Bengio) (ICML 2013) There are two widely known issues with properly training recurrent neural networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994). In this paper we attempt to improve the understanding of the underlying issues by exploring these problems from an analytical, a geometric and a dynamical systems perspective. Our analysis is used to justify a simple yet effective solution. We propose a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem. We validate empirically our hypothesis and proposed solutions in the experimental section.
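
For reference, the gradient norm clipping idea mentioned in the abstract can be sketched in a few lines. This is a minimal plain-Python sketch, assuming the gradient is a flat list of floats; the function name `clip_by_norm` and the threshold value are hypothetical choices for illustration.

```python
import math

def clip_by_norm(grad, threshold):
    """Rescale grad so that its L2 norm does not exceed threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > threshold:
        # Keep the direction, shrink the magnitude to the threshold.
        scale = threshold / norm
        return [g * scale for g in grad]
    return list(grad)

# A gradient with norm 5 is rescaled to norm 1; a small gradient is untouched.
clipped = clip_by_norm([3.0, 4.0], threshold=1.0)
unchanged = clip_by_norm([0.3, 0.4], threshold=1.0)
```

Note that clipping changes only the step size, not the direction, so it addresses exploding gradients but does nothing for vanishing ones.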

leezu commented 8 years ago

Training Very Deep Networks (Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber) (NIPS 2015) Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success. However, training becomes more difficult as depth increases, and training of very deep networks remains an open problem. Here we introduce a new architecture designed to overcome this. Our so-called highway networks allow unimpeded information flow across many layers on information highways. They are inspired by Long Short-Term Memory recurrent networks and use adaptive gating units to regulate the information flow. Even with hundreds of layers, highway networks can be trained directly through simple gradient descent. This enables the study of extremely deep and efficient architectures.
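
A rough sketch of a single highway layer, to make the gating idea concrete: the output mixes a transformed input H(x) with the untouched input x through a sigmoid gate T(x), i.e. y = T(x) * H(x) + (1 - T(x)) * x. The helper names, the tanh nonlinearity, and the list-based weights are assumptions for illustration, not the authors' code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, x):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def highway_layer(x, W_h, b_h, W_t, b_t):
    """y = T(x) * H(x) + (1 - T(x)) * x, applied elementwise."""
    h = [math.tanh(v) for v in affine(W_h, b_h, x)]  # candidate transform H(x)
    t = [sigmoid(v) for v in affine(W_t, b_t, x)]    # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# With a strongly negative gate bias the layer passes its input through.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [0.5, -1.5]
carried = highway_layer(x, zeros, [0.0, 0.0], zeros, [-50.0, -50.0])
```

When the gate is closed (T near 0), the layer behaves like the identity, which is what lets information flow across many layers.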

sxjscience commented 8 years ago

Dueling Network Architectures for Deep Reinforcement Learning (Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas) (ICML 2016) In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning. Our dueling network represents two separate estimators: one for the state value function and one for the state-dependent action advantage function. The main benefit of this factoring is to generalize learning across actions without imposing any change to the underlying reinforcement learning algorithm. Our results show that this architecture leads to better policy evaluation in the presence of many similar-valued actions. Moreover, the dueling architecture enables our RL agent to outperform the state-of-the-art on the Atari 2600 domain.
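
To illustrate the two-estimator idea, here is a minimal sketch of how a state value and per-action advantages can be combined into Q-values by subtracting the mean advantage. The function name and the example numbers are hypothetical; in the paper both streams are outputs of a neural network.

```python
def dueling_q_values(state_value, advantages):
    """Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a'))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]

# V(s) = 2.0 and three actions with advantages 1, 0, -1 (mean 0).
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
```

Subtracting the mean makes the split identifiable: adding a constant to all advantages no longer changes the resulting Q-values.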

yzhangee commented 8 years ago

Variational Autoencoder for Deep Learning of Images, Labels and Captions (Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens and Lawrence Carin) (NIPS 2016) A novel variational autoencoder is developed to model images, as well as associated labels or captions. The Deep Generative Deconvolutional Network (DGDN) is used as a decoder of the latent image features, and a deep Convolutional Neural Network (CNN) is used as an image encoder; the CNN is used to approximate a distribution for the latent DGDN features/code. The latent code is also linked to generative models for labels (Bayesian support vector machine) or captions (recurrent neural network). When predicting a label/caption for a new image at test, averaging is performed across the distribution of latent codes; this is computationally efficient as a consequence of the learned CNN-based encoder. Since the framework is capable of modeling the image in the presence/absence of associated labels/captions, a new semi-supervised setting is manifested for CNN learning with images; the framework even allows unsupervised CNN learning, based on images alone.

yyuanad commented 8 years ago

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel) (NIPS 2016) This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN is a generative adversarial network that also maximizes the mutual information between a small subset of the latent variables and the observation. We derive a lower bound of the mutual information objective that can be optimized efficiently. Specifically, InfoGAN successfully disentangles writing styles from digit shapes on the MNIST dataset, pose from lighting of 3D rendered images, and background digits from the central digit on the SVHN dataset. It also discovers visual concepts that include hair styles, presence/absence of eyeglasses, and emotions on the CelebA face dataset. Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing supervised methods.
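
For discussion, the lower bound mentioned in the abstract has, as far as I recall from the paper, roughly the following form. The symbols are not in the abstract and are assumptions here: c is the structured latent code, z the noise, Q an auxiliary distribution approximating the posterior over c, and \lambda a weighting hyperparameter.

```latex
\min_{G,Q}\,\max_{D}\; V(D,G) - \lambda\, L_I(G,Q),
\qquad
L_I(G,Q) = \mathbb{E}_{c \sim P(c),\, x \sim G(z,c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
\;\le\; I\bigl(c;\, G(z,c)\bigr)
```

Because H(c) is constant for a fixed code prior, maximizing the bound amounts to training Q to recover c from the generated sample.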

yyuanad commented 8 years ago

Harnessing Deep Neural Networks with Logic Rules (Zhiting Hu, Xuezhe Ma, Zhengzhong Liu, Eduard Hovy, Eric P. Xing) (ACL 2016) (one of the 10 outstanding papers) Combining deep neural networks with structured logic rules is desirable to harness flexibility and reduce uninterpretability of the neural models. We propose a general framework capable of enhancing various types of neural networks (e.g., CNNs and RNNs) with declarative first-order logic rules. Specifically, we develop an iterative distillation method that transfers the structured information of logic rules into the weights of neural networks. We deploy the framework on a CNN for sentiment analysis, and an RNN for named entity recognition. With a few highly intuitive rules, we obtain substantial improvements and achieve state-of-the-art or comparable results to previous best-performing systems.

leezu commented 8 years ago

Thank you for the votes and the proposals.

As we agreed after the last session, it makes more sense to have one person who leads the discussion and prepares a short presentation of the main points. As I proposed the paper, I will prepare this for Friday.

I think we can copy the other proposed papers that got at least one vote to the issue for next week (?).
