What is this paper about? 👋
Why previous VAEs on text cannot learn controllable latent representations as on images, as well as a fix to enable the first success towards controlled text generation without supervision.
Abstract 🕵🏻♂️
The variational autoencoder (VAE) has found success in modelling the manifold of natural images on certain datasets, allowing meaningful images to be generated while interpolating or extrapolating in the latent code space, but it is unclear whether similar capabilities are feasible for text considering its discrete nature. In this work, we investigate the reason why unsupervised learning of controllable representations fails for text. We find that traditional sequence VAEs can learn disentangled representations through their latent codes to some extent, but they often fail to properly decode when the latent factor is being manipulated, because the manipulated codes often land in holes or vacant regions in the aggregated posterior latent space, which the decoding network is not trained to process. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method significantly outperforms unsupervised baselines and is competitive with strong supervised approaches on text style transfer. Furthermore, when switching the latent factor (e.g., topic) during a long sentence generation, our proposed framework can often complete the sentence in a seemingly natural way -- a capability that has never been attempted by previous methods.
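To make the fix concrete, below is a minimal PyTorch-style sketch of the idea described in the abstract: constrain the posterior mean to a learned probability simplex, then manipulate codes within that simplex. All names here (`SimplexPosterior`, `n_vertices`, `manipulate`, ...) are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimplexPosterior(nn.Module):
    """Sketch: constrain the posterior mean to a learned probability simplex.

    The simplex is spanned by `n_vertices` learned basis vectors E. The
    encoder's hidden state is mapped to logits that are softmax-normalized
    into convex weights, so the mean mu = softmax(logits) @ E always lies
    inside the simplex.
    """

    def __init__(self, hidden_dim: int, latent_dim: int, n_vertices: int = 10):
        super().__init__()
        self.to_logits = nn.Linear(hidden_dim, n_vertices)
        # Learned simplex vertices (basis vectors), one row per vertex.
        self.E = nn.Parameter(torch.randn(n_vertices, latent_dim))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        w = F.softmax(self.to_logits(h), dim=-1)  # convex weights on the simplex
        return w @ self.E                         # posterior mean inside the simplex


def manipulate(w: torch.Tensor, factor: int, strength: float = 0.9) -> torch.Tensor:
    """Move the weight vector toward one simplex vertex (e.g., a target topic).

    The result is a convex combination of two points in the simplex, so it
    stays inside the simplex instead of landing in a vacant latent region.
    """
    target = F.one_hot(torch.tensor(factor), num_classes=w.size(-1)).float()
    return (1 - strength) * w + strength * target


# Usage (shapes only):
# post = SimplexPosterior(hidden_dim=512, latent_dim=64, n_vertices=10)
# mu = post(encoder_hidden)  # (batch, 64), always inside the learned simplex
```

Because a manipulated weight vector remains a convex combination of the learned vertices, the resulting code stays in a region the decoder was trained on, which is how the simplex constraint mitigates the latent vacancy problem described above.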
What can we learn from this paper? 🤔
You will clearly understand why style transfer did not work well with previous latent-variable-based VAEs.
You will learn a practical way to fix this problem.
Are there any related articles or issues worth reading together?
Although the method and theoretical grounding presented in this paper are clear, the paper was rejected from ICLR 2020.
The reviews cite the paper's lack of detail as the main reason for rejection.
Even though it was not accepted, we believe there is still plenty of insight to be gained from this paper.
(The reviews also described it as a borderline paper.)
Reference URL 🔗
https://openreview.net/forum?id=Hkex2a4FPr