Disentangled Representation Learning for Non-Parallel Text Style Transfer

codertimo / paper-log

A place to manage papers to read and keep notes on the papers that have been read


What is this paper about? 👋

The paper tackles the problem of separating the entangled style and content representations in a language model.

It solves two tasks, style prediction and bag-of-words prediction, and proposes training with multi-task and adversarial objectives. The authors argue that the method is simple yet effective.

With the proposed method, the content and style representations that were entangled in the latent space are shown to be clearly separated.

The disentangled representations can then be applied to style transfer on non-parallel corpora.

Compared with previous work, the method achieves high scores in transfer accuracy, content preservation, and language fluency.

Abstract (Summary) 🕵🏻‍♂️

This paper tackles the problem of disentangling the latent representations of style and content in language models. We propose a simple yet effective approach, which incorporates auxiliary multi-task and adversarial objectives, for style prediction and bag-of-words prediction, respectively. We show, both qualitatively and quantitatively, that the style and content are indeed disentangled in the latent space. This disentangled latent representation learning can be applied to style transfer on non-parallel corpora. We achieve high performance in terms of transfer accuracy, content preservation, and language fluency, in comparison to various previous approaches.

What can you learn from reading this paper? 🤔

You can follow the path from a language model's representation to a representation that can be used for style transfer, and gain insight from it.

You can understand the entanglement of content and style representations in more detail.

You can learn how to separate entangled representations by using a different task for each.

You can learn how to train with multi-task and adversarial objectives at the same time.

Motivation

This paper proposes a method that separates the content and style latent spaces more reliably.
Content and style need to be separated well in order to achieve content preservation.

Method

The model is based on a VAE, and the sentence reconstruction loss is used for training.
Unlike a standard VAE, the encoder splits the latent variable into a content variable and a style variable, and the decoder uses the concatenation of the two.
To separate the content and style latent spaces completely, four mutually complementary losses are designed.
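
The split-and-concatenate step described above can be sketched in plain Python. This is only an illustration, not the authors' implementation: the function names `split_latent` and `decoder_input`, the style-first layout of the vector, the dimension sizes, and the `target_style` vector are all assumptions made for the example.

```python
from typing import List, Tuple


def split_latent(h: List[float], style_dim: int) -> Tuple[List[float], List[float]]:
    """Split an encoder output vector into (style, content) parts.

    Assumption: the first `style_dim` entries hold style, the rest content.
    """
    if not 0 < style_dim < len(h):
        raise ValueError("style_dim must leave room for both parts")
    return h[:style_dim], h[style_dim:]


def decoder_input(s: List[float], c: List[float]) -> List[float]:
    """Concatenate the style and content vectors to form the decoder input."""
    return s + c


# Toy encoder output with 2 style dimensions and 4 content dimensions.
h = [0.9, -0.1, 0.3, 0.5, -0.7, 0.2]
s, c = split_latent(h, style_dim=2)
z = decoder_input(s, c)  # same layout as h: reconstruction uses both parts

# Hypothetical style vector for the target style; pairing it with the
# unchanged content vector is one way the split could serve style transfer.
target_style = [-0.8, 0.4]
transferred = decoder_input(target_style, c)
```

If the two spaces really are disentangled, only the style part changes in `transferred` while the content part stays identical to `c`.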

Flow

style representation : Style-Classification(s) - Adversarial(Style-Classification(c))
content representation : BoW(c) - Adversarial(BoW(s))

Description of each loss

Style-Classification(s) : uses the style latent variable to classify which style (pos, neg) the sentence has.
Adversarial(Style-Classification(c)) : keeps style information out of the content latent variable, so that style classification from the content variable becomes impossible.
BoW(c) : uses the content latent variable to predict the BoW (Bag of Words) of the sentence. This helps the content variable capture more content information.
Adversarial(BoW(s)) : keeps content information out of the style latent variable, so that the BoW cannot be predicted from the style variable.
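
How the four losses could be combined with the reconstruction loss is sketched below. Several things here are assumptions that this note does not state: the BoW target is taken as the relative word frequency over a vocabulary, the adversarial terms are taken as the entropy of a discriminator's prediction (to be maximized by the encoder), and the weights in `lam` and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def bow_target(tokens: List[str], vocab: List[str]) -> List[float]:
    """Bag-of-words target: relative frequency of each vocabulary word."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no in-vocabulary tokens")
    return [counts[w] / total for w in vocab]


def cross_entropy(target: List[float], pred: List[float]) -> float:
    """-sum(t * log p); assumes pred is strictly positive where target is."""
    return -sum(t * math.log(p) for t, p in zip(target, pred) if t > 0)


def entropy(pred: List[float]) -> float:
    """Entropy of a predicted distribution; high when the predictor is unsure."""
    return -sum(p * math.log(p) for p in pred if p > 0)


def overall_loss(rec: float, mul_s: float, mul_c: float,
                 adv_s: float, adv_c: float, lam: Dict[str, float]) -> float:
    """Reconstruction + multi-task losses - adversarial (entropy) terms.

    mul_s: Style-Classification(s)      mul_c: BoW(c)
    adv_s: entropy of style predicted from c
    adv_c: entropy of BoW predicted from s
    Subtracting the entropies means minimizing this loss pushes the
    discriminators toward uninformative predictions (an assumption here).
    """
    return (rec
            + lam["mul_s"] * mul_s + lam["mul_c"] * mul_c
            - lam["adv_s"] * adv_s - lam["adv_c"] * adv_c)


vocab = ["food", "good", "bad"]
target = bow_target(["the", "food", "was", "good", "good"], vocab)
unsure = entropy([0.5, 0.5])    # discriminator cannot tell the style
certain = entropy([1.0, 0.0])   # discriminator is fully confident
```

A discriminator that cannot recover the style from `c` yields a higher entropy than a confident one, which is the behavior the adversarial terms are meant to encourage.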

Within each mini-batch, training is split into ordered steps:

foreach mini-batch do
    minimize J_dis(s)(θ_dis(s)) w.r.t. θ_dis(s);
    minimize J_dis(c)(θ_dis(c)) w.r.t. θ_dis(c);
    minimize J_ovr w.r.t. θ_E, θ_D, θ_mul(s), θ_mul(c);
end
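
The update order in this loop can be sketched as a small Python skeleton. It only shows the schedule: the three step functions are hypothetical callables that would each run one optimizer update on their own parameter group, and the stub steps in the usage part merely record that they were called.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Batch = TypeVar("Batch")


def train_epoch(
    batches: Iterable[Batch],
    step_dis_s: Callable[[Batch], float],
    step_dis_c: Callable[[Batch], float],
    step_overall: Callable[[Batch], float],
) -> List[Tuple[float, float, float]]:
    """Run the three updates in order for each mini-batch and collect losses."""
    history = []
    for batch in batches:
        loss_dis_s = step_dis_s(batch)   # update the style discriminator only
        loss_dis_c = step_dis_c(batch)   # update the content discriminator only
        loss_ovr = step_overall(batch)   # update encoder, decoder, multi-task heads
        history.append((loss_dis_s, loss_dis_c, loss_ovr))
    return history


# Usage with stub steps that only log the call order (no real training).
calls = []


def make_step(name: str) -> Callable[[int], float]:
    def step(batch: int) -> float:
        calls.append((name, batch))
        return 0.0
    return step


history = train_epoch(
    [0, 1], make_step("dis_s"), make_step("dis_c"), make_step("ovr")
)
```

Keeping each update behind its own callable makes it explicit that the discriminators and the autoencoder are optimized on separate parameter groups, even though they see the same mini-batch.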

Experiment

Dataset : Yelp and Amazon review datasets (labeled with positive and negative style)
Evaluation:
- Style Transfer Accuracy (STA) : evaluated with a classifier trained as a CNN (the classifier is highly accurate, so it is a reliable way to measure style transfer performance)
- Additional metrics : PPL, Cosine Similarity, Word Overlap, and Geometric Mean
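
A rough sketch of how some of these metrics could be computed is given below. The exact definitions used in the paper are not recorded in this note, so everything here is an assumption: word overlap is taken as the Jaccard overlap of word sets, STA as the fraction of outputs a classifier labels with the target style, and the geometric mean as a plain geometric mean of positive scores. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def style_transfer_accuracy(predicted: Sequence[str], target_style: str) -> float:
    """Fraction of transferred sentences classified as the target style."""
    if not predicted:
        raise ValueError("no predictions")
    return sum(1 for p in predicted if p == target_style) / len(predicted)


def word_overlap(src: List[str], out: List[str]) -> float:
    """Jaccard overlap of the word sets of source and output sentences."""
    a, b = set(src), set(out)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two sentence vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def geometric_mean(values: Sequence[float]) -> float:
    """Geometric mean of strictly positive scores."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


sta = style_transfer_accuracy(["pos", "neg", "pos", "pos"], "pos")
wo = word_overlap(["the", "food", "was", "good"], ["the", "food", "was", "bad"])
cs = cosine_similarity([1.0, 0.0], [1.0, 0.0])
gm = geometric_mean([4.0, 1.0])
```

Since a lower PPL is better while the other scores are better when higher, one option when aggregating is to feed 1/PPL into `geometric_mean`; that choice is an assumption of this sketch.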

Result

The method performs substantially better than previous work on PPL and STA.
