Hi,
We haven't yet used a Transformer-based model as the decoder for controlled text generation. You should be able to use a Transformer-based decoder, or even a pretrained one like GPT-2. Switching the entire encoder-decoder network for a Transformer-based network might not work as well, because the vanilla decoder by itself is not a generative model.
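For example, something roughly like the following might work. This is only a sketch of one possible setup (feeding the latent code to GPT-2 as a soft prefix embedding via Hugging Face's `transformers`), not code from this repo, and names like `code_dim` and `code_to_prefix` are just illustrative:

```python
# Sketch: condition a pretrained GPT-2 decoder on a latent style/content code
# by prepending the code as a "prefix" embedding. Sizes and names are assumed.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

code_dim = 64                                  # size of the latent code z (assumed)
hidden = gpt2.config.n_embd                    # GPT-2 hidden size (768 for "gpt2")
code_to_prefix = nn.Linear(code_dim, hidden)   # maps z to one prefix embedding

z = torch.randn(1, code_dim)                   # e.g. a sampled / transferred code
prefix = code_to_prefix(z).unsqueeze(1)        # (batch, 1, hidden)

tokens = tokenizer("the movie was", return_tensors="pt")["input_ids"]
tok_emb = gpt2.transformer.wte(tokens)         # token embeddings (batch, len, hidden)

# Prepend the code embedding, then read off the next-token distribution.
inputs_embeds = torch.cat([prefix, tok_emb], dim=1)
logits = gpt2(inputs_embeds=inputs_embeds).logits
next_token = logits[:, -1].argmax(dim=-1)      # greedy next-token prediction
print(tokenizer.decode(next_token))
```

In practice you would train the prefix projection (and possibly fine-tune GPT-2) jointly with the style-transfer losses, rather than feeding an untrained prefix as above.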
Thank you for your response (and also, thank you for your PR adding the text style transfer example to PyTorch!). I had a follow-up question: you said, "Switching the entire encoder-decoder network for a Transformer-based network might not work as well, because the vanilla decoder by itself is not a generative model." Could you elaborate on what you mean when you say the decoder of a Transformer can't be used as a generative model? I thought you could give the decoder of a Transformer-based network a randomly generated code plus the previous tokens and have it generate new tokens.
You're right, sorry. I think you can replace the architecture and use it for style transfer too. I think there was a paper at IJCAI 2019 that used it for story completion: https://www.ijcai.org/proceedings/2019/0727.pdf
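To illustrate what I meant about using the decoder as a generative model: a vanilla `nn.TransformerDecoder` can still generate tokens autoregressively if you give it something to attend to, e.g. the latent content/style code as its cross-attention memory. A rough sketch of that idea (a toy example with made-up sizes and token IDs, not this repo's implementation):

```python
# Sketch: a vanilla Transformer decoder generating tokens greedily while
# attending to a latent code supplied as the cross-attention "memory".
import torch
import torch.nn as nn

vocab_size, hidden = 10000, 256                  # illustrative sizes (assumed)
embed = nn.Embedding(vocab_size, hidden)
layer = nn.TransformerDecoderLayer(d_model=hidden, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=4)
out_proj = nn.Linear(hidden, vocab_size)

z = torch.randn(1, 1, hidden)                    # latent content/style code as memory
generated = torch.tensor([[1]])                  # assumed BOS token id

for _ in range(20):                              # greedy autoregressive decoding
    tgt = embed(generated)
    L = tgt.size(1)
    # Causal mask so each position only attends to earlier positions.
    mask = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
    h = decoder(tgt, memory=z, tgt_mask=mask)
    next_tok = out_proj(h[:, -1]).argmax(dim=-1, keepdim=True)
    generated = torch.cat([generated, next_tok], dim=1)

print(generated)                                 # generated token ids
```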
Awesome - thank you for this link.
Hey all, this is a bit of a naive question, but I'm using your Controlled Generation of Text / Text Style Transfer code to do some style transfer, and I was wondering whether the quality of the generated text would improve if the repo used a Transformer-based model as the decoder. Have you already tried this, and if so, were there any difficulties or unexpected issues with that approach?