seongminp / transformers-into-vaes

Code for "Finetuning Pretrained Transformers into Variational Autoencoders"
https://aclanthology.org/2021.insights-1.5

SAMPLE #2

Open WYejian opened 3 years ago

WYejian commented 3 years ago

Hello, thank you for sharing the code. How can I sample from the latent space to generate text?

seongminp commented 3 years ago

Hi @WYejian.

T5VAE defined in model_t5.py initializes an internal t5 (T5ForConditionalGeneration) defined in vendor_t5.py.

I've modified T5ForConditionalGeneration in vendor_t5.py so it takes a sampled_z parameter: https://github.com/seongminp/transformers-into-vaes/blob/16205c8da8731b0097d80eeca219a878e0397beb/vendor_t5.py#L46

Since we don't call T5ForConditionalGeneration directly (and instead interface with its wrapper, T5VAE), you can pass the sampled z as one of the "kwargs" in T5VAE's forward: https://github.com/seongminp/transformers-into-vaes/blob/16205c8da8731b0097d80eeca219a878e0397beb/model_t5.py#L70
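For reference, here's a rough, untested sketch of what that could look like. The checkpoint path, the `latent_dim` attribute, the Lightning-style loading, and the sampling hyperparameters are all placeholders/assumptions you'll need to adapt to your own setup:

```python
import torch
from transformers import T5TokenizerFast
from model_t5 import T5VAE  # wrapper defined in this repo

tokenizer = T5TokenizerFast.from_pretrained("t5-base")

# Load a finetuned T5VAE checkpoint (assumes T5VAE is a PyTorch Lightning
# module; the checkpoint path is a placeholder).
model = T5VAE.load_from_checkpoint("checkpoints/t5_vae.ckpt")
model.eval()

with torch.no_grad():
    # Draw z from the standard Gaussian prior. The `latent_dim` attribute
    # name is an assumption; use whatever stores the latent size in your run.
    sampled_z = torch.randn(1, model.latent_dim)

    # Dummy encoder input; the modified forward in vendor_t5.py is expected
    # to replace the encoder-derived latent with the provided sampled_z.
    dummy = tokenizer("", return_tensors="pt")

    # Depending on your transformers version, extra kwargs like sampled_z may
    # need to be forwarded explicitly in prepare_inputs_for_generation.
    output_ids = model.t5.generate(
        input_ids=dummy.input_ids,
        attention_mask=dummy.attention_mask,
        sampled_z=sampled_z,
        max_length=64,
        do_sample=True,
        top_p=0.95,
    )

print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```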

I haven't tested generation extensively with this code, but it should work the same as generating with any other encoder-decoder network.

Hope that helps!

WYejian commented 3 years ago

Thank you for your reply. When I run the model, it raises "No module named 'generate'". What should I do? Also, can the pretraining data and the finetuning data be different?

seongminp commented 3 years ago

Just uploaded generate.py! Thanks for pointing that out.

Yes. I think it would work better if the same data were used for pretraining and finetuning, but I wanted to work with the datasets used in previous research. The training data is just raw text, so ideally the choice of finetuning dataset shouldn't matter for performance, but in practice domain shift between corpora degrades benchmark performance.