tobran / DF-GAN

[CVPR2022 oral] A Simple and Effective Baseline for Text-to-Image Synthesis

Text Embeddings #17

sabeehalam closed this issue 2 years ago

sabeehalam commented 2 years ago

How did you build the text encoder? Also, what are the contents of the metadata files?

tobran commented 2 years ago

The text encoder is a BiLSTM that was trained jointly with an image encoder using the DAMSM loss [1], as in many GAN-based models. You can also try other text encoders (BERT, CLIP).

[1] Xu T, Zhang P, Huang Q, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 1316-1324.
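
For readers wondering what such a BiLSTM text encoder might look like, here is a minimal PyTorch sketch. It is not the repository's actual code: the class name `BiLSTMTextEncoder`, the layer sizes (300-dimensional word embeddings, 256-dimensional outputs), and the padding index are assumptions chosen for illustration. It returns per-word features and a single sentence vector, which is the kind of output a DAMSM-style matching loss or a text-conditioned generator would typically consume.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class BiLSTMTextEncoder(nn.Module):
    """Hypothetical BiLSTM caption encoder; all sizes are illustrative."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        # Each direction gets half of hidden_dim, so the concatenated
        # forward/backward features have exactly hidden_dim channels.
        self.lstm = nn.LSTM(
            embed_dim, hidden_dim // 2, batch_first=True, bidirectional=True
        )

    def forward(self, tokens, lengths):
        # tokens: (batch, max_len) int64 token ids
        # lengths: (batch,) true caption lengths, each >= 1
        embedded = self.embedding(tokens)
        packed = pack_padded_sequence(
            embedded, lengths.cpu(), batch_first=True, enforce_sorted=False
        )
        packed_out, (h_n, _) = self.lstm(packed)
        # Per-word features, zero-padded back to max_len.
        word_emb, _ = pad_packed_sequence(
            packed_out, batch_first=True, total_length=tokens.size(1)
        )
        # h_n: (2, batch, hidden_dim // 2). Concatenate the final forward
        # and backward states to form one sentence-level vector.
        sent_emb = torch.cat([h_n[0], h_n[1]], dim=1)
        return word_emb, sent_emb


# Usage with random token ids (a real pipeline would use a tokenized vocabulary).
encoder = BiLSTMTextEncoder(vocab_size=1000)
tokens = torch.randint(1, 1000, (4, 12))
lengths = torch.tensor([12, 9, 5, 7])
word_emb, sent_emb = encoder(tokens, lengths)
print(word_emb.shape, sent_emb.shape)
```

Swapping in another encoder such as BERT or CLIP would mainly mean replacing this module with one that produces a sentence vector of the dimension the generator expects, possibly with a linear projection in between.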

sabhiram6 commented 11 months ago

How did you build the text encoder? Also, what are the contents of the metadata files?

Hey, did you try solving it with different encoders?