krennic999 / STAR

STAR: Scale-wise Text-to-image generation via Auto-Regressive representations
https://krennic999.github.io/STAR/
92 stars, 1 fork

Using T5 as a text encoder? #4

Open kl2004 opened 2 weeks ago

kl2004 commented 2 weeks ago

Thank you for sharing the paper. The results look very impressive! I wonder whether using the T5 text encoder would improve prompt relevance compared to using CLIP. Have you tried it yet?

krennic999 commented 1 week ago

Thanks for the advice. We may replace CLIP and explore better text injection methods, such as T5, in subsequent versions.