eclipse-t2i / lambda-eclipse-inference

Official PyTorch implementation of "λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space"
https://eclipse-t2i.github.io/Lambda-ECLIPSE/
MIT License

Training code for lambda-ECLIPSE #2

Open j-min opened 5 months ago

j-min commented 5 months ago

Hi, thanks for sharing the inference code for lambda-ECLIPSE; nice work!

Do you plan to release the training code for lambda-ECLIPSE? Such resource-efficient training would be very useful for many low-resource groups. I'm looking forward to trying it.

Maitreyapatel commented 4 months ago

Hello @j-min, thanks for showing interest in our work.

TL;DR: We are building an end-to-end project that can help us utilize the true potential of CLIP models for T2I, but the public release may take some time.

Long answer:

Current progress/hype in T2I is largely due to three works: SDXL-Turbo (inference-time efficiency), DALL-E 3 (SOTA), and Playground v2.5 (highly aesthetic). The majority of such works use clever tricks (like synthetic captions, DPO, etc.) that we haven't explored yet in ECLIPSE/UnCLIP settings. Hence, the true potential is still a mystery to us!

Due to the effort involved in building the end-to-end system, we are projecting a partial release in May. We are also looking for help/collaborations to speed up the process and explore its use cases in unknown territories. If you are interested, please reach out to me at: maitreya.patel@asu.edu

Alternative: The training is pretty straightforward and uses a mix of contrastive learning and projection loss as described in the paper.
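
For readers who want a starting point, here is a rough sketch of what "a mix of contrastive learning and projection loss" could look like. It is not the authors' training code. It assumes the projection term is a mean squared error between the predicted and target image embeddings, and the contrastive term is an InfoNCE-style loss that matches each predicted embedding to its own text embedding within the batch. The function names, the `weight` of 0.2, and the `temperature` of 0.07 are all hypothetical placeholders; the paper is the reference for the actual formulation. Plain Python lists are used so the snippet runs without dependencies, whereas a real implementation would use PyTorch tensors.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def projection_loss(pred, target):
    """MSE between predicted and target embeddings, averaged over batch and dimensions."""
    total, count = 0.0, 0
    for p, t in zip(pred, target):
        for x, y in zip(p, t):
            total += (x - y) ** 2
            count += 1
    return total / count


def contrastive_loss(pred, text, temperature=0.07):
    """InfoNCE-style loss (assumed form): row i of pred should match row i of text."""
    pred_n = [normalize(p) for p in pred]
    text_n = [normalize(t) for t in text]
    loss = 0.0
    for i, p in enumerate(pred_n):
        logits = [dot(p, t) / temperature for t in text_n]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_denom - logits[i]
    return loss / len(pred_n)


def combined_loss(pred, target_image, text, weight=0.2, temperature=0.07):
    """Projection loss plus a weighted contrastive loss (weight is a hypothetical value)."""
    return projection_loss(pred, target_image) + weight * contrastive_loss(
        pred, text, temperature
    )


# Toy batch: two 2-d embeddings.
pred = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]
swapped_text = [[0.0, 1.0], [1.0, 0.0]]

print(combined_loss(pred, pred, aligned_text))
print(combined_loss(pred, pred, swapped_text))
```

In this toy batch the projection term is zero in both calls, so the difference comes entirely from the contrastive term, which is much larger when the text embeddings are paired with the wrong predictions.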