Alpha-VLLM / Lumina-mGPT

Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
https://arxiv.org/abs/2408.02657

problem with Image editing #16

Open chkmook opened 2 months ago

chkmook commented 2 months ago

I read the technical report and observed the impressive performance of text-to-image generation using the provided demo code. However, there is an issue with the image editing process using demo_image2image.py: when I try to edit an image produced by t2i generation, it doesn't work, and even after changing the CFG scale there is no change in the output. Could you please explain whether there are hyperparameters to tune for i2i editing?

Here is the result of editing:

[Screenshot: editing result, 2024-08-21]
ChrisLiu6 commented 2 months ago

We do observe that for the editing task, the model has a strong inclination to keep the original image unchanged. I think this is related to the small amount of editing data used and to the nature of the editing task itself: during training, the model is supervised to keep most regions unchanged, so the gradient from the edited part gets swamped. Currently, for the editing task we suggest trying multiple times with different seeds (usually 5 is enough); some of the trials should work. See the sketch below.
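
A minimal sketch of that retry-with-different-seeds workaround, assuming the editing entry point from demo_image2image.py can be wrapped as a callable. The `edit_fn` signature used here is a hypothetical stand-in, not the repo's actual API; adapt it to whatever the demo script exposes.

```python
import torch

def sweep_seeds(edit_fn, source_image, prompt, num_trials=5, cfg_scale=4.0):
    """Run the same edit several times with different seeds and keep every output.

    edit_fn: hypothetical callable wrapping the demo's editing step,
             assumed to accept (image, prompt, cfg) keyword arguments.
    """
    results = []
    for seed in range(num_trials):
        # Reseed so each trial samples a different token sequence.
        torch.manual_seed(seed)
        edited = edit_fn(image=source_image, prompt=prompt, cfg=cfg_scale)
        results.append((seed, edited))
    return results
```

Keeping all outputs and picking the best one by eye matches the suggestion above, since only some of the trials are expected to produce a visible edit.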

chkmook commented 2 months ago

Thank you for your response. Then, could you provide an example of an image and prompt that can be easily edited?