Alpha-VLLM / Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation
MIT License

Lumina Next img2img #91

Closed · thelemuet closed this 3 days ago

thelemuet commented 1 week ago

Hello! I have been playing with the SFT model and I'm really impressed so far!

Are there any plans to implement img2img for your image models? Beyond regular img2img usage as in SD, I get the feeling this model would work really well for upscaling/refining, since it is already able to generate very consistent, coherent images at higher resolutions (unlike, e.g., SDXL above 1024x1024).

gaopengpjlab commented 1 week ago

Can you suggest any popular image2image model? I will take a look at it. By the way, in the Lumina-T2X technical report, we show that training-free image2image editing is possible.

kijai commented 1 week ago

I'm also very interested in img2img and curious how the showcased image editing was achieved. I have tried the basics, using the encoded image instead of random noise and augmenting it with noise, but that doesn't seem to work. Or did you use reverse sampling?
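
For reference, here is a rough sketch of the kind of initialization I mean (SDEdit-style). This is not code from the repo; the flow-matching blend convention and the names are illustrative:

```python
import torch

def noisy_latent_init(image_latents: torch.Tensor, t0: float) -> torch.Tensor:
    """Blend an encoded image with Gaussian noise at level t0 in [0, 1].

    t0 = 1.0 reduces to plain text-to-image from pure noise; smaller t0
    keeps more of the source image. For this to work, sampling has to
    begin at the step whose timestep actually equals t0; running the
    full schedule from step 0 treats this latent as if it were pure noise.
    """
    noise = torch.randn_like(image_latents)
    # rectified-flow style interpolation (assumed convention: t=1 is noise)
    return (1.0 - t0) * image_latents + t0 * noise
```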

thelemuet commented 1 week ago

> Can you suggest any popular image2image model? I will take a look at it. By the way, in the Lumina-T2X technical report, we show that training-free image2image editing is possible.

I believe all Stable Diffusion models can do img2img (I have not tried SD3, but I assume it does).

Here is a quick example using SDXL, starting from a simple image of a 3D render where I want to keep the composition, lighting, and colors. I would set it to generate 30 steps but skip a few at the start, depending on how much I want to retain from the original image versus how much new detail I want the model to generate (a code sketch of this workflow follows the examples below). Here is the original image alongside results for different starting steps:

[image: original 3D render and img2img results at different starting steps]

And an upscale example:

[image: img2img upscale example]
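
Here is that workflow as a minimal diffusers sketch. The checkpoint, prompt, and strength value are illustrative; `strength` controls what fraction of the 30 steps actually run, i.e. the inverse of the skipped steps:

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

# illustrative checkpoint; any SDXL checkpoint works the same way
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

init_image = load_image("render.png").resize((1024, 1024))

# strength=0.4 runs roughly the last 12 of 30 steps, keeping most of the
# source composition/lighting; higher strength regenerates more detail
result = pipe(
    prompt="detailed photorealistic scene",  # illustrative prompt
    image=init_image,
    num_inference_steps=30,
    strength=0.4,
).images[0]
result.save("img2img_result.png")
```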

gaopengpjlab commented 1 week ago

We will release a demo similar to the diffusers StableDiffusionImg2ImgPipeline:

https://huggingface.co/docs/diffusers/en/api/pipelines/stable_diffusion/img2img

zhuole1025 commented 4 days ago

Hi! We just updated the code for img2img using Lumina Next: https://github.com/Alpha-VLLM/Lumina-T2X/blob/main/lumina_next_t2i_mini/scripts/sample_img2img.sh

Here is the demo:

[image: Lumina Next img2img demo]

zhuole1025 commented 4 days ago

> I'm also very interested in img2img and curious how the showcased image editing was achieved. I have tried the basics, using the encoded image instead of random noise and augmenting it with noise, but that doesn't seem to work. Or did you use reverse sampling?

Our implementation is identical to the one in diffusers, but you have to be careful with the start timestep, which is affected by the time-shifting scale. You are welcome to add this to ComfyUI and run some tests!
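
To illustrate the caveat, here is a minimal sketch of how the start timestep interacts with time shifting. The shift value, the schedule construction, and the rectified-flow interpolation are assumptions modeled on common flow-matching samplers, not the exact code in this repo:

```python
import numpy as np

def shifted_schedule(num_steps: int, shift: float = 4.0) -> np.ndarray:
    """Descending flow-matching timesteps in (0, 1], warped by the shift.

    Without the warp the schedule would be linear; the shift concentrates
    steps at high noise levels.
    """
    t = np.linspace(1.0, 1.0 / num_steps, num_steps)
    return shift * t / (1.0 + (shift - 1.0) * t)

def img2img_init(x0: np.ndarray, noise: np.ndarray, num_steps: int,
                 strength: float, shift: float = 4.0):
    """Pick the start index diffusers-style, then noise the encoded image
    to the *shifted* timestep value at that index.

    The pitfall: noising the latent to `strength` directly ignores the
    warp, so the latent's noise level no longer matches the step where
    sampling resumes.
    """
    ts = shifted_schedule(num_steps, shift)
    start = min(int(num_steps * (1.0 - strength)), num_steps - 1)
    t = ts[start]
    x_t = (1.0 - t) * x0 + t * noise  # assumed convention: t=1 is pure noise
    return x_t, ts[start:]
```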

kijai commented 4 days ago

> > I'm also very interested in img2img and curious how the showcased image editing was achieved. I have tried the basics, using the encoded image instead of random noise and augmenting it with noise, but that doesn't seem to work. Or did you use reverse sampling?
>
> Our implementation is identical to the one in diffusers, but you have to be careful with the start timestep, which is affected by the time-shifting scale. You are welcome to add this to ComfyUI and run some tests!

Got it. Thank you!

[image]