VectorSpaceLab / OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
MIT License

OmniGen would be more useful if it only edited images instead of regenerating them #74

Open fluthru opened 6 hours ago

fluthru commented 6 hours ago

This model is very interesting, and I hope the idea catches on and we keep seeing more development of this concept. My biggest problem with using OmniGen is that it doesn't really edit images. If I upload an image made in Flux and ask it to edit an element, it will generate a new image with the requested edits (well, it'll try). That image will obviously be of much lower quality. The same applies to character consistency: I can't give it an image and tell it to replace the character with another, because it'll regenerate the whole image.

That means I cannot find a use case for this compared to simply inpainting with another model at much higher fidelity and quality. This is especially true because OmniGen can only make small changes while keeping the overall consistency of the original image.

deeplearn-art commented 5 hours ago

As for me, I was hoping to generate first and last frames for videos. It would be wonderful if I could bring storyboard drawings to life like this:

1.png - storyboard drawing of a character in profile
2.png - storyboard drawing where the character has turned her head to face the camera
X.png - image of a woman seen in profile

Then with input ["1.png","2.png","X.png"], use a few-shot prompt

"According to the following examples: input <|image_1|>, output <|image_2|>. Generate an output for the input <|image_3|>."
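
Concretely, the attempt looked roughly like this (a minimal sketch assuming the pipeline API and the `<img><|image_N|></img>` placeholder syntax from the README; the file names are just my local examples):

```python
from OmniGen import OmniGenPipeline

pipe = OmniGenPipeline.from_pretrained("Shitao/OmniGen-v1")

# 1.png / 2.png are the example (input, output) storyboard pair;
# X.png is the new input that should be transformed the same way.
prompt = (
    "According to the following examples: "
    "input <img><|image_1|></img>, output <img><|image_2|></img>. "
    "Generate an output for the input <img><|image_3|></img>."
)

images = pipe(
    prompt=prompt,
    input_images=["1.png", "2.png", "X.png"],
    height=1024,
    width=1024,
    guidance_scale=2.5,
    img_guidance_scale=1.6,  # image guidance, as in the README's image-conditioned examples
    seed=0,
)
images[0].save("output.png")
```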

Sadly, it doesn't work. I wonder if it could be done with some extra training?

staoxiao commented 5 hours ago

Thank you very much for your suggestion! OmniGen is the first attempt at universal image generation, and we plan to further optimize its performance in the future. We hope it can inspire more user-friendly unified models that replace the current complex workflows.

staoxiao commented 5 hours ago

@deeplearn-art , this is a very interesting scenario, but unfortunately the current OmniGen cannot handle it. You can try fine-tuning the model; we've released fine-tuning scripts: https://github.com/VectorSpaceLab/OmniGen/blob/main/docs/fine-tuning.md. Once you have your data ready, you can start the fine-tuning process. Feel free to open an issue if you have any problems with fine-tuning.
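
For the storyboard scenario above, the training data would be a JSONL file where each line pairs an instruction and its input images with the target output image. A rough sketch of preparing such a file (the field names below are assumptions for illustration; see docs/fine-tuning.md and the bundled toy data for the exact schema):

```python
import json

# Illustrative training records for the few-shot storyboard task above.
# Field names ("instruction", "input_images", "output_image") are assumptions;
# check docs/fine-tuning.md and the toy data for the exact schema.
samples = [
    {
        "instruction": (
            "According to the following examples: input <img><|image_1|></img>, "
            "output <img><|image_2|></img>. Generate an output for the input "
            "<img><|image_3|></img>."
        ),
        "input_images": ["storyboards/0001_a.png", "storyboards/0001_b.png", "photos/0001.png"],
        "output_image": "targets/0001.png",
    },
    # ... more (example pair, new input) -> target triples ...
]

with open("fewshot_train.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```

The resulting JSONL can then be passed to the training script as described in docs/fine-tuning.md.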