Open LukeWood opened 2 years ago
What is the scope of this? Are we waiting for a model release from Google Research?
just opening to triage - added a label for tracking.
We have completed text_to_image, I think the next should be img2img. Closing this request for now
@tanzhenyu (cc. @LukeWood )
We have completed text_to_image,
Actually, text-guided image generation can be categorized as a variant of text-to-image, but it's not the same. In text-guided image generation, there is a sample input image too.
See this for more details. 👉 https://github.com/huggingface/diffusers/issues/1254
Oh I see, so that's the img2img I was referring to then. I am working on it, so re-opening this
@tanzhenyu (cc. @miguelCalado) Here is another interesting variant of text-guided image-to-image. Posting it in case you're interested in taking it on.
Paper: Null-Text Inversion for Editing Real Images, by Google
Original code: in PyTorch (o_O)
TF 2 code: https://github.com/miguelCalado/prompt-to-prompt-tensorflow (uses keras-cv)
Hi!
Thank you for referring to my implementation of the Prompt-to-Prompt paper @innat. I would be happy to do a PR of the code (after some refactoring) if you guys want 😊 It is a cool method and a useful tool to have in the arsenal when dealing with cross-attention injection, which seems kinda popular these days.
But since the discussion is around text-guided image generation, why don't you start by adding negative prompting? It seems to be useful, especially when dealing with SD 2.x.
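For anyone following along, here is a rough sketch of what negative prompting usually amounts to in classifier-free guidance: the noise prediction for the negative prompt takes the place of the unconditional (empty-prompt) prediction. The function name `guided_noise` and the toy values are made up for illustration; this is not the KerasCV API, just the arithmetic on plain Python lists.

```python
def guided_noise(cond_pred, neg_pred, guidance_scale=7.5):
    """Combine two noise predictions with classifier-free guidance.

    cond_pred: noise predicted with the (positive) prompt embedding.
    neg_pred:  noise predicted with the negative prompt embedding, used
               where the unconditional prediction would normally go.
    Both are flat lists of floats here (hypothetical stand-ins for tensors).
    """
    return [n + guidance_scale * (c - n) for c, n in zip(cond_pred, neg_pred)]


# Toy values, chosen only to make the arithmetic easy to follow.
cond_pred = [1.0, -0.5, 0.25]
neg_pred = [0.5, 0.0, 0.25]
guided = guided_noise(cond_pred, neg_pred, guidance_scale=2.0)
print(guided)
```

With a scale of 1.0 the result is just the prompt-conditioned prediction; larger scales push the output further away from whatever the negative prompt describes.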
That would be great! Do you want to start with negative prompting, or should I? (I have been busy with the 0.4 release so this might take me 2 weeks)
Sure! It would be my pleasure!
I opened an issue #1206 for further discussion.
@miguelCalado Congrats on winning 1st place in the Keras community prize. 👍
Congrats @miguelCalado !! We really appreciate your work. If you have other ideas to improve our existing offering, please go ahead!
Thanks everyone for the wishes! I'm still in shock and this whole thing hasn't really settled down :sweat_smile:
Yes, I'm looking forward to contributing some more. It would be cool to see more of these implemented in KerasCV: multi-prompting (adding weights to parts of the prompts), different schedulers, more versions of Stable Diffusion (e.g. 1.5), other research works (e.g. image variation or Imagic), etc. There is a lot of room for contributions :grin:
But one PR at a time! The Prompt-to-Prompt PR will take me a bit, as I'm still adding the final touches (support for multiple batches and other small things :slightly_smiling_face:).
Thanks everyone!
Continuing the thread of text-guided image generation, this work also looks interesting: "it refines the cross-attention units to attend to all subject tokens in the text prompt and strengthen - or excite - their activations, encouraging the model to generate all subjects described in the text prompt." The video summarizes it pretty well.
It appears simple to implement (no training/finetuning), and it could work as a parameter on the text_to_image method (e.g. apply_excite_tokens=[2,5]). Might have a go after being done with the Prompt-to-prompt PR!
Thank you @Elvenson! This had been deprioritized but we'll take a look at your code.
https://twitter.com/_akhaliq/status/1582175757153230849?s=21&t=rCRiXt4-XW41JIyx-jwc-Q