Closed vlinh128 closed 3 months ago
No, that doesn't make a lot of sense. One normally wants to replace something with another thing, so if you only provided the AI with the target prompt, it could not properly select what to detect. What is your use case for this, and why do you think this would be beneficial to have?
https://github.com/lllyasviel/Fooocus/discussions/3345#discussioncomment-10122908
Thank you for your quick response!
I agree that having only one target prompt is more likely to result in incorrect substitutions. Still, it would be great to have a mask segmentation detection layer driven by the target prompt, like ChatGPT-4o. For example, if the image shows a person wearing a t-shirt and jeans and the prompt is "wear a red dress", this layer could automatically select the t-shirt and jeans for replacement.
@vlinh128 Fooocus doesn't support natural language input prompts, which is why "wear a dress" will not work: there is no reference for which parts of the image have to be changed. This currently can't be done.
Is there an existing issue for this?
What would your feature do?
Version 2.5.0 includes a great mask segmentation feature, but currently, I need to go through two steps to generate new images for the "inpaint outpaint" model. The first step requires a "prompt_mask_text" to detect mask segmentations, and then the second step requires a "prompt_image_text" to generate the images. I wonder if it's possible to use just the "prompt_image_text" for both steps?
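To make the request concrete, here is a minimal sketch of the fallback I have in mind. It is not Fooocus code: `InpaintRequest` and `resolve_mask_prompt` are hypothetical names invented for illustration, and only the two field names come from the description above. Note that with this fallback the detector would search for the target text itself, which is the limitation raised in the replies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InpaintRequest:
    """Hypothetical container for the two prompts described above."""
    prompt_image_text: str                   # what to generate (step 2)
    prompt_mask_text: Optional[str] = None   # what to detect (step 1), optional here


def resolve_mask_prompt(request: InpaintRequest) -> str:
    """Return the text to use for mask detection.

    Uses prompt_mask_text when it is provided and non-blank; otherwise
    falls back to prompt_image_text so a single prompt drives both steps.
    """
    mask_text = (request.prompt_mask_text or "").strip()
    return mask_text if mask_text else request.prompt_image_text.strip()
```

For example, `resolve_mask_prompt(InpaintRequest("red dress"))` would reuse "red dress" for detection, while `InpaintRequest("red dress", "t-shirt, jeans")` keeps the current two-prompt behaviour.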
Proposed workflow
Additional information
No response