moatifbutt / color-peel

We propose to generate a series of geometric shapes with target colors to disentangle (or peel off) the target colors from the shapes. By jointly learning on multiple color-shape images, we found that the method can successfully disentangle the color and shape concepts.
https://moatifbutt.github.io/colorpeel/

Not getting correct results #3

Closed: salon2525 closed this issue 2 weeks ago

salon2525 commented 2 weeks ago

Hello,

It seems my prompts do not always produce the expected colors. For example, "a kid wearing sweater in <c1> and <c4> color" produces correct results only 3 or 4 times out of 12. Is it possible that the instances_3d.json file doesn't have enough training examples and I need to expand it?

Also, is it possible to give the model an image, say an image of a kid in a sweater, and then ask it to apply the c1 and c4 colors to it instead of using a text prompt?

Thanks in advance.

wangkai930418 commented 2 weeks ago

Hi,

For T2I generation and text-based image editing, it is sometimes a lottery in my experience. 3 out of 12 is already a good success rate. To make it easier, we have to seek help from text-image alignment or inversion techniques for T2I generation, such as Attend-and-Excite, Dynamic Prompt Learning (NeurIPS24), etc.

Best, Kai.
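
To put the "lottery" in numbers, here is a minimal sketch of the arithmetic. It assumes each generated image is an independent trial with the same success rate, which is only an approximation for seeds drawn from a diffusion model. The function names `prob_at_least` and `samples_needed` are made up for this illustration and are not part of the ColorPeel code.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k correct images among n samples), assuming independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def samples_needed(p: float, confidence: float) -> int:
    """Smallest n such that P(at least one correct image) >= confidence."""
    if not 0 < p <= 1 or not 0 <= confidence < 1:
        raise ValueError("need 0 < p <= 1 and 0 <= confidence < 1")
    n = 1
    # Increase the batch size until the chance of zero successes is small enough.
    while 1 - (1 - p) ** n < confidence:
        n += 1
    return n


if __name__ == "__main__":
    p = 3 / 12  # success rate reported in this thread (assumed constant per sample)
    print(f"P(>=1 correct in 12 samples) = {prob_at_least(1, 12, p):.3f}")
    print(f"P(>=3 correct in 12 samples) = {prob_at_least(3, 12, p):.3f}")
    print(f"samples for 95% chance of >=1 correct: {samples_needed(p, 0.95)}")
```

Under these assumptions, a batch of 12 seeds at a 3/12 success rate almost always contains at least one usable image, so sampling several seeds and picking by eye is a workable baseline before trying alignment or inversion methods.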

salon2525 commented 2 weeks ago

Hi @wangkai930418, thanks a lot for your response. I appreciate it. What would be the best way to reproduce the results shown in the paper? If I understood correctly, the paper presents its result images as T2I outputs. Is that not the case? Were the images shown in the paper selected from the roughly 3 out of 12 outputs that came out correctly? Thanks in advance.

wangkai930418 commented 2 weeks ago

Yes, for sure, we have to show cherry-picked results in the paper to make sure the color really covers the right region. We are using SD version 1, the same as you.

If you would like to improve the success rate, the easiest way is to constrain the generation process with Civitai-style positive and negative prompts. Another way is to consider the T2I alignment and inversion approaches.

Best, Kai.