ali-vilab / Cones-V2

MIT License
502 stars, 18 forks

Training on people images #10

Open rezkanas opened 6 months ago

rezkanas commented 6 months ago

Hello,

Thank you for your excellent work. I'm eager to give it a try.

I plan to utilize it for training on images of people. However, upon reviewing your paper, I didn't find any mention of this use case, nor was it listed in your limitations section. Can your method be applied to images of people?

If so, I have a concern regarding the color context. From what I understand, for each concept, I need to specify an RGB color. I'm unsure how this would translate in the context of images of people. Could you provide some insight on this matter?

@Johanan528 @alibaba-oss @zyf0619sjtu

rezkanas commented 5 months ago

@Johanan528 @alibaba-oss @zyf0619sjtu I am continuing to experiment with your method. I trained two tokens on images of two people for 4000 steps, and during inference I used the following guidance_config.json:

[
    {
        "prompt": "znrz person and nsnn person, an unlikely duo, in the midst of a vibrant village life, their contrasting personalities evident, candid moment, Watercolor, 8K resolution.",
        "residual_dict": {
            "znrz": "/home/anasrezklinux/anas_april/cones_2/znrz/residual.pt",
            "nsnn": "/home/anasrezklinux/anas_april/cones_2/nsnn/residual.pt"
        },
        "color_context": {
            "0,255,0": [
                "znrz",
                2.5
            ],
            "255,0,0": [
                "nsnn",
                2.5
            ]
        },
        "guidance_steps": 50,
        "guidance_weight": 0.08,
        "weight_negative": -100000000.0,
        "layout": "/home/anasrezklinux/anas_april/prompt_layout/prompt_2_ID_39f26be4-6016-457d-8cfc-7e7e3d6f76fa.png",
        "subject_list": [
            [
                "znrz",
                1
            ],
            [
                "nsnn",
                4
            ]
        ]
    }
]

This produced the following image:

[image]

I suspect the cause is the color_context part. If it is something else, please advise.
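
Before blaming color_context itself, it may help to rule out simple mismatches inside the config. The sketch below is not part of Cones-V2. It is a hypothetical helper (`check_guidance_entry`) that checks only internal consistency: that every color key is a valid "R,G,B" string, that every token in color_context also appears in residual_dict and subject_list, and that each subject_list index points at its token in the prompt. It assumes the index is a 1-based position in the whitespace-split prompt, which matches the values above but is not confirmed by the repository. The residual paths in the example are shortened placeholders, and the layout image is not inspected.

```python
def check_guidance_entry(entry):
    """Return a list of problems found in one guidance config entry."""
    problems = []
    words = entry["prompt"].split()
    residual_tokens = set(entry["residual_dict"])
    subject_tokens = {token for token, _ in entry["subject_list"]}

    for color, (token, _weight) in entry["color_context"].items():
        # Each key should look like "R,G,B" with every channel in 0..255.
        parts = color.split(",")
        valid = len(parts) == 3 and all(
            p.strip().isdigit() and int(p) <= 255 for p in parts
        )
        if not valid:
            problems.append(f"bad RGB key: {color!r}")
        if token not in residual_tokens:
            problems.append(f"{token!r} has a color but no residual")
        if token not in subject_tokens:
            problems.append(f"{token!r} has a color but is not in subject_list")

    for token, index in entry["subject_list"]:
        # ASSUMPTION: index is a 1-based word position in the prompt.
        in_range = 1 <= index <= len(words)
        if not in_range or words[index - 1].strip(".,") != token:
            problems.append(f"{token!r} is not word {index} of the prompt")

    return problems


# Trimmed copy of the entry above; paths are placeholders.
example = {
    "prompt": "znrz person and nsnn person, an unlikely duo",
    "residual_dict": {"znrz": "znrz/residual.pt", "nsnn": "nsnn/residual.pt"},
    "color_context": {"0,255,0": ["znrz", 2.5], "255,0,0": ["nsnn", 2.5]},
    "subject_list": [["znrz", 1], ["nsnn", 4]],
}

print(check_guidance_entry(example))
```

An empty list here only means the entry is self-consistent under the stated assumption. It says nothing about whether the layout PNG actually contains pure (0,255,0) and (255,0,0) regions, which would need a separate check on the image.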