mit-han-lab / fastcomposer

[IJCV] FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
https://fastcomposer.mit.edu
MIT License

The quality does not come out as good as the example. #6

Closed SlZeroth closed 1 year ago

SlZeroth commented 1 year ago

I used https://github.com/mit-han-lab/fastcomposer to generate a two-subject image from two input images, but it is difficult to get a successful picture. Is there anything I have gotten wrong or need to change?

tianweiy commented 1 year ago

The most important parameter is alpha. For instance, the following set of images (from the CelebA test set) were all generated with alpha 0.6; all the other hyperparameters are the same (20 steps text-only + 30 steps subject-conditioned).
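For reference, here is a minimal sketch of the two-phase schedule described above, assuming alpha is the fraction of denoising steps that run subject-conditioned (0.6 × 50 = 30, matching the 20 + 30 split). The function and variable names are illustrative, not FastComposer's actual API:

```python
# Sketch of delayed subject conditioning: text-only guidance first (for
# layout), subject-augmented embeddings afterwards (for identity).
# Assumes standard diffusers UNet/scheduler interfaces; names illustrative.
def denoise(latents, unet, scheduler, text_embeds, subject_embeds,
            num_steps=50, alpha=0.6):
    scheduler.set_timesteps(num_steps)
    switch_step = num_steps - int(alpha * num_steps)  # 50 - 30 = 20 text-only steps
    for i, t in enumerate(scheduler.timesteps):
        # Larger alpha => more subject-conditioned steps => stronger
        # identity preservation, weaker adherence to the text prompt.
        cond = text_embeds if i < switch_step else subject_embeds
        noise_pred = unet(latents, t, encoder_hidden_states=cond).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents
```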

[image: grid of multiple two-subject generations at alpha 0.6]

Maybe tune that hyperparameter a bit more. Additionally, due to the small scale of FFHQ, some prompts are probably just impossible, as they are too far from the training distribution.

You can find the full test-set predictions at url.

JPW0080 commented 1 year ago

Try changing the model in fastcomposer/fastcomposer/utils.py to something other than the default runwayml/stable-diffusion-1-5. The results below are non-cherry-picked, generated with icbinp_v6 at default settings from source images that were 172x172. Japanese woodblock print prompting did not cooperate with the aforementioned model.
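For concreteness, here is a sketch of the kind of edit meant here, assuming utils.py sets the base model through an argparse default (the argument name is the one used later in this thread; the surrounding code is assumed, not quoted from the repo):

```python
import argparse

# Hypothetical excerpt in the spirit of fastcomposer/fastcomposer/utils.py:
# the base model is an argparse default, so either edit the default or pass
# the flag. It must point at a Diffusers-format model folder, not a .ckpt.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--pretrained_model_name_or_path",
    type=str,
    default="runwayml/stable-diffusion-v1-5",  # swap in e.g. a converted icbinp_v6 folder
)
args = parser.parse_args()
print(args.pretrained_model_name_or_path)
```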

A man <A> and a man <A> in the snow https://ibb.co/0V766qZ https://ibb.co/Gnds0xW https://ibb.co/3rqHLkm https://ibb.co/Cbh52NP https://ibb.co/B33kfqG https://ibb.co/bHTGdDg https://ibb.co/vDh7wtH https://ibb.co/9cBTg1Q

A man <A> and a woman <A> in the snow https://ibb.co/pnZPLXp https://ibb.co/GRNQd0D https://ibb.co/MhgbRtm https://ibb.co/m4bR87M https://ibb.co/dDz6CbD https://ibb.co/qWXdqqt https://ibb.co/kJHNdTP https://ibb.co/G2Hb2ZF

A man <A> and a woman <A> water color painting https://ibb.co/z6g9yHh https://ibb.co/FWDYjTn https://ibb.co/4mgFNcx https://ibb.co/sPq8SQM https://ibb.co/80cbf9z https://ibb.co/19DYvLD https://ibb.co/mDXt3NG https://ibb.co/Hzm6LRf

A man <A> and a woman <A> standing in the garden https://ibb.co/t8jDFnQ https://ibb.co/bzBLTVt https://ibb.co/svkGT8V https://ibb.co/FxrTwMY https://ibb.co/jTKvQq3 https://ibb.co/rMzyzjq https://ibb.co/KmPgm4T https://ibb.co/p3cp7MY

A woman <A> and a woman <A> gardening, backyard https://ibb.co/GH53wX5 https://ibb.co/5rY5wgk https://ibb.co/9YFyRgS https://ibb.co/yNNkqjP https://ibb.co/Dwy5smy https://ibb.co/xDpD1k4 https://ibb.co/rQh6ZGN https://ibb.co/tc0Fwtj

A woman <A> and a woman <A> in the jungle https://ibb.co/741hMCF https://ibb.co/vhC9wLp https://ibb.co/8rGfWgf https://ibb.co/R9YWGSW https://ibb.co/DQNF4KH https://ibb.co/b1PVjVs https://ibb.co/TMZWG8t https://ibb.co/0DpCgPW

tianweiy commented 1 year ago

Japanese woodblock print prompting did not cooperate with the aforementioned model.

It probably needs a smaller alpha. Sometimes this results in dissimilar figures, but it works reasonably robustly (you get at least 1-2 good examples among 4).

The following are with default settings (and 0.6 alpha):

[example images: subject_0000 through subject_0003, prompt_0006, various instances]

The full list of predictions on the test set can be found at URL.

NaokiSato102 commented 1 year ago

Could you explain the method to change the model in more detail? I tried using arguments that seemed appropriate, but it didn't work.

--pretrained_model_name_or_path "D:\Programs_D\_MachineLearning\StableDiffusionModels\Online\CheckPoint\anything-v4.0.ckpt"

OSError: It looks like the config file at 'D:\Programs_D\_MachineLearning\StableDiffusionModels\Online\CheckPoint\anything-v4.0.ckpt' is not a valid JSON file.

JPW0080 commented 1 year ago

The script expects a Diffusers-format model; see https://huggingface.co/spaces/diffusers/sd-to-diffusers for conversion.
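That is also why the OSError above occurs: from_pretrained expects a model directory (with a model_index.json and per-component subfolders), not a single .ckpt/.safetensors checkpoint. A minimal illustration, using paths from this thread:

```python
from diffusers import StableDiffusionPipeline

# A Diffusers-format model is a folder containing model_index.json plus
# unet/, vae/, text_encoder/, tokenizer/, scheduler/ subfolders.
pipe = StableDiffusionPipeline.from_pretrained("U:/test/anything-4.5")  # converted folder: loads

# A raw checkpoint file fails: from_pretrained tries to read the path as a
# config and raises "... is not a valid JSON file".
# StableDiffusionPipeline.from_pretrained(r"D:\...\anything-v4.0.ckpt")
```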

The sd-to-diffusers Space appears to be broken at the moment, so it is time to do some hoop-jumping in the stable-diffusion-webui environment. Using Command Prompt, activate the env that contains stable-diffusion-webui:

X:\stable-diffusion-webui\venv\Scripts\activate.bat

Run pip list and make note of any installed diffusers version (although when stable-diffusion-webui is started again, diffusers should be reinstalled if required). Then install the latest diffusers:

pip install -U git+https://github.com/huggingface/diffusers

Download https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-inference.yaml (right-click and save as) and place v1-inference.yaml into X:\stable-diffusion-webui\venv\Lib\site-packages\diffusers. Then download https://github.com/huggingface/diffusers/archive/refs/heads/main.zip, find the scripts folder within the zip, and drag and drop all of its files into X:\stable-diffusion-webui\venv\Lib\site-packages\diffusers.

Go back to the env-activated Command Prompt and change into that directory:

cd X:\stable-diffusion-webui\venv\Lib\site-packages\diffusers

python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path=X:/stable-diffusion-webui/models/Stable-diffusion/disneyPixarCartoon_v10.safetensors --from_safetensors --scheduler_type=ddim --dump_path=U:/test/anything-4.5 --original_config_file=X:/stable-diffusion-webui/venv/Lib/site-packages/diffusers/v1-inference.yaml
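As a quick sanity check on the conversion output (dump path taken from the command above), the folder should load as a pipeline:

```python
from diffusers import StableDiffusionPipeline

# A successful conversion leaves model_index.json plus unet/, vae/,
# text_encoder/, tokenizer/, scheduler/ subfolders under the dump path.
pipe = StableDiffusionPipeline.from_pretrained("U:/test/anything-4.5")
print(type(pipe.unet).__name__)  # -> UNet2DConditionModel
```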

When finished with the conversion(s), reinstall the original diffusers version if required:

pip install diffusers==0.16.1
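
Once converted, the Diffusers folder can be passed back to FastComposer via the flag shown earlier in this thread (the script name below is a placeholder for whichever entry point you run, and any other flags are unchanged):

python your_fastcomposer_entry_point.py --pretrained_model_name_or_path "U:/test/anything-4.5"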