YingqingHe / ScaleCrafter

[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.

img2img functionality? #13

Open · Gitterman69 opened this issue 11 months ago

Gitterman69 commented 11 months ago

I saw the scripts are there but there is no documentation...

NeoAnthropocene commented 11 months ago

You can use ControlNet with that i2i.py.

For example, the first image was created by DALL·E 3 and the second was created using it as a reference with the Canny ControlNet.

00001-upscaled-913383686856075-1.png

downloadfile.jpg

Gitterman69 commented 11 months ago

> You can use ControlNet with that i2i.py.
>
> For example, the first image was created by DALL·E 3 and the second was created using it as a reference with the Canny ControlNet.

can you provide a code snippet please??? 🥺

NeoAnthropocene commented 11 months ago

> can you provide a code snippet please??? 🥺

Sure, but I made a mistake and wrote it as i2i.py instead of text2image_xl_controlnet.py. Before running the text2image_xl_controlnet.py script, you need to install OpenCV in your virtual environment; otherwise it will error out because of the missing cv2 module.

First, be sure you're in the venv: `conda activate scalecrafter`

I'm running this on Windows WSL, but I want to show you the other options too.

👉 For Windows, if you have Anaconda installed: `pip install opencv-python` or `conda install -c conda-forge opencv`

👉 If you are on Linux, you can do: `pip install opencv-python` or `conda install opencv`
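A quick sanity check (just my suggestion) to confirm OpenCV is visible inside the activated environment before running the full script:

```python
# Run inside the activated scalecrafter env; an ImportError means OpenCV
# is not installed in this venv yet.
import cv2

print(cv2.__version__)  # should print something like 4.x.x
```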

After you have installed OpenCV in your venv, you can run the script below:

```
python3 text2image_xl_controlnet.py \
--pretrained_model_name_or_path stablediffusionapi/juggernaut-xl-v5 \
--validation_prompt "3D render of an adorable skeleton family in their cozy house, embellished with Halloween trinkets like carved pumpkins and draped cobwebs, eagerly anticipating the celebration. ultrarealistic photorealistic, raytracing, subsurface scattering, shadow blending, ultra-detail, cinematic" \
--seed 92183840 \
--config ./configs/sdxl_1792x1024.yaml \
--logging_dir ./outputs \
--image_path ./CN/cute-skeleton-family.png \
--controlnet_model_name_or_path diffusers/controlnet-canny-sdxl-1.0-mid
```
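For context, this is roughly what such a run looks like in plain diffusers code. This is only my sketch of the idea, not the repo's actual script, and it leaves out ScaleCrafter's re-dilation step; the model names and seed are the ones from the command above:

```python
# Rough sketch (not the repo's code) of an SDXL + Canny ControlNet run with diffusers.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# Turn the reference image into a Canny edge map for the ControlNet.
ref = np.array(Image.open("./CN/cute-skeleton-family.png").convert("RGB"))
edges = cv2.Canny(ref, 100, 200)
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0-mid", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stablediffusionapi/juggernaut-xl-v5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

result = pipe(
    prompt="3D render of an adorable skeleton family in their cozy house ...",
    image=control_image,
    controlnet_conditioning_scale=0.5,  # lower = follow the edge map less strictly
    generator=torch.Generator("cuda").manual_seed(92183840),
).images[0]
result.save("./outputs/skeleton-family.png")
```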

ℹ️ Notes:

* I have a reference image for ControlNet under the `CN` folder, named `cute-skeleton-family.png`.

* The first time you run this script, it will download both models, `stablediffusionapi/juggernaut-xl-v5` and `diffusers/controlnet-canny-sdxl-1.0-mid`. Give it some time.

* If you want a 16:9-like aspect ratio like I used here (`sdxl_1792x1024.yaml`), please refer to [this topic](https://github.com/YingqingHe/ScaleCrafter/issues/14).

Please let me know if that helps. Have fun!

Gitterman69 commented 11 months ago

> Sure, but I made a mistake and wrote it as i2i.py instead of text2image_xl_controlnet.py. […]

Thanks so much for your valuable inputs. It's running, but not working properly yet!

If I try:

```
python -u -B -W ignore text2image_xl_controlnet.py --pretrained_model_name_or_path stablediffusionapi/juggernaut-xl-v5 --validation_prompt "a zombie" --seed 1374245672 --image_path "D:\Scarlett Johansson 2048x2048.png" --config .\configs\sdxl_2048x2048.yaml --logging_dir logs --controlnet_model_name_or_path diffusers/controlnet-canny-sdxl-1.0-mid
```

Scarlett = init // zombie = the img2img result...

Is there a strength option so the result stays closer to the source? Can the source be a different size from the output? The settings in the /assets/dilate_settings/sdxl_2048x2048.txt files are resolution-specific, so what about when the size changes to something non-square?

Your help is highly appreciated! :)

Scarlett_Johansson_2048x2048

Gitterman69 commented 11 months ago

The ideal setup would be to have --width and --height parameters passed to the script, with the script creating the YAML files as needed. Then all the user needs to do is tell it what size the final image should be, and the script does the rest (resizing the init image, etc.).
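For what it's worth, here is a minimal sketch of how such a wrapper could look. It is hypothetical and not part of the repo: it assumes a config named `configs/sdxl_{W}x{H}.yaml` already exists for the requested size (see issue #14 for creating new ones) and only resizes the init image to match.

```python
# Hypothetical wrapper (not in the repo): pick a config by target size and
# resize the init image to match before calling text2image_xl_controlnet.py.
import argparse
from pathlib import Path
from PIL import Image

parser = argparse.ArgumentParser()
parser.add_argument("--width", type=int, required=True)
parser.add_argument("--height", type=int, required=True)
parser.add_argument("--image_path", type=Path, required=True)
args = parser.parse_args()

config = Path("./configs") / f"sdxl_{args.width}x{args.height}.yaml"
if not config.exists():
    raise SystemExit(f"No config for {args.width}x{args.height}; see issue #14.")

# Resize the init/reference image to the requested output size.
resized_path = args.image_path.with_name(
    f"{args.image_path.stem}_{args.width}x{args.height}{args.image_path.suffix}"
)
Image.open(args.image_path).resize((args.width, args.height)).save(resized_path)
print(f"Now run text2image_xl_controlnet.py with --config {config} --image_path {resized_path}")
```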

NeoAnthropocene commented 11 months ago

> Is there a strength option so the result stays closer to the source? Can the source be a different size from the output? The settings in the /assets/dilate_settings/sdxl_2048x2048.txt files are resolution-specific, so what about when the size changes to something non-square?

1. I couldn't see any way in the Python script to fine-tune the ControlNet. Yes, we should have these options if you ask me :) We would also need to test it before sending it to the render. A rough idea of what such an option could look like is sketched after this list.
2. I have no idea what will happen if you use a source size different from the rendered output. Expect possibly bad results.
3. You can see from my examples that they are non-square and they are fine. I would recommend checking my previous topic about it. I hope I understood your question correctly.
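If someone wants to try it, a hypothetical way to expose that knob would be a flag like the one below, forwarded to the diffusers pipeline call. I have not checked this against the actual script, so treat it as a sketch only.

```python
# Hypothetical flag (not in the current script) for ControlNet strength.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--controlnet_conditioning_scale", type=float, default=1.0,
    help="How strongly the output should follow the control image (lower = looser).",
)
args, _ = parser.parse_known_args()

# ...then wherever the pipeline is called:
# images = pipe(prompt, image=control_image,
#               controlnet_conditioning_scale=args.controlnet_conditioning_scale).images
```
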
NeoAnthropocene commented 11 months ago

> The ideal setup would be to have --width and --height parameters passed to the script, with the script creating the YAML files as needed. Then all the user needs to do is tell it what size the final image should be, and the script does the rest (resizing the init image, etc.).

My humble suggestion is to let the script detect the size of the image automatically - not us :)
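Something like this (again just a sketch, using the file name from my earlier example) would be enough to read the size straight from the reference image:

```python
# Read the target resolution from the ControlNet reference image itself.
from PIL import Image

width, height = Image.open("./CN/cute-skeleton-family.png").size
print(f"Reference is {width}x{height}; would look for ./configs/sdxl_{width}x{height}.yaml")
```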

heorhiikalaichev commented 9 months ago

But how do you use the image2image_controlnet.py script? The config files it asks for are not provided.