lllyasviel / ControlNet

Let us control diffusion models!
Apache License 2.0

About dataset #588

Open Maeyon-Z opened 7 months ago

Maeyon-Z commented 7 months ago

To preprocess a dataset with images of varying sizes, should I follow these steps?

  1. Resize every image to the same size, such as 512 * 512

  2. Feed the resized image into the annotator to generate the control image

  3. Save the resized image and the control image as a training pair
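The three steps above could be sketched roughly as follows. This is only an illustration, not code from the ControlNet repository: `annotate` is a hypothetical callable standing in for whichever annotator you use (assumed here to take and return a Pillow image), and the `image/` and `control/` folder names are made up for the example. Note that step 1 as written resizes straight to a square, so non-square inputs get distorted.

```python
from pathlib import Path


def plan_outputs(src_path, out_dir):
    """Return (image_path, control_path) for one source file.

    The "image" and "control" folder names are illustrative assumptions.
    """
    stem = Path(src_path).stem
    out = Path(out_dir)
    return out / "image" / f"{stem}.png", out / "control" / f"{stem}.png"


def preprocess_one(src_path, out_dir, annotate, size=(512, 512)):
    """Resize one image, run the annotator on it, and save both results.

    `annotate` is a hypothetical function: Pillow image in, Pillow image out.
    """
    from PIL import Image  # Pillow, imported lazily so the helper above has no dependency

    image_path, control_path = plan_outputs(src_path, out_dir)
    image_path.parent.mkdir(parents=True, exist_ok=True)
    control_path.parent.mkdir(parents=True, exist_ok=True)

    # Step 1: resize to a fixed size (this ignores the aspect ratio).
    image = Image.open(src_path).convert("RGB").resize(size, Image.BICUBIC)
    # Step 2: generate the control image from the already resized image.
    control = annotate(image)
    # Step 3: save both so they form one training pair.
    image.save(image_path)
    control.save(control_path)
    return image_path, control_path
```

Running the annotator after the resize, as in step 2, keeps the control image pixel-aligned with the training image.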

geroldmeisinger commented 7 months ago

Yes, you can do that, but resizing everything to a square squishes the aspect ratios. See https://civitai.com/articles/2078/play-in-control-controlnet-training-setup-guide#heading-35441

Maeyon-Z commented 7 months ago

thanks

Maeyon-Z commented 7 months ago

I have read the fantastic document you wrote, but my English is not very good, and even after running some sentences through translation software I couldn't quite understand them. My understanding is that, for images of different sizes, we should first resize each image so that its shorter side is 512 (scaling the longer side proportionally), and then crop a 512 * 512 region from the center as the result. Is my understanding correct?
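The "resize the shorter side to 512, then center crop" procedure described above comes down to a little arithmetic. Here is a minimal sketch of it, assuming Pillow for the actual image operations; the function names are made up for illustration.

```python
def resize_and_center_crop_geometry(width, height, side=512):
    """Compute the resized size and the center crop box.

    Returns ((new_w, new_h), (left, top, right, bottom)).
    """
    # Scale so that the shorter side becomes exactly `side`.
    scale = side / min(width, height)
    new_w = max(side, round(width * scale))
    new_h = max(side, round(height * scale))
    # Center a side x side window inside the resized image.
    left = (new_w - side) // 2
    top = (new_h - side) // 2
    return (new_w, new_h), (left, top, left + side, top + side)


def resize_and_center_crop(image, side=512):
    """Apply the geometry above to a Pillow image."""
    from PIL import Image  # Pillow, imported lazily

    new_size, box = resize_and_center_crop_geometry(image.width, image.height, side)
    return image.resize(new_size, Image.BICUBIC).crop(box)
```

For example, a 1024x768 image is resized to 683x512 and then cropped with the box (85, 0, 597, 512), which discards about 85 pixels on each side but keeps the proportions intact.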

geroldmeisinger commented 7 months ago

Yes. At least, it's the most foolproof way and will give good results.

You can also crop and resize to 512x512 anywhere you want if you have more information about your image dataset. For example, if you work with facial images and know where the faces are, you might want to crop around the faces instead of the middle.

ControlNet is also able to handle images of size 512x(n*64), but you have to look that up in your training script.
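To illustrate the 512x(n*64) idea: one way to pick a non-square training size is to scale the shorter side to 512 and snap the longer side to the nearest multiple of 64. This is a sketch of the arithmetic only. The function names are hypothetical, and whether a given training script actually accepts non-square inputs still has to be checked in that script.

```python
def size_on_64_grid(width, height, short_side=512, multiple=64):
    """Scale the shorter side to `short_side` and snap both sides to `multiple`."""
    scale = short_side / min(width, height)

    def snap(value):
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(value * scale / multiple) * multiple)

    return snap(width), snap(height)


def is_on_grid(width, height, multiple=64):
    """True if both sides are exact multiples of `multiple`."""
    return width % multiple == 0 and height % multiple == 0
```

Snapping changes the aspect ratio slightly (a 1024x768 image maps to 704x512 rather than 683x512), so a small crop or a slight stretch is still needed to reach the snapped size.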

Maeyon-Z commented 7 months ago

I understand, thank you very much.

Namn23 commented 6 months ago

My original images are 256x128. They show pedestrians, and I need the whole image, so I plan to resize them to 512x256. Should I change anything in the config.yaml file? What does the parameter image_size: 64 mean? Also, I didn't see the n of n*64 anywhere.