About reference image size

ali-vilab / AnyDoor

Official implementations for paper: Anydoor: zero-shot object-level image customization

https://ali-vilab.github.io/AnyDoor-Page/

MIT License


About reference image size #93

Open szgy66 opened 6 months ago

szgy66 commented 6 months ago

Dear author, thank you for your contribution. I have achieved good results in many scenarios. Recently, however, I found that when my reference image is very small (about 50x50 pixels), the generated result on the target image is very poor, and sometimes completely inconsistent with the reference image. Why is this? Does AnyDoor support small reference images? Or does the mask size of the reference image need a certain relationship to the background image to generate better results? Looking forward to your reply.

XavierCHEN34 commented 6 months ago

First, make sure that your reference masks are in the correct format: our first step uses the reference mask to segment the reference object. Second, AnyDoor resizes the segmented object to 256x256 to fit the input size of DINOv2, so if your image is too small, the resized result will be blurry and vague.
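To make the two steps above concrete, here is a minimal sketch of how a small reference object gets enlarged. It is not the repository's code: the function name `segment_and_resize`, the white background fill, the pad-to-square step, and the nearest-neighbour resize are all assumptions made for illustration, and only the 256x256 target comes from the reply above. The point is the upscaling factor: a 50x50 object has to be enlarged roughly 5x in each direction, so there is very little real detail left for the encoder to work with.

```python
import numpy as np

TARGET_SIZE = 256  # size quoted in the reply above


def segment_and_resize(image, mask, target=TARGET_SIZE):
    """Hypothetical helper: mask the object, crop, pad to square, upscale.

    image: (H, W, 3) uint8 array; mask: (H, W) array, nonzero = object.
    Returns (resized, scale), where scale is the linear upscaling factor.
    """
    mask = mask > 0
    if not mask.any():
        raise ValueError("reference mask is empty")

    # Bounding box of the masked object.
    ys, xs = np.nonzero(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    # Keep object pixels only; the white background is an assumption.
    obj = np.where(mask[..., None], image, 255)[y0:y1, x0:x1]

    # Pad to a square so the aspect ratio survives the resize.
    h, w = obj.shape[:2]
    side = max(h, w)
    square = np.full((side, side, 3), 255, dtype=image.dtype)
    top, left = (side - h) // 2, (side - w) // 2
    square[top:top + h, left:left + w] = obj

    # Nearest-neighbour resize via index lookup (a stand-in for whatever
    # interpolation the real pipeline uses).
    idx = np.arange(target) * side // target
    resized = square[idx][:, idx]
    return resized, target / side


# A 50x50 reference whose mask covers the whole image.
ref = np.random.default_rng(0).integers(0, 256, (50, 50, 3), dtype=np.uint8)
full_mask = np.ones((50, 50), dtype=np.uint8)
out, scale = segment_and_resize(ref, full_mask)
print(out.shape, scale)  # (256, 256, 3) 5.12
```

If the mask covers only part of the 50x50 image, the factor is even larger, because the object is cropped to its bounding box before resizing.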

You could attach some of your examples here so I can check whether the results are correct.

szgy66 commented 6 months ago

Thank you for your reply. Due to GitHub restrictions, I put the results on Baidu Netdisk (link: https://pan.baidu.com/s/1pldgGZ1_rv0g9_qhrWlerQ?pwd=6r2i, extraction code: 6r2i), including the reference image, the background image, and the generated image. I obtained the reference image using Segment Anything. In the generated image, the rectangular boundary of the background mask is clearly visible.