GeGehub opened this issue 1 month ago
You can use our released model to get the test results. For small objects or objects with simple shapes, the generated shadows are usually acceptable. For large objects or objects with complex shapes, the results are often unsatisfactory, but you can sample multiple times and pick a relatively good result.
You can also try the shadow generation function in our libcom toolbox https://github.com/bcmi/libcom.
Thanks for your reply. The figure above shows the results I tested. I was just wondering whether these results are reasonable. In addition, I saw that in your test set, most images contain only one object. In this case, how does your model learn the intensity through the intensity encoder?
By the way, since your dataset contains both the 256-resolution and high-resolution versions, which resolution is used for training and testing?
> The figure above shows the results I tested. I was just wondering whether these results are reasonable. In addition, I saw that in your test set, most images contain only one object. In this case, how does your model learn the intensity through the intensity encoder?
The current results are normal. In some cases, only the foreground inserted into the background image needs a synthesized shadow. In this situation, there is no issue of inconsistent shadow intensity between the background and foreground objects, so there is no need to learn the shadow intensity of the background. Additionally, we considered that some users may not be able to obtain the shadow masks of the background objects, so this situation is also simulated in the test set.
> By the way, since your dataset contains both the 256-resolution and high-resolution versions, which resolution is used for training and testing?
To align with previous work, we conduct testing and training at a resolution of 256.
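For reference, a minimal preprocessing sketch for bringing images to the 256×256 training resolution. The function name and the use of PIL with bicubic resampling are my assumptions for illustration, not the repo's actual data loader:

```python
from PIL import Image

def load_for_model(path, size=256):
    # Resize an input image to the training/testing resolution (256x256).
    # Bicubic resampling is an assumed choice; check the repo's dataloader.
    img = Image.open(path).convert('RGB')
    return img.resize((size, size), Image.BICUBIC)
```
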
I tried to re-train your model without modifying any code, but the model seems to have barely converged. The test result is shown above.
> I tried to re-train your model without modifying any code, but the model seems to have barely converged. The test result is shown above.
You may need to load pre-trained weights for the ControlNet branch:
```python
model = create_model('./models/cldm_v15.yaml').cpu()
model_weight = load_state_dict(resume_path, location='cpu', strict=False)
# Extend the first hint-block conv with an extra grayscale input channel.
rgb_weight = model_weight['control_model.input_hint_block.0.weight']
gray_weight = 0.2989 * rgb_weight[:, 0:1, :, :] + 0.5870 * rgb_weight[:, 1:2, :, :] + 0.1140 * rgb_weight[:, 2:3, :, :]
model_weight['control_model.input_hint_block.0.weight'] = torch.cat([rgb_weight, gray_weight], dim=1)
```
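The channel surgery in that snippet can be sanity-checked in isolation. A minimal self-contained sketch, using a random tensor in place of the real checkpoint weight (the 16-filter, 3×3-kernel shape is an assumption for illustration):

```python
import torch

# Stand-in for 'control_model.input_hint_block.0.weight', which in the real
# checkpoint has shape (out_channels, 3, kH, kW); the sizes here are assumed.
rgb_weight = torch.randn(16, 3, 3, 3)

# Collapse the RGB input channels into one grayscale channel using the
# ITU-R BT.601 luma coefficients, matching the snippet above.
gray_weight = (0.2989 * rgb_weight[:, 0:1, :, :]
               + 0.5870 * rgb_weight[:, 1:2, :, :]
               + 0.1140 * rgb_weight[:, 2:3, :, :])

# The patched weight now accepts 4 input channels (RGB hint + 1 extra channel),
# while the original RGB filters are preserved unchanged.
new_weight = torch.cat([rgb_weight, gray_weight], dim=1)
print(new_weight.shape)  # torch.Size([16, 4, 3, 3])
```
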
You can download the file "control_sd15_ini.ckpt" from the URL "https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main" and use the local path where you store "control_sd15_ini.ckpt" as the resume_path.
Can you provide your visual results on the test set? I tried to run inference on your test set but got very bad results. Thanks