GeGehub opened this issue 1 month ago
You can use our released model to get the test results. For small objects or objects with simple shapes, the generated shadows are usually acceptable. For large objects or objects with complex shapes, the results are often unsatisfactory, but you can sample multiple times and pick a relatively good result.
You can also try the shadow generation function in our libcom toolbox https://github.com/bcmi/libcom.
Thanks for your reply. The figure above shows the results I tested. I was just wondering whether these results are reasonable. In addition, I saw that in your test set, most images contain only one object. In this case, how does your model learn the intensity through the intensity encoder?
By the way, since your dataset contains both the 256-resolution and high-resolution versions, which resolution is used for training and testing?
> The figure above shows the results I tested. I was just wondering whether these results are reasonable. In addition, I saw that in your test set, most images contain only one object. In this case, how does your model learn the intensity through the intensity encoder?
The current results are normal. In some cases, only the foreground inserted into the background image needs a synthesized shadow. In this situation, there is no issue of inconsistent shadow intensity between the background and foreground objects, so there is no need to learn the shadow intensity of the background. Additionally, we considered that some users may not be able to obtain the shadow masks of the background objects, so this situation is also simulated in the test set.
> By the way, since your dataset contains both the 256-resolution and high-resolution versions, which resolution is used for training and testing?
To align with previous work, we conduct testing and training at a resolution of 256.
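For reference, a minimal preprocessing sketch for bringing images to the 256×256 training resolution. The function name and the use of PIL with bicubic resampling are my assumptions for illustration, not the repo's actual data loader:

```python
from PIL import Image

def load_for_model(path, size=256):
    # Resize an input image to the training/testing resolution (256x256).
    # Bicubic resampling is an assumed choice; check the repo's dataloader.
    img = Image.open(path).convert('RGB')
    return img.resize((size, size), Image.BICUBIC)
```
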
I tried to re-train your model without modifying any code, but the model seems to have barely converged. The test result is shown above.
> I tried to re-train your model without modifying any code, but the model seems to have barely converged. The test result is shown above.
You may need to load pre-trained weights for the ControlNet branch:
```python
model = create_model('./models/cldm_v15.yaml').cpu()
model_weight = load_state_dict(resume_path, location='cpu', strict=False)
# Extend the first hint-block conv with an extra grayscale input channel.
rgb_weight = model_weight['control_model.input_hint_block.0.weight']
gray_weight = 0.2989 * rgb_weight[:, 0:1, :, :] + 0.5870 * rgb_weight[:, 1:2, :, :] + 0.1140 * rgb_weight[:, 2:3, :, :]
model_weight['control_model.input_hint_block.0.weight'] = torch.cat([rgb_weight, gray_weight], dim=1)
```
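The channel surgery in that snippet can be sanity-checked in isolation. A minimal self-contained sketch, using a random tensor in place of the real checkpoint weight (the 16-filter, 3×3-kernel shape is an assumption for illustration):

```python
import torch

# Stand-in for 'control_model.input_hint_block.0.weight', which in the real
# checkpoint has shape (out_channels, 3, kH, kW); the sizes here are assumed.
rgb_weight = torch.randn(16, 3, 3, 3)

# Collapse the RGB input channels into one grayscale channel using the
# ITU-R BT.601 luma coefficients, matching the snippet above.
gray_weight = (0.2989 * rgb_weight[:, 0:1, :, :]
               + 0.5870 * rgb_weight[:, 1:2, :, :]
               + 0.1140 * rgb_weight[:, 2:3, :, :])

# The patched weight now accepts 4 input channels (RGB hint + 1 extra channel),
# while the original RGB filters are preserved unchanged.
new_weight = torch.cat([rgb_weight, gray_weight], dim=1)
print(new_weight.shape)  # torch.Size([16, 4, 3, 3])
```
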
You can download the file "control_sd15_ini.ckpt" from the URL "https://huggingface.co/runwayml/stable-diffusion-v1-5/tree/main" and use the local path where you store "control_sd15_ini.ckpt" as the resume_path.
Can you provide your visual results on the test set? I tried to run inference on your test set but got very bad results. Thanks