liming-ai / ControlNet_Plus_Plus

Official PyTorch implementation of ECCV 2024 Paper: ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback.
https://liming-ai.github.io/ControlNet_Plus_Plus
Apache License 2.0

Openpose model with hand/face #5

Open huchenlei opened 3 months ago

huchenlei commented 3 months ago

Hi folks,

Awesome work on improving control map alignment! I think the openpose full control type should benefit the most from this new work, since control over hands and faces has previously not been very good. Is there any plan to train such an openpose model soon?

liming-ai commented 3 months ago

Thank you for raising this. We are working hard to support more conditions (including 2D human pose) and to train an SDXL version of the model. Please watch for subsequent updates!

huchenlei commented 3 months ago

One more thing worth mentioning: for openpose, we can probably use the difference between detected openpose JSON outputs as the reward/loss function, since the detected map often cannot show enough detail for hand skeleton keypoints and face keypoints.
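A minimal sketch of what such a keypoint-level difference could look like, assuming the OpenPose JSON layout in which each person carries flat `[x, y, confidence, ...]` lists under keys such as `pose_keypoints_2d` and `hand_left_keypoints_2d`. The function names `triplets` and `keypoint_distance`, the confidence weighting, and the `min_conf` threshold are illustrative assumptions, not something from the ControlNet++ code base. The score is only differentiable if the detector itself is, so this is closer to a reward than a drop-in loss.

```python
import math

# Keypoint groups as named in OpenPose's JSON output.
GROUPS = ("pose_keypoints_2d", "hand_left_keypoints_2d",
          "hand_right_keypoints_2d", "face_keypoints_2d")

def triplets(flat):
    """Split a flat [x0, y0, c0, x1, y1, c1, ...] list into (x, y, c) tuples."""
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat) - 2, 3)]

def keypoint_distance(person_a, person_b, min_conf=0.1):
    """Confidence-weighted mean Euclidean distance over shared keypoints.

    Hypothetical helper: keypoints whose lower confidence falls below
    min_conf are skipped, so undetected hands or faces do not add noise.
    """
    total, weight = 0.0, 0.0
    for group in GROUPS:
        a = triplets(person_a.get(group, []))
        b = triplets(person_b.get(group, []))
        for (xa, ya, ca), (xb, yb, cb) in zip(a, b):
            w = min(ca, cb)
            if w < min_conf:
                continue
            total += w * math.hypot(xa - xb, ya - yb)
            weight += w
    return total / weight if weight > 0 else 0.0

# Toy example: one body keypoint moved by (3, 4); the hand keypoint is
# undetected in the second person and is therefore ignored.
target = {"pose_keypoints_2d": [10, 10, 1.0, 20, 20, 1.0],
          "hand_left_keypoints_2d": [5, 5, 0.9]}
detected = {"pose_keypoints_2d": [10, 10, 1.0, 23, 24, 1.0],
            "hand_left_keypoints_2d": [5, 5, 0.0]}
print(keypoint_distance(target, detected))
```

Because the distance is computed on coordinates rather than on a rendered image, even a one-pixel shift of a finger joint changes the score.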

liming-ai commented 3 months ago

A great addition. What do you think of heatmaps? As far as I know, some methods convert the 2D pose into a multi-dimensional heatmap and compute an MSE loss on it. What do you think of this paradigm? Could it also fail to show enough detail?

huchenlei commented 3 months ago

The heatmap method should work as long as it has more channels per pixel than RGB, so that it can reflect the movement of a keypoint. The main purpose is just to make the loss function smoother, since a small movement of a hand/face keypoint might not be reflected in the rendered control map.
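A toy illustration of the smoothness point, assuming a single keypoint and a Gaussian heatmap (in practice each keypoint would get its own channel). The helper names `render_point`, `gaussian_heatmap`, and `mse`, the grid size, and `sigma` are hypothetical choices for this sketch. A sub-pixel move leaves the rounded rendering unchanged, while the heatmap MSE responds to it.

```python
import math

def render_point(x, y, size):
    """Binary map with one pixel set at the rounded keypoint location."""
    grid = [[0.0] * size for _ in range(size)]
    grid[round(y)][round(x)] = 1.0
    return grid

def gaussian_heatmap(x, y, size, sigma=1.5):
    """Gaussian bump centered on the exact (sub-pixel) keypoint location."""
    return [[math.exp(-((c - x) ** 2 + (r - y) ** 2) / (2 * sigma ** 2))
             for c in range(size)] for r in range(size)]

def mse(a, b):
    """Mean squared error between two equally sized 2D grids."""
    n = len(a) * len(a[0])
    return sum((va - vb) ** 2
               for ra, rb in zip(a, b)
               for va, vb in zip(ra, rb)) / n

size = 16
target = (8.0, 8.0)
moved = (8.3, 8.0)  # a 0.3 pixel shift of the keypoint

# The rounded rendering cannot see the shift; the heatmap can.
render_loss = mse(render_point(*target, size), render_point(*moved, size))
heatmap_loss = mse(gaussian_heatmap(*target, size),
                   gaussian_heatmap(*moved, size))
print(render_loss, heatmap_loss)
```

Here `render_loss` is zero because both positions round to the same pixel, whereas `heatmap_loss` is positive and grows as the keypoint moves further away.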

liming-ai commented 3 months ago

Thanks for the advice and explanation. I will try this in the near future! Looking forward to further discussions with you!