Mikubill / sd-webui-controlnet

WebUI extension for ControlNet
GNU General Public License v3.0

[Feature Request]: Animal Pose Preprocessor (RTMPose AP-10K) #2258

Closed Fannovel16 closed 9 months ago

Fannovel16 commented 11 months ago

https://github.com/abehonest/ControlNet_AnimalPose — DWPose, but for animals. The original implementation doesn't use an object detector, so it can only recognize one animal in the image. From my test, the YOLOX detector used by DWPose works pretty well. MMPose can be replaced with this onnx checkpoint.
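For reference, the two-stage (top-down) flow being described — detect each animal first, then run pose estimation on each box — can be sketched roughly as below. The box-expansion factor and helper names are illustrative assumptions, not the extension's actual code:

```python
import numpy as np

def expand_box(box, scale, img_w, img_h):
    # Expand a detector box (x1, y1, x2, y2) around its center, since top-down
    # pose estimators usually want some context, then clip to image bounds.
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * scale, (y2 - y1) * scale
    return (max(0.0, cx - w / 2.0), max(0.0, cy - h / 2.0),
            min(float(img_w), cx + w / 2.0), min(float(img_h), cy + h / 2.0))

def pose_per_box(pose_fn, image, boxes, scale=1.25):
    # Run the pose model once per detected animal instead of once per image.
    # This per-box loop is what lets a detector-based pipeline handle
    # multiple animals, unlike the single-animal original implementation.
    h, w = image.shape[:2]
    results = []
    for box in boxes:
        x1, y1, x2, y2 = expand_box(box, scale, w, h)
        crop = image[int(y1):int(y2), int(x1):int(x2)]
        results.append(pose_fn(crop))
    return results
```

Here `pose_fn` stands in for whatever pose backend is plugged in (RTMPose AP-10K via ONNX, in this proposal).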

sdbds commented 11 months ago

https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md I've actually only recently tried using YOLO-NAS instead of YOLOX, and yolo-nas-pose seems to be faster and more accurate than the current OpenPose. However, in my own testing, the ONNX model converted to FP16 doesn't seem to be compatible with the current setup.

Fannovel16 commented 11 months ago

@sdbds Which model do you use? Small or medium?

sdbds commented 11 months ago

> @sdbds Which model do you use? Small or medium?

DWPose uses yolox_l, so I used yolo-nas-l FP16.

Fannovel16 commented 11 months ago

@sdbds I wrote some code to support YOLO-NAS: https://github.com/Fannovel16/comfyui_controlnet_aux/blob/main/src/controlnet_aux/dwpose/yolo_nas.py#L25

It might work with this extension, since my extension and this one have similar code. Also, it requires the exported ONNX model to have built-in NMS:

```python
from super_gradients.conversion.conversion_enums import ExportQuantizationMode
from super_gradients.common.object_names import Models
from super_gradients.training import models

model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")

export_result = model.export(
    "yolo_nas/yolo_nas_l_fp16.onnx",
    quantization_mode=ExportQuantizationMode.FP16,
    device="cuda"
)
```
sdbds commented 11 months ago

> @sdbds I wrote some code to support YOLO-NAS: https://github.com/Fannovel16/comfyui_controlnet_aux/blob/main/src/controlnet_aux/dwpose/yolo_nas.py#L25
>
> It might work with this extension, since my extension and this one have similar code. Also, it requires the exported ONNX model to have built-in NMS:
>
> ```python
> from super_gradients.conversion.conversion_enums import ExportQuantizationMode
> from super_gradients.common.object_names import Models
> from super_gradients.training import models
>
> model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
>
> export_result = model.export(
>     "yolo_nas/yolo_nas_l_fp16.onnx",
>     quantization_mode=ExportQuantizationMode.FP16,
>     device="cuda"
> )
> ```

Thank you! I will test it soon.

Fannovel16 commented 11 months ago

The input dtype has to be uint8 for both the INT8 and FP16 models.
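As a concrete sketch of that constraint, the input tensor would be built as uint8 before being fed to the exported model, leaving normalization to the exported graph. The function name and the NCHW layout here are my assumptions, not confirmed details of the export:

```python
import numpy as np

def to_uint8_nchw(image_hwc):
    # Per the note above, the exported INT8/FP16 models take uint8 input:
    # clip and cast here, and let the exported ONNX graph handle any scaling.
    arr = np.asarray(image_hwc)
    if arr.dtype != np.uint8:
        arr = np.clip(arr, 0, 255).astype(np.uint8)
    # HWC -> NCHW, plus a batch dimension (layout assumed, not verified).
    return np.transpose(arr, (2, 0, 1))[None, ...]
```

Passing a float32 tensor instead is the kind of mismatch that would make an FP16 export appear "incompatible with the current setup."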

sdbds commented 11 months ago

I found another animal pose estimation tool; I will compare them: https://github.com/DeepLabCut/DeepLabCut

Fannovel16 commented 11 months ago

@sdbds DeepLabCut's keypoints aren't similar to the ones this ControlNet was trained on, but they look more accurate, tbh.

sdbds commented 10 months ago

I separated the GPU part of the code and added a separate animalpose preprocessor, and I will train an SDXL ControlNet-LLLite for it.

avinashchakravarthi commented 9 months ago

> I separated the GPU part of the code and added a separate animalpose preprocessor, and I will train an SDXL ControlNet-LLLite for it.

I would love to try an "SDXL ControlNet" for animal OpenPose; please let me know if you have released it in the public domain.

sdbds commented 9 months ago

> > I separated the GPU part of the code and added a separate animalpose preprocessor, and I will train an SDXL ControlNet-LLLite for it.
>
> I would love to try an "SDXL ControlNet" for animal OpenPose; please let me know if you have released it in the public domain.

I've had a lot of development work lately, so I haven't trained it for now.

andreszs commented 6 months ago

> I would love to try an "SDXL ControlNet" for animal OpenPose; please let me know if you have released it in the public domain.

Any news on this? Pose detection works well with the AP10K estimator, but there's no ControlNet model to apply it, and SD1.5 ControlNet models throw this error with SDXL checkpoints:

```
mat1 and mat2 shapes cannot be multiplied (308x2048 and 768x320)
```
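That error is consistent with a text-context dimension clash: SD1.5 cross-attention weights expect 768-dim CLIP embeddings, while SDXL supplies 2048-dim context (two text encoders concatenated), so an SD1.5 ControlNet's projections can't be applied to SDXL conditioning. A minimal numpy illustration, with the shapes taken from the reported error and the weight matrix invented for the example:

```python
import numpy as np

# SDXL text context is 2048-dim; an SD1.5-style projection expects 768-dim.
sdxl_context = np.zeros((308, 2048))  # 308 x 2048, as in the reported error
sd15_weight = np.zeros((768, 320))    # a hypothetical 768 x 320 projection

try:
    sdxl_context @ sd15_weight        # inner dims 2048 vs 768 don't match
except ValueError as err:
    print("shape mismatch:", err)
```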