Latency too high with DCNv3_pytorch op on cpu

OpenGVLab / InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

https://arxiv.org/abs/2211.05778

MIT License

Latency too high with DCNv3_pytorch op on cpu #261

Open xbkaishui opened 9 months ago

xbkaishui commented 9 months ago

Hi guys:

When I use DCNv3_pytorch for inference, the latency is too high on a CPU-only device. I tested inference on both CUDA and CPU. On a CUDA device, it takes 29 ms per image; with the DCNv3 version the cost is 20 ms. On a CPU device, it takes 550 ms per image (only tested with PyTorch).
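
A minimal sketch of how per-image latency figures like these could be measured, in case it helps compare numbers. It uses only the Python standard library; `measure_latency_ms` and `dummy_workload` are hypothetical names for illustration and not part of InternImage, and the dummy workload stands in for a single-image forward pass.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=3, iters=10):
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up calls are excluded so one-time costs (allocation, caching)
    # do not skew the timed samples.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def dummy_workload():
    # Hypothetical stand-in for one forward pass; in practice this would be
    # something like `lambda: model(image)` (assumed names).
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(f"median latency: {measure_latency_ms(dummy_workload):.3f} ms")
```

When timing on a CUDA device, the callable would also need to call `torch.cuda.synchronize()` before returning, since CUDA kernels launch asynchronously and the timer would otherwise stop too early.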

Do you have any ideas for optimizing the CPU inference cost?

Thanks