cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License
11.73k stars 2.88k forks

Perhaps PaddleOCR can be integrated for OCR detection and transcription #7130

Open KTXKIKI opened 7 months ago

KTXKIKI commented 7 months ago

Actions before raising this issue

Is your feature request related to a problem? Please describe.

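https://github.com/PaddlePaddle/PaddleOCR

Attachments: paddlepaddle-GPU环境组件.zip (PaddlePaddle GPU environment components), PaddleOCR.zip

Currently this is done by running a script locally to produce the annotations and then uploading them. Perhaps this could be integrated into CVAT.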

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

bsekachev commented 3 months ago

Hello, this looks interesting. Does it work on CPU as well? What is the FPS performance? Generally, it would be great to add a good model for OCR, but we do not have the resources to do it right now.

KTXKIKI commented 3 months ago

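Quoting bsekachev: "Hello, looks interesting. Does it also work on CPU? What is the FPS performance? Generally, it would be great to add a good OCR model, but there are no resources to do it right now."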

Of course, it supports both CPU and GPU.

KTXKIKI commented 3 months ago

https://github.com/PaddlePaddle/PaddleOCR

KTXKIKI commented 1 month ago

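Quoting bsekachev: "Hello, looks interesting. Does it also work on CPU? What is the FPS performance? Generally, it would be great to add a good OCR model, but there are no resources to do it right now."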

CPU inference takes hundreds of milliseconds per image; GPU inference takes tens of milliseconds.
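
Since the question was about FPS, the latency figures can be converted with FPS = 1000 / latency in milliseconds. A small sketch of that arithmetic follows; the 300 ms and 30 ms values are illustrative placeholders picked from the "hundreds" and "tens" ranges, not measurements, and `fps_from_latency_ms` is a hypothetical helper name.

```python
def fps_from_latency_ms(latency_ms):
    """Return frames per second for a given per-image latency in milliseconds."""
    return 1000.0 / latency_ms


# Illustrative latencies only (assumed values, not benchmark results).
for device, latency_ms in [("CPU", 300.0), ("GPU", 30.0)]:
    fps = fps_from_latency_ms(latency_ms)
    print(f"{device}: {latency_ms:.0f} ms per image is roughly {fps:.1f} FPS")
```

So latencies in the hundreds of milliseconds correspond to single-digit FPS, and latencies in the tens of milliseconds to tens of FPS.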

KTXKIKI commented 2 weeks ago

PaddleOCR is a general-purpose OCR model that supports both Chinese and English as well as various special symbols. I found some Docker files for building images for it: https://github.com/PaddlePaddle/PaddleOCR/tree/main/deploy/docker I think this should make developing the main.py file easier.

KTXKIKI commented 2 weeks ago

I think we can learn from this existing serverless function: https://github.com/cvat-ai/cvat/tree/develop/serverless/openvino/omz/intel/face-detection-0205/nuclio
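
To make the idea more concrete, here is a rough sketch of what such a main.py might look like, modeled on the `init_context` / `handler` layout of the existing CVAT Nuclio functions. It is untested against a real deployment and rests on several assumptions: the PaddleOCR 2.x Python API (`PaddleOCR(...).ocr(img, cls=True)` returning `[box, (text, score)]` entries per image), BGR array input, a CVAT label called `text` with a text attribute called `transcription` (both names are made up for illustration), and `to_cvat_shapes` being a hypothetical helper.

```python
import base64
import io
import json


def to_cvat_shapes(ocr_lines, label="text", threshold=0.5):
    """Convert PaddleOCR-style lines [[box, (text, score)], ...] to CVAT shape dicts."""
    shapes = []
    for box, (text, score) in ocr_lines or []:
        if score < threshold:
            continue
        # Flatten [[x1, y1], ..., [x4, y4]] into [x1, y1, ..., x4, y4].
        points = [float(coord) for point in box for coord in point]
        shapes.append({
            "confidence": str(score),
            "label": label,  # assumed label name
            "points": points,
            "type": "polygon",
            # Assumption: the label has a text attribute named "transcription".
            "attributes": [{"name": "transcription", "value": text}],
        })
    return shapes


def init_context(context):
    # Imported lazily so the helper above works without PaddleOCR installed.
    from paddleocr import PaddleOCR
    context.user_data.ocr = PaddleOCR(use_angle_cls=True, lang="ch")


def handler(context, event):
    import numpy as np
    from PIL import Image

    data = event.body
    image = Image.open(io.BytesIO(base64.b64decode(data["image"]))).convert("RGB")
    threshold = float(data.get("threshold", 0.5))
    # Reverse the channel order, assuming PaddleOCR expects OpenCV-style BGR arrays.
    bgr = np.ascontiguousarray(np.array(image)[:, :, ::-1])
    # PaddleOCR 2.x returns one list of lines per input image (None if no text is found).
    result = context.user_data.ocr.ocr(bgr, cls=True)
    shapes = to_cvat_shapes(result[0] if result else None, threshold=threshold)
    return context.Response(body=json.dumps(shapes), headers={},
                            content_type="application/json", status_code=200)
```

The conversion is kept in a separate function so it can be checked without a GPU or the model weights; a matching function.yaml and Docker image would still be needed, which is where the PaddleOCR Docker files could help.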