Open 562888698 opened 6 months ago
Hi, thank you for coming to us! Could you please explain what do you mean by OCR annotations specifically? Can such annotations be achieved with masks, polygons, or bounding boxes? Can labels (per character?) or label attributes (such as arbitrary text) be helpful in this scenario?
I suppose we may start from a good model, returning rectangles and text attribute.
Hi, thank you for coming to us! Could you please explain what do you mean by OCR annotations specifically? Can such annotations be achieved with masks, polygons, or bounding boxes? Can labels (per character?) or label attributes (such as arbitrary text) be helpful in this scenario?
Per my understanding, I think what @562888698 is suggesting is performing OCR within an annotation bounding box / polygon. After a broad region containing text is selected through pre-existing annotation tools, OCR can be performed within said region to either extract the textual content as plaintext, or to automatically generate annotations around textual characters.
Hi, thank you for coming to us! Could you please explain what do you mean by OCR annotations specifically? Can such annotations be achieved with masks, polygons, or bounding boxes? Can labels (per character?) or label attributes (such as arbitrary text) be helpful in this scenario?
Hello, I also have this same need. I have successfully implemented the full OCR system and have managed to output my results within a Docker container. Just like the follow screen shot: And in my annotation task, I set a contribute "content", which type is "text", what can I do to write my recgnoize result "415" in the table. This is my returned result
我想我们可以从一个好的模型开始,返回矩形和文本属性。
hi @bsekachev may be like thins #7130 look my update .zip i think that model can sovle ocr detection and transcripition And it supports rectangle or polygon detection box return and text transcription in multiple languages
Hi, any progress? Has the method I provided been helpful?
Paddleocr is a universal ocr model that supports both Chinese and English as well as various special symbols I found some dockers to create image files for it: https://github.com/PaddlePaddle/PaddleOCR/tree/main/deploy/docker I think the development of the main.py file should become easier
I think we can learn from it :https://github.com/cvat-ai/cvat/tree/develop/serverless/openvino/omz/intel/face-detection-0205/nuclio
Hello Team,
I am an active user of CVAT and have been enjoying the platform's capabilities in annotating images and videos for various computer vision tasks. I recently came across a use case where it would be incredibly beneficial if CVAT could support Optical Character Recognition (OCR) annotations, specifically for scenarios such as recognizing and marking text within images, particularly for number plates on vehicles.
If there is already a way to achieve this through existing plugins or configurations, please let me know. Otherwise, I'd appreciate it if the team could consider adding this functionality in future releases.