Anyone figured out how to handle long text sequences

PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

https://paddlepaddle.github.io/PaddleOCR/

Apache License 2.0


Anyone figured out how to handle long text sequences #11482

Closed: bely66 closed this issue 3 months ago

bely66 commented 8 months ago

None of the recognition models I tried can handle long sequences.

Sometimes it gets a bit better, but it never matches the accuracy on single words.

Yet most of the papers say these models can handle long sentences with high accuracy.

I tried all the models (CRNN, SAR, ABINet, SVTR).

Has anyone figured it out yet? Is there any trick with the data preprocessing? Does anyone have some ideas?

Thanks everyone

bhavyajoshi-mahindra commented 8 months ago

I am also looking for an answer to this. My fine-tuned PaddleOCR v4 model correctly recognizes single words, but it fails on long sequences of words in an image.