deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0
text recognition model recognizes the repeated text #2034

Open wangpf09 opened 1 year ago

wangpf09 commented 1 year ago

Description

When I use DJL 0.18.0 for PaddleOCR prediction, the text recognition model always outputs repeated text, but the same model has no such problem when run from Python code.

Expected Behavior

Error Message

How to Reproduce?

Steps to reproduce

What have you tried to solve it?

  1. I tried switching to other PaddleOCR models; the problem is the same.

Environment Info

Windows 10/11, JDK 11

siddvenk commented 1 year ago

@wangpf09 can you provide some more information please? Specifically, the following would help us better understand the issue:

  • How can we reproduce this issue? Are you using your own code, and if so, can you share an example we can use to reproduce the issue? Are you using one of the built-in DJL examples or Jupyter notebooks?
  • You mention that you don't see the issue in Python code. Can you also share the Python example so we can compare the two?
  • You mention that you have tried other models and still see the same issue. Can you list which other models you tried? Are these models in the DJL model zoo, or are you loading your own model?

wangpf09 commented 1 year ago

I trained my own model using the PaddleOCR code. My own DJL-based code is here: djl-ai. I uploaded my model to Hugging Face. I think this is all you need to reproduce the issue.

frankfliu commented 1 year ago

@wangpf09

Your GitHub project contains several modules. Would you please add the steps for running your project and reproducing the issue?

Is there a unit test that shows the expected result?

wangpf09 commented 1 year ago

I have added the instructions for running the project to the README.

mymagicpower commented 1 year ago

Try this one. https://github.com/mymagicpower/AIAS/blob/main/1_image_sdks/text_recognition/ocr_sdk/

wangpf09 commented 1 year ago

In PpWordRecognitionTranslator.java, the processOutput method contains this piece of code:

    // Mark every index that repeats its predecessor so it can be dropped.
    boolean[] selection = new boolean[indices.length];
    Arrays.fill(selection, true);
    for (int i = 1; i < indices.length; i++) {
        if (indices[i] == indices[i - 1]) {
            selection[i] = false;
        }
    }

This does solve the problem. Still, I think the underlying problem really exists, and this workaround is not the best solution.
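For readers following along: dropping an index that equals its predecessor is one half of greedy CTC decoding; the other half is discarding the blank token, which is what lets genuinely doubled letters survive. The sketch below is a standalone illustration, not the code from AIAS or PaddleOCR. The class name, the toy vocabulary, and the choice of index 0 as the blank token are all assumptions made for the example.

```java
final class CtcGreedyDecode {
    // Assumption for this sketch: index 0 is the CTC blank token.
    static final int BLANK = 0;

    /**
     * Collapses consecutive repeated indices, then drops blanks.
     * `indices` is the per-timestep argmax of the recognition output.
     */
    static String decode(int[] indices, String[] vocab) {
        StringBuilder sb = new StringBuilder();
        int prev = -1; // no previous timestep yet
        for (int idx : indices) {
            // Keep a character only if it differs from the previous
            // timestep and is not the blank token.
            if (idx != prev && idx != BLANK) {
                sb.append(vocab[idx]);
            }
            prev = idx;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical toy vocabulary; a real model loads its own dictionary.
        String[] vocab = {"<blank>", "h", "e", "l", "o"};
        // The blank between the two runs of 3 preserves the double "l".
        int[] indices = {1, 1, 0, 2, 2, 3, 3, 0, 3, 4, 4};
        System.out.println(decode(indices, vocab)); // prints "hello"
    }
}
```

Without the duplicate check, the same input would decode to "hheelllloo", which matches the repeated-text symptom described in this issue.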

mymagicpower commented 1 year ago

The official (and best) solution is based on the model's algorithm, design, and output, which natively account for this issue: https://github.com/PaddlePaddle/PaddleOCR/blob/ef9e870204/ppocr/postprocess/rec_postprocess.py#L457

wangpf09 commented 1 year ago

I tried this method. Although it fixes some recognition results, the overall recognition rate is still lower than with the Python version of the model.

mymagicpower commented 1 year ago

https://github.com/mymagicpower/AIAS/blob/main/1_image_sdks/text_recognition/ocr_sdk/src/main/java/me/aias/example/OcrV3RecognitionExample.java