robertknight / ocrs

Rust library and CLI tool for OCR (extracting text from images)
Apache License 2.0

Possibility to replace the existing recognition model with ONNX models exported from other projects? #55

Closed lieding closed 3 months ago

lieding commented 4 months ago

Hello, this is a great repository for OCR projects. However, the project currently seems to recognize only English characters, and I am targeting French text, where results are very poor, possibly because of the current recognition model. Would it be possible to use a different ONNX model for the recognition task while keeping the detection model the same?

robertknight commented 4 months ago

It is indeed possible to swap out the recognition model while keeping the detection model. When using the CLI, the models for different tasks can be specified using the --detect-model and --rec-model flags.

The basic process would be:

  1. Find or train a recognition model (see the ocrs-models repo for details of the current models)
  2. Export the recognition model to ONNX
  3. Convert the ONNX model to .rten format using rten-convert
  4. Specify the --rec-model flag when calling the ocrs CLI (or when configuring OcrEngine if you're using the library)
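
Steps 3 and 4 above can be sketched as a small Python script that builds the two command lines and, by default, only prints them. This is an illustration rather than a tested recipe: the file names are hypothetical, the helper functions are made up for this example, rten-convert is assumed to take the input and output paths as positional arguments, and ocrs is assumed to fall back to its default detection model when --detect-model is omitted.

```python
import shlex
import subprocess


def conversion_command(onnx_path, rten_path):
    """Step 3: argv for converting an ONNX export to .rten with rten-convert.

    Assumes the positional form `rten-convert <input.onnx> <output.rten>`.
    """
    return ["rten-convert", onnx_path, rten_path]


def ocrs_command(image_path, rec_model, detect_model=None):
    """Step 4: argv for running the ocrs CLI with a custom recognition model."""
    cmd = ["ocrs"]
    if detect_model is not None:
        # Only pass --detect-model when a custom detection model is wanted.
        cmd += ["--detect-model", detect_model]
    cmd += ["--rec-model", rec_model, image_path]
    return cmd


def run_steps(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Hypothetical file names, for illustration only.
steps = [
    conversion_command("french-rec.onnx", "french-rec.rten"),
    ocrs_command("page.png", rec_model="french-rec.rten"),
]
run_steps(steps)  # dry run: prints the commands without executing them
```

Calling run_steps(steps, dry_run=False) would actually execute the commands, which requires rten-convert and ocrs to be installed and on the PATH.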

For the model to work, its "API" (input shape, output shape and pre/post-processing) needs to match that of the current models. This is where you'll run into issues if you take an ONNX model from some unrelated project and try to use it today, as those projects probably do different preprocessing.
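
As a rough illustration of the shape part of that "API", here is a minimal Python sketch that compares two tensor shapes while allowing dynamic dimensions. The shapes below are hypothetical placeholders, not the actual shapes of the ocrs models, and a check like this says nothing about whether the pre/post-processing matches.

```python
def shapes_compatible(expected, actual):
    """Return True if both shapes have the same rank and every fixed dimension matches.

    A dimension of None stands for a dynamic size (e.g. batch size or image width).
    """
    if len(expected) != len(actual):
        return False
    return all(e is None or a is None or e == a for e, a in zip(expected, actual))


# Hypothetical shapes, NOT taken from the ocrs models: (batch, channels, height, width)
current_input = (None, 1, 64, None)
candidate_input = (None, 3, 32, None)

print(shapes_compatible(current_input, current_input))    # True
print(shapes_compatible(current_input, candidate_input))  # False: channels and height differ
```

In practice the real shapes would have to be read from the exported ONNX model and from the current models described in the ocrs-models repo.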

As long as the API of another project's models is not too different from how the current models work, though, I think it should be possible to add some flexibility to this project so that you could use them.