robertknight / ocrs

Rust library and CLI tool for OCR (extracting text from images)
Apache License 2.0

Panic in layout analysis when recognizing image #19

Closed Y0ungSea closed 9 months ago

Y0ungSea commented 9 months ago

Using the code below to run OCR on the image "5624.png", I hit an unwrap that causes a panic.

5624.png: 5624

use std::error::Error;
use std::fs;

use ocrs::{OcrEngine, OcrEngineParams};
use rten::Model;
use rten_imageio::read_image;

fn init_ocr_engine() -> Result<OcrEngine, Box<dyn Error>>{
    let detection_model_data = fs::read("text-detection.rten")?;
    let rec_model_data = fs::read("text-recognition.rten")?;

    let detection_model = Model::load(&detection_model_data)?;
    let recognition_model = Model::load(&rec_model_data)?;

    let engine = OcrEngine::new(OcrEngineParams{
        detection_model: Some(detection_model),
        recognition_model: Some(recognition_model),
        ..Default::default()
    })?;

    Ok(engine)
}

fn ocr(engine: &OcrEngine, img_path: &str) -> Result<String, Box<dyn Error>>{
    let img = read_image(img_path)?;
    let ocr_input = engine.prepare_input(img.view())?;
    let text = engine.get_text(&ocr_input)?;
    Ok(text)
}

fn main() -> Result<(), Box<dyn Error>>{
    // Build the engine once, then run OCR on the failing image.
    let engine = init_ocr_engine()?;
    let code = ocr(&engine, "./5624.png")?;
    println!("{:?}", code);
    Ok(())
}

Cargo tells me:

thread 'main' panicked at C:\Users\Esperanza\.cargo\registry\src\index.crates.io-6f17d22bba15001f\ocrs-0.3.1\src\layout_analysis.rs:146:30:
called `Result::unwrap()` on an `Err` value: TryFromIntError(())

robertknight commented 9 months ago

Thanks for the bug report. The annotated output (run with --png -o annotated.png) shows that one line is found with two words, and the words slightly overlap (this is possible due to post-processing of detections). As a result, the average inter-word spacing is negative, and this tripped up the layout analysis.

Annotated input
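
For readers following along, here is a minimal sketch of how a negative average inter-word spacing could produce the `TryFromIntError` seen in the panic message. It is not the code from `layout_analysis.rs`: the `WordBox` type and the functions `mean_gap`, `gap_as_u32_checked` and `gap_as_u32_clamped` are hypothetical names made up for illustration, and it assumes the spacing is converted from a signed to an unsigned integer somewhere along the way.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical word box (illustration only, not an ocrs type).
/// Only the horizontal extent matters for this example.
#[derive(Clone, Copy, Debug)]
struct WordBox {
    left: i32,
    right: i32,
}

/// Mean horizontal gap between consecutive words on one line.
/// Returns `None` for fewer than two words. The result is negative
/// when neighbouring boxes overlap on average.
fn mean_gap(words: &[WordBox]) -> Option<i32> {
    if words.len() < 2 {
        return None;
    }
    let total: i32 = words.windows(2).map(|w| w[1].left - w[0].right).sum();
    Some(total / (words.len() as i32 - 1))
}

/// Fallible signed-to-unsigned conversion. Calling `.unwrap()` on the
/// result panics with `TryFromIntError(())` when `gap` is negative.
fn gap_as_u32_checked(gap: i32) -> Result<u32, TryFromIntError> {
    u32::try_from(gap)
}

/// One possible guard (an assumption, not necessarily the fix used in
/// ocrs): treat overlapping words as having zero spacing.
fn gap_as_u32_clamped(gap: i32) -> u32 {
    u32::try_from(gap.max(0)).unwrap_or(0)
}
```

With two boxes such as `WordBox { left: 0, right: 52 }` and `WordBox { left: 50, right: 100 }`, `mean_gap` returns a negative value, `gap_as_u32_checked` returns an `Err`, and `gap_as_u32_clamped` falls back to zero instead of panicking.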