houqp / leptess

Productive and safe Rust binding for leptonica and tesseract
https://houqp.github.io/leptess/leptess/index.html
MIT License

[Feature Request] Provide examples: preprocessing with leptonica / capi #59

Open karpfediem opened 7 months ago

karpfediem commented 7 months ago

Hi, this is both a question and a request for more documentation: How do I actually use leptonica to preprocess the image?

In none of the examples, and in none of the dependent projects that GitHub links to, did I manage to find any actual usage of leptonica / capi functions. The only leptonica function I found in use is the pix_read wrapper function.

I'm struggling to understand how one is supposed to use the capi together with the leptess api. This is mainly due to the different Pix types.

Here is a simple example to demonstrate what I am confused about:

#[cfg(test)]
mod tests {
    use leptess::{capi, leptonica, tesseract};
    use std::path::Path;

    #[test]
    fn ocr() {
        let tessdata_path = Some("/usr/share/tesseract-ocr/5/tessdata");
        let image_path = Path::new("./tests/test.png");

        let mut api = tesseract::TessApi::new(tessdata_path, "eng").unwrap();
        let mut pix = leptonica::pix_read(image_path).unwrap(); // pix Struct is leptess::leptonica::Pix

        let mut pix_scaled = capi::pixScaleSmooth(pix, 2.0, 2.0); // capi methods require *mut leptess::capi::Pix

        api.set_image(&pix_scaled); // Needs another conversion back to leptess::leptonica::Pix Struct here ?

        println!("Text: {}", api.get_utf8_text().unwrap());
    }
}

Running this simple example results in a mismatched types error:

error[E0308]: mismatched types
     --> src/lib.rs:14:51
      |
14    |         let mut pix_scaled = capi::pixScaleSmooth(pix, 2.0, 2.0); // capi methods require *mut leptess::capi::Pix
      |                              -------------------- ^^^ expected `*mut Pix`, found `Pix`
      |                              |
      |                              arguments to this function are incorrect
      |
      = note: expected raw pointer `*mut leptess::capi::Pix`
                      found struct `leptess::leptonica::Pix`

Please excuse me for any oversights on my part if there are any. Regardless, I think it would be valuable to add examples on how to use the capi properly.

ccouzens commented 7 months ago

Hello,

I'm looking into this, and will try to get the test added after a few modifications.

It's quite early here, so please forgive any mistakes.

The first hurdle is that pixScaleSmooth takes a *mut pointer, while leptess holds its Pix behind a reference count. We cannot get exclusive access to the Pix through the reference count, so I'll probably skip leptess for reading and scaling.
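To illustrate the hurdle in general terms, here is a minimal std-only sketch. It does not use leptess or leptonica at all: `RawImage`, `double_width` and `try_scale` are hypothetical stand-ins invented for this example, and the assumption is only that the image sits behind something like `std::rc::Rc`. It shows that a reference count hands out a mutable pointer only while there is exactly one handle.

```rust
use std::rc::Rc;

/// Hypothetical stand-in for a C-side image struct (not leptess's real type).
#[derive(Debug)]
struct RawImage {
    width: u32,
}

/// Hypothetical stand-in for a C function that wants a `*mut` pointer.
fn double_width(img: *mut RawImage) {
    // SAFETY: in this sketch the pointer always comes from a live `&mut RawImage`.
    unsafe { (*img).width *= 2 };
}

/// Tries to get exclusive access through the reference count.
/// Returns the new width, or `None` if another handle shares the image.
fn try_scale(shared: &mut Rc<RawImage>) -> Option<u32> {
    // `Rc::get_mut` only succeeds when this is the sole handle.
    let exclusive: &mut RawImage = Rc::get_mut(shared)?;
    let ptr: *mut RawImage = exclusive;
    double_width(ptr);
    Some(shared.width)
}
```

With a single handle, `try_scale` succeeds; once the `Rc` has been cloned, it returns `None` and the image is left untouched, which is the situation described above.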

I'll continue this later.

ccouzens commented 7 months ago

"I'm struggling to understand how one is supposed to use the capi together with the leptess api."

Yeah, I don't think it was possible when you raised this issue. I had to make 2 changes in leptonica-plumbing and am making the following change here:

https://github.com/houqp/leptess/pull/60

karpfediem commented 7 months ago

Yeah, I suspected as much. Thank you!