antimatter15 / tesseract-rs

Rust bindings for Tesseract
MIT License

Expose underlying `TessBaseAPI` #43

Open StripedMonkey opened 1 year ago

StripedMonkey commented 1 year ago

I'm currently messing around with Tesseract in Rust and have noticed a handful of functions that don't make it into the API, e.g. TessBaseAPIPrintVariables (unless I'm totally blind, which is a possibility).

While obviously I'd love for these to be added to the safe API, I think an easier solution in the shorter term would be to expose the tesseract_sys::TessBaseAPI, potentially as an unsafe get_/as_, to allow me to call said functions from the sys crate while still maintaining something of a safe high-level API for the things that do have safe wrappers.

I've currently implemented an as_raw() for both tesseract-rs and tesseract-plumbing in order to get to the raw pointer myself, but I think it's a useful thing in general.
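A minimal sketch of what such an accessor could look like. Everything here is a stand-in so the snippet compiles on its own: `RawBaseApi` plays the role of `tesseract_sys::TessBaseAPI`, `raw_print_variables` plays the role of a sys-crate function like `TessBaseAPIPrintVariables`, and the `Tesseract` struct is a simplified assumption, not the crate's actual definition.

```rust
/// Hypothetical stand-in for `tesseract_sys::TessBaseAPI`.
pub struct RawBaseApi {
    calls: u32,
}

/// Hypothetical stand-in for an unwrapped sys-crate function.
///
/// # Safety
/// `handle` must point to a live `RawBaseApi` that nothing else is accessing.
pub unsafe fn raw_print_variables(handle: *mut RawBaseApi) -> u32 {
    unsafe {
        (*handle).calls += 1;
        (*handle).calls
    }
}

/// Simplified safe wrapper that owns the handle.
pub struct Tesseract {
    raw: *mut RawBaseApi,
}

impl Tesseract {
    pub fn new() -> Self {
        let raw = Box::into_raw(Box::new(RawBaseApi { calls: 0 }));
        Tesseract { raw }
    }

    /// Handing out the pointer is safe; using it requires `unsafe`.
    pub fn as_raw(&self) -> *mut RawBaseApi {
        self.raw
    }
}

impl Drop for Tesseract {
    fn drop(&mut self) {
        // SAFETY: `raw` came from `Box::into_raw` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.raw)) };
    }
}

fn main() {
    let tess = Tesseract::new();
    // SAFETY: `tess` is alive, so the pointer is valid, and nothing else
    // touches the handle during the call.
    let calls = unsafe { raw_print_variables(tess.as_raw()) };
    println!("raw call count: {calls}");
}
```

The safe wrappers keep working as before; only the calls that go through `as_raw()` need an `unsafe` block at the call site.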

Admittedly, I'm not super familiar with FFI/wrapper crate conventions, so this might be out of line. I'm happy to create a PR and contribute if necessary!

ccouzens commented 1 year ago

Hey,

I think the simplest thing to do is to make the struct fields pub, both here and in tesseract-plumbing. Having access to the underlying tesseract-sys pointer is totally safe.

The unsafety comes from calling C functions on the tesseract-sys pointer or dereferencing it. Rust only allows a raw pointer to be dereferenced in an unsafe block, and all the C functions in tesseract-sys are already declared unsafe.
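To illustrate where that boundary sits, here is a small sketch using a plain `i32` rather than a Tesseract handle. `pointer_of` and `read_handle` are made-up names for this example; `read_handle` only imitates a function declared in a `-sys` crate.

```rust
/// Safe code may create, copy, compare, and return raw pointers.
fn pointer_of(value: &mut i32) -> *mut i32 {
    value as *mut i32
}

/// Hypothetical stand-in for a C function from a `-sys` crate.
///
/// # Safety
/// `handle` must point to a valid, initialized `i32`.
unsafe fn read_handle(handle: *const i32) -> i32 {
    unsafe { *handle }
}

fn main() {
    let mut value = 7;
    let ptr = pointer_of(&mut value);

    // All of this is safe: the pointer is just an address.
    let copy = ptr;
    assert!(!copy.is_null());

    // Writing `*copy` or `read_handle(copy)` outside an `unsafe` block is
    // rejected by the compiler (error E0133).
    // SAFETY: `value` is still alive and is not being accessed elsewhere.
    let read = unsafe { read_handle(copy) };
    println!("read through raw pointer: {read}");
}
```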

> I've currently implemented an as_raw() for both tesseract-rs and tesseract-plumbing in order to get to the raw pointer myself, but I think it's a useful thing in general.

Yes, it's neater to use methods than struct access.

The only issue I see with as_raw is that it's unclear to me whether, to go from tesseract_rs::Tesseract to *mut tesseract_sys::TessBaseAPI, you'd call as_raw() or as_raw().as_raw(). Do you skip past tesseract_plumbing or not?

There is the AsRef trait (https://doc.rust-lang.org/std/convert/trait.AsRef.html). You can technically implement it twice for tesseract_rs::Tesseract, like this: `impl AsRef<plumbing::TessBaseApi> for tesseract_rs::Tesseract` and `impl AsRef<*mut tesseract_sys::TessBaseAPI> for tesseract_rs::Tesseract`. The only issue with doing that is you need to be explicit about which as_ref you're calling. If I remember right, this would work: `let pointer: &*mut tesseract_sys::TessBaseAPI = tesseract.as_ref()` (as_ref returns a reference, so the annotation is a reference to the pointer).
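A rough sketch of the two-`AsRef` idea, with stand-in types so it compiles without either crate: `RawBaseApi` stands in for `tesseract_sys::TessBaseAPI`, `PlumbingApi` for the plumbing wrapper, and `from_raw` is a hypothetical constructor added only for the demo. The pointer is never dereferenced here.

```rust
/// Hypothetical stand-in for `tesseract_sys::TessBaseAPI`.
pub struct RawBaseApi;

/// Hypothetical stand-in for the tesseract-plumbing wrapper.
pub struct PlumbingApi {
    raw: *mut RawBaseApi,
}

impl AsRef<*mut RawBaseApi> for PlumbingApi {
    fn as_ref(&self) -> &*mut RawBaseApi {
        &self.raw
    }
}

/// Hypothetical stand-in for the high-level wrapper.
pub struct Tesseract {
    plumbing: PlumbingApi,
}

impl Tesseract {
    /// Demo-only constructor; not part of either real crate.
    pub fn from_raw(raw: *mut RawBaseApi) -> Self {
        Tesseract { plumbing: PlumbingApi { raw } }
    }
}

// One step down: the plumbing layer.
impl AsRef<PlumbingApi> for Tesseract {
    fn as_ref(&self) -> &PlumbingApi {
        &self.plumbing
    }
}

// Two steps down: skip plumbing and go straight to the raw pointer.
impl AsRef<*mut RawBaseApi> for Tesseract {
    fn as_ref(&self) -> &*mut RawBaseApi {
        self.plumbing.as_ref()
    }
}

fn main() {
    let mut api = RawBaseApi;
    let expected: *mut RawBaseApi = &mut api;
    let tess = Tesseract::from_raw(expected);

    // The type annotation selects which impl is used.
    let plumbing: &PlumbingApi = tess.as_ref();
    let pointer: &*mut RawBaseApi = tess.as_ref();

    // Same thing with the target type spelled out in the call.
    let explicit = AsRef::<*mut RawBaseApi>::as_ref(&tess);
    let via_plumbing: &*mut RawBaseApi = plumbing.as_ref();

    assert_eq!(*pointer, expected);
    assert_eq!(*explicit, expected);
    assert_eq!(*via_plumbing, expected);
}
```

Without an annotation or the explicit `AsRef::<...>` form, a bare `tess.as_ref()` is ambiguous and the compiler asks for a type, which is the explicitness cost mentioned above.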

Kind regards,

Chris

StripedMonkey commented 1 year ago

For the moment I'm skipping past plumbing, as I only need the base FFI functions and I don't really know what plumbing provides. This is probably a mistake on my part, but I'm very much learning Tesseract in general.
