yobix-ai / extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
Apache License 2.0

Stall when extracting with OCR on macOS from PDFs with embedded images #23

Open nmammeri opened 2 weeks ago

nmammeri commented 2 weeks ago

Initial investigation points to bad handling of AWT within GraalVM. I think it's due to the AWT dispose thread deadlocking.

nmammeri commented 2 weeks ago

This bug only occurs with the extract-to-stream API on macOS, not with extract-to-string.