robertknight / ocrs

Rust library and CLI tool for OCR (extracting text from images)
Apache License 2.0

Wasm / frontend support #84

Open Isaac-Leonard opened 3 months ago

Isaac-Leonard commented 3 months ago

I see that there is a Node.js example in the repo, but is it possible to use this on the frontend yet, with TypeScript support?

robertknight commented 3 months ago

There is a WASM build, but you have to build it from source, as it isn't published to e.g. npm yet. To try this out locally:

  1. Clone the repository and run make wasm
  2. Try out the JS + WASM demo in https://github.com/robertknight/ocrs/tree/main/js/examples/ocr-node
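
As a rough sketch, the two steps above might look like this in a shell. The helper name build_ocrs_wasm and the variable names are illustrative only, and the sketch assumes git, make, and whatever Rust/WASM toolchain the repository's Makefile expects are already installed.

```shell
# Sketch only: assumes git, make, and the Rust/WASM toolchain that the
# repository's Makefile expects are already installed.
REPO_URL="https://github.com/robertknight/ocrs"
CHECKOUT_DIR="ocrs"   # illustrative checkout directory name

# Hypothetical helper: nothing is cloned or built until it is called.
build_ocrs_wasm() {
  # Step 1: clone the repository and run the `wasm` make target.
  git clone "$REPO_URL" "$CHECKOUT_DIR" || return 1
  make -C "$CHECKOUT_DIR" wasm || return 1
  # Step 2: the JS + WASM demo lives under this path in the checkout.
  echo "Build finished; see $CHECKOUT_DIR/js/examples/ocr-node for the demo"
}
```

Calling build_ocrs_wasm runs both commands in order and stops at the first failure. How to run the demo afterwards is described in the ocr-node example directory itself.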

You can also build for WASI and run the result with e.g. wasmtime. See https://github.com/robertknight/ocrs/blob/fe6de19db01ea5aaf4a4e891e21c74bb58394277/Makefile#L46

To manage expectations, I will warn you that the WASM build is currently much slower than the native build. This is for two reasons:

  1. The native build is multi-threaded; the WASM build is not
  2. Native SIMD is much faster than WASM SIMD

Isaac-Leonard commented 3 months ago

Okay, I've got it working now. I'm not too worried about speed, and I'm getting far more useful results than I was getting from tesseract.js, so I'm pretty happy. Should I leave this open till it's available on npm?

robertknight commented 3 months ago

> Should I leave this open till it's available on npm?

Yes, I think so. There isn't an existing tracking issue for publishing an npm package.