Closed by simonw 7 months ago
I tried it:
diff --git a/ocr.html b/ocr.html
index 3e4a177..d487c75 100644
--- a/ocr.html
+++ b/ocr.html
@@ -341,7 +341,7 @@ async function convertPDFToImages(file) {
 async function ocrImage(worker, imageUrl) {
   const {
     data: { text },
-  } = await worker.recognize(imageUrl);
+  } = await worker.recognize(imageUrl, {rotateAuto: true});
   return { text };
 }
But it didn't seem to work.
Tesseract.js has a poorly documented rotateAuto option:
https://github.com/naptha/tesseract.js/blob/03f82eaab57d3c7c852c6e61bfd805c8cf42e8f2/src/index.d.ts#L96-L102
Found an example here: https://github.com/naptha/tesseract.js/blob/03f82eaab57d3c7c852c6e61bfd805c8cf42e8f2/examples/browser/image-processing.html
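For reference, here is a minimal sketch of passing the option and also reading back the angle, which might help show whether any rotation was detected at all. The helper name `ocrImageWithRotation` is hypothetical, and the `rotateRadians` field on the result is an assumption based on the type definitions linked above, so treat this as a debugging aid rather than a confirmed fix:

```javascript
// Sketch only: `ocrImageWithRotation` is a hypothetical helper, not code from ocr.html.
// `worker` is expected to be a Tesseract.js worker; any object with a compatible
// async recognize(image, options) method will do.
async function ocrImageWithRotation(worker, imageUrl) {
  // rotateAuto is passed in the second (options) argument, as in the diff.
  const { data } = await worker.recognize(imageUrl, { rotateAuto: true });
  // Assumption: the result exposes the applied angle as data.rotateRadians.
  // Fall back to null if the field is absent.
  const rotateRadians = data.rotateRadians ?? null;
  return { text: data.text, rotateRadians };
}
```

Logging the returned `rotateRadians` for a page that is known to be rotated would show whether the option is being picked up.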