oven-sh / bun

Incredibly fast JavaScript runtime, bundler, test runner, and package manager – all in one
https://bun.sh

tesseract.js recognize() uber slow #11350

Open mefistofelix opened 1 month ago

mefistofelix commented 1 month ago

What version of Bun is running?

1.1.10-canary.1+c689b2b26

What platform is your computer?

Microsoft Windows NT 10.0.22631.0 x64

What steps can reproduce the bug?

import ocr from 'tesseract.js'

// One scheduler with a single English worker (OEM 1 = LSTM only)
let ocr_s = ocr.createScheduler()
let ocr_w = await ocr.createWorker('eng', 1, {
  logger: function (m) { console.log(m.progress) }
})
ocr_s.addWorker(ocr_w)

// x.jpg is 64 KB
let ocr_res = await ocr_s.addJob('recognize', 'x.jpg')
console.dir(ocr_res)
process.exit()

What is the expected behavior?

recognize() should run about as fast as it does in Node.js.

What do you see instead?

It takes several minutes on a decent laptop; Node.js needs only about 1 second.

Additional information

No response

Lawlzer commented 4 weeks ago

Also happening for me: ~70 s in Bun versus ~3 s in Node. Here's a slightly simpler test that runs in both Node and Bun, for easy comparison:

// import { createWorker } from 'tesseract.js';
const { createWorker } = require('tesseract.js');

(async () => {
    const start = Date.now();
    const worker = await createWorker('eng', 1, {
        logger: (m) => {
            console.log(m);
        },
    });
    const ret = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
    console.log(ret.data.text);
    await worker.terminate();

    console.log(`total time taken: ${Date.now() - start}ms`);
})();
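
To narrow down where the time goes, it might help to time each phase separately rather than only the total. The following is a rough sketch, not something taken from tesseract.js or Bun: `timePhase` and `profileOcr` are hypothetical helper names, and the only library calls assumed are the `createWorker`, `worker.recognize` and `worker.terminate` calls already used above.

```javascript
// Hypothetical helper: run an async phase and record its duration in ms.
async function timePhase(label, fn, timings) {
    const start = performance.now();
    const result = await fn();
    timings[label] = Math.round(performance.now() - start);
    return result;
}

// Hypothetical helper: time worker creation, recognition and shutdown
// separately. `createWorker` is passed in so the same file runs unchanged
// under Node and Bun.
async function profileOcr(createWorker, image) {
    const timings = {};
    const worker = await timePhase('createWorker', () => createWorker('eng', 1), timings);
    const ret = await timePhase('recognize', () => worker.recognize(image), timings);
    await timePhase('terminate', () => worker.terminate(), timings);

    // process.versions.bun is only defined when running under Bun.
    const runtime = process.versions.bun
        ? `bun ${process.versions.bun}`
        : `node ${process.versions.node}`;
    return { runtime, text: ret.data.text, timings };
}

// Usage (assumes tesseract.js is installed):
// const { createWorker } = require('tesseract.js');
// profileOcr(createWorker, 'https://tesseract.projectnaptha.com/img/eng_bw.png')
//     .then((r) => console.log(r.runtime, r.timings));
```

Running this once under each runtime and comparing the two `timings` objects should show whether the slowdown sits in worker startup or in `recognize()` itself.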