naptha / tesseract.js

Pure Javascript OCR for more than 100 Languages 📖🎉🖥
http://tesseract.projectnaptha.com/
Apache License 2.0

Issue with Tesseract.js OCR Integration in Angular Application #935

Closed by btechengg24 2 months ago

btechengg24 commented 3 months ago

I am currently working on integrating Tesseract.js for Optical Character Recognition (OCR) within an Angular application to convert images to text.

However, I have encountered an issue and seek your guidance to resolve it.

Here is a snippet of the code I am using:

import { Component } from '@angular/core';
import { FileUploadModule } from 'primeng/fileupload';
import { createWorker } from 'tesseract.js';

@Component({
  selector: 'app-invoicetotext',
  imports: [FileUploadModule],
  standalone: true,
  templateUrl: './invoicetotext.component.html',
  styleUrls: ['./invoicetotext.component.css'],
})
export class InvoicetotextComponent {
  async onFileSelect(event: any) {
    const file = event.files[0];

    if (file) {
      console.log('Selected file:', file);

      try {
        const path = `localpath`; // Replace with correct file path
        const worker = await createWorker('eng');
        const ret = await worker.recognize(path);

        console.log(ret);
        await worker.terminate();
      } catch (error) {
        console.error('Error processing OCR:', error);
      }
    } else {
      console.error('Please select a valid file.');
    }
  }

  onFileUpload(event: any) {
    console.log(event);
  }
}

When I attempt to run the OCR process using Tesseract.js with the specified local file path, I encounter the following error:

[image: screenshot of the error, not preserved]

Request for Assistance:

Despite reviewing and implementing various approaches, including verifying the file path and ensuring the Tesseract.js configuration is correct, the error persists. I would appreciate any insights or recommendations you may have regarding this issue.

Thank you for your time and assistance.

G Likhit Reddy btechengg24@gmail.com

Balearica commented 3 months ago

Tesseract.js supports both browsers and Node.js; however, they have different entry points. It looks like you are attempting to use the Node.js code in a browser application.

For use in the browser, the relevant files are in the /dist directory. These are already included in the npm version, and if cloning this repo, they can be created by running npm run build. You would import createWorker from the file /dist/tesseract.esm.min.js.

btechengg24 commented 3 months ago

Thank you for responding to my previous query.

I have recently installed the Tesseract.js package in my Angular project and am seeking guidance on properly using the createWorker function. I have found various sources online, but I am finding it difficult to understand and implement the correct approach.

Could you please provide a detailed explanation or share the exact code needed to effectively use the createWorker function in an Angular component? As I am new to working with different packages and plugins, your assistance would be greatly appreciated.

Here is how I am currently importing and using the createWorker method:

import { createWorker } from 'tesseract.js';

const worker = await createWorker('eng');
const ret = await worker.recognize(file);

console.log(ret);
await worker.terminate();

Thank you for your support.

Balearica commented 3 months ago

I have not used Angular before, so example code for Angular would need to be contributed by an Angular user (maybe you, if you figure this out). However, I believe the core issue here was explained in my previous post.

My understanding is that Angular is a front-end framework, so your code is expected to be run in the browser. When you install Tesseract.js from npm, and add import { createWorker } from 'tesseract.js' to your code, you are using the Node.js version. This will not run in the browser.

The ESM browser version is built to the file /dist/tesseract.esm.min.js. Therefore, if you wanted to use ESM in the browser, you would write import { createWorker } from '/dist/tesseract.esm.min.js', adjusting the file path to match your project setup.
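
As a rough, framework-agnostic sketch of the worker lifecycle (not tested in Angular): the helper below takes createWorker as a parameter, so the caller decides which build it is imported from. The names OcrWorker, CreateWorker, and recognizeImage are hypothetical and only for illustration; the interfaces describe just the calls used here and are not the library's own type definitions.

```typescript
// Minimal shapes for the parts of a Tesseract.js worker used in this sketch.
// These are illustrative assumptions, not the library's published typings.
interface OcrWorker {
  recognize(image: unknown): Promise<{ data: { text: string } }>;
  terminate(): Promise<unknown>;
}
type CreateWorker = (lang: string) => Promise<OcrWorker>;

// Hypothetical helper: runs OCR on one image and always releases the worker.
// `createWorker` is injected so it can come from the browser build, e.g.
//   import { createWorker } from '/dist/tesseract.esm.min.js'
// with the path adjusted to the project setup.
async function recognizeImage(
  createWorker: CreateWorker,
  image: unknown,
  lang = 'eng',
): Promise<string> {
  const worker = await createWorker(lang);
  try {
    const result = await worker.recognize(image);
    return result.data.text;
  } finally {
    // Terminate even if recognize() throws, so the worker is not leaked.
    await worker.terminate();
  }
}
```

In a browser, the image argument can be the File object taken from the file input (for example event.files[0] in the component above) rather than a local file system path.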

Balearica commented 2 months ago

Let me know if there is any outstanding bug report or feature request here; otherwise I will close the issue.

btechengg24 commented 2 months ago

I was unable to resolve the issue. Therefore, I will be closing it from my end.