jonathanpalma / react-native-tesseract-ocr

Tesseract OCR wrapper for React Native
MIT License
561 stars 172 forks

Scanned text with proper formatting #83

Closed mitesh-db closed 4 years ago

mitesh-db commented 4 years ago

Hi @jonathanpalma, is there any way to get the scanned data with its original formatting, such as spaces and tabs? The following image shows what my scanned data looks like.

If I want the spaces preserved too, is there anything I can implement?

[Image: Screenshot_2020-07-22-10-34-26-318_com bill_scan]

jonathanpalma commented 4 years ago

@mitjnextt sure, you can use recognizeTokens and pass the token level you want to recognize (LEVEL_BLOCK | LEVEL_LINE | LEVEL_PARAGRAPH | LEVEL_SYMBOL | LEVEL_WORD). This method returns an array of Tokens, each with its text, confidence, and bounding box:

type Bounding = {
    bottom: number;
    left: number;
    right: number;
    top: number;
};

type Token = {
    token: string;
    confidence: number;
    bounding: Bounding;
};
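
One way to recover spacing from these tokens is to use each token's bounding box: group tokens into lines by their vertical position, then pad each token out to a column derived from its left edge. The following is only a minimal sketch, assuming word-level tokens (LEVEL_WORD) and a roughly uniform character width; `layoutTokens` and `charWidth` are hypothetical names for illustration and are not part of this library.

```typescript
// Token shapes copied from the type definitions above.
type Bounding = { bottom: number; left: number; right: number; top: number };
type Token = { token: string; confidence: number; bounding: Bounding };

// Hypothetical helper (not part of react-native-tesseract-ocr): places
// word-level tokens on a character grid so horizontal gaps become spaces.
// charWidth is an assumed average glyph width in pixels; tune it per image.
function layoutTokens(tokens: Token[], charWidth: number): string {
  // Sort top-to-bottom so tokens on the same text line end up adjacent.
  const sorted = [...tokens].sort((a, b) => a.bounding.top - b.bounding.top);

  // Group into lines: a token joins the current line when its top edge lies
  // above the vertical midpoint of that line's first token.
  const lines: Token[][] = [];
  let current: Token[] = [];
  let midY = -Infinity;
  for (const t of sorted) {
    if (current.length > 0 && t.bounding.top < midY) {
      current.push(t);
    } else {
      current = [t];
      midY = (t.bounding.top + t.bounding.bottom) / 2;
      lines.push(current);
    }
  }

  return lines
    .map((line) => {
      // Order the words of one line left-to-right.
      line.sort((a, b) => a.bounding.left - b.bounding.left);
      let text = "";
      for (const t of line) {
        // Column where this token should start, based on its left edge.
        const column = Math.round(t.bounding.left / charWidth);
        // Pad up to that column, keeping at least one space between words.
        const pad = Math.max(text.length > 0 ? 1 : 0, column - text.length);
        text += " ".repeat(pad) + t.token;
      }
      return text;
    })
    .join("\n");
}
```

The result of a word-level recognizeTokens call could be passed straight to `layoutTokens`. A fixed `charWidth` is a simplification: for proportional fonts or skewed scans, the columns will only be approximate.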

jonathanpalma commented 4 years ago

Hi @mitjnextt, I wanted to follow up on this to check whether what I proposed was useful for you.

mitesh-db commented 4 years ago

Hello @jonathanpalma ,

Since, as you mentioned, this has not been implemented for iOS, I never tried the code above, but I appreciate your help and your reply.

After that, my colleague built the demo with different native code, so in the end we did not use any repo.