Closed Shreeshrii closed 7 years ago
Please add the Arabic comma (،, U+060C) too.
Any idea when the Eastern Arabic numerals will be added to the language packs?
Added to my local copy for next round of training. Then I will push updated langdata as well.
@theraysmith
I hope you have seen the other comments about using only the Persian digit range for Persian and the Arabic digit range for Arabic.
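For anyone following along, here is a minimal Python sketch of the two digit ranges being discussed. It assumes "Arabic range" means the Arabic-Indic digits (U+0660 to U+0669) and "Persian range" means the Extended Arabic-Indic digits (U+06F0 to U+06F9). The helper name `digit_ranges` is made up for illustration and is not part of Tesseract or langdata.

```python
# Illustrative only: report which digit families appear in a string.
ARABIC_INDIC = range(0x0660, 0x066A)           # digits used with Arabic
EXTENDED_ARABIC_INDIC = range(0x06F0, 0x06FA)  # digits used with Persian


def digit_ranges(text):
    """Return the set of digit families found in text (hypothetical helper)."""
    found = set()
    for ch in text:
        cp = ord(ch)
        if cp in ARABIC_INDIC:
            found.add("arabic-indic")
        elif cp in EXTENDED_ARABIC_INDIC:
            found.add("extended-arabic-indic")
        elif "0" <= ch <= "9":
            found.add("ascii")
    return found
```

A check like this could be run over training text or OCR output to spot Persian digits leaking into Arabic data, or the reverse.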
Yes, I hope the experts also see my question about the Arabic-script languages not mentioned in those issues (kur_ara, pus, uig).
On Mon, Aug 7, 2017 at 6:38 PM, Shreeshrii wrote:
@theraysmith I hope you have seen the other comments about using only the Persian digit range for Persian and the Arabic digit range for Arabic.
-- Ray.
Hi guys, I'm using Tesseract 4 with ara.traineddata to extract text from an image. It works well for letters, but numbers are not recognized well at all. From the comments above, there should be some other traineddata for numbers only. Can anybody point me to where I can find it?
Thanks a lot.
It seems that Ray didn't push the data to our side (langdata_lstm and best/fast repos).
This issue should be re-opened.
Please see https://github.com/tesseract-ocr/tesseract/issues/858
Include both 0-9 and (٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩) for Arabic.
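To make the two digit sets concrete, here is a small Python sketch that maps between 0-9 and the Arabic-Indic digits (U+0660 to U+0669). It is only an illustration of the correspondence, for example for producing both forms of a number or normalizing OCR output afterwards. The names `TO_ASCII`, `TO_ARABIC_INDIC`, and `both_forms` are hypothetical and have nothing to do with Tesseract's own tooling.

```python
# Illustrative only: convert between ASCII digits and Arabic-Indic digits.
ASCII_DIGITS = "0123456789"
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))

# Translation tables for str.translate, one per direction.
TO_ASCII = str.maketrans(ARABIC_INDIC_DIGITS, ASCII_DIGITS)
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)


def both_forms(number_text):
    """Return (ascii_form, arabic_indic_form) of the given text (hypothetical helper)."""
    return number_text.translate(TO_ASCII), number_text.translate(TO_ARABIC_INDIC)
```

Characters other than digits pass through unchanged, so the same tables can be applied to whole lines of text.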