tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0
Box file generated with same coordinate while using "make training" command, is it correct? #88

Closed 24121987 closed 5 years ago

24121987 commented 5 years ago

Hi,

We are creating a dataset. For each sample we put an image (.jpg) file and a text file into the dataset, and then we ran the "make training" command. Training completed successfully, but we have one concern about the ".box" files: every entry in a box file has the same coordinates.

The attached zip file contains the .box and .tif files; please look at both of them. The .box file has the same coordinates on every line. Is this correct or not?

files.zip

Below are the .box file coordinates.

H 0 0 348 29
e 0 0 348 29
 0 0 348 29
k 0 0 348 29
n 0 0 348 29
e 0 0 348 29
w 0 0 348 29
 0 0 348 29
t 0 0 348 29
h 0 0 348 29
e 0 0 348 29
 0 0 348 29
r 0 0 348 29
u 0 0 348 29
l 0 0 348 29
e 0 0 348 29
s 0 0 348 29
, 0 0 348 29
 0 0 348 29
h 0 0 348 29
e 0 0 348 29
 0 0 348 29
b 0 0 348 29
r 0 0 348 29
o 0 0 348 29
k 0 0 348 29
e 0 0 348 29
 0 0 348 29
t 0 0 348 29
h 0 0 348 29
e 0 0 348 29
m 0 0 348 29
. 0 0 348 29
348 29 349 30

Every file is generated with the same coordinates when using the "make training" command, which runs "generate_line_box.py" internally to generate the .box and .tif files.

kba commented 5 years ago

Yes, that is to be expected. Cf. https://github.com/tesseract-ocr/tesstrain/issues/25 and https://github.com/tesseract-ocr/tesstrain/issues/32 for previous discussions on this.
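
For readers wondering why identical coordinates are expected: LSTM training works on whole text lines, so a line-level box file only needs the bounding box of the full line image, repeated for each character of the transcription. The following is a minimal Python sketch that reproduces the listing quoted in this issue. It is not the actual generate_line_box.py; the function name line_box_entries is hypothetical, the five-field entry format simply mirrors the listing above, and the tab character on the final entry is an assumption (the listing shows only the coordinates).

```python
def line_box_entries(text, width, height):
    """Return line-level box entries for one text line.

    width and height are the pixel dimensions of the line image.
    Hypothetical helper for illustration, not part of tesstrain.
    """
    entries = []
    for char in text:
        # Each character is assigned the box of the whole line image,
        # which is why all coordinates in the file are identical.
        entries.append(f"{char} 0 0 {width} {height}")
    # Assumed end-of-line marker: a tab with a one-pixel box placed
    # just past the top-right corner of the image.
    entries.append(f"\t {width} {height} {width + 1} {height + 1}")
    return entries


entries = line_box_entries("He knew the rules, he broke them.", 348, 29)
print("\n".join(entries))
```

With the 348x29 image from the attached example, every character entry carries "0 0 348 29", and only the final end-of-line entry differs.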

24121987 commented 5 years ago

Thanks, I got the answer.