Closed by royudev 4 years ago
How are you running the training?
I ran it after a fresh install and it worked fine. See attached log.
git clone https://github.com/tesseract-ocr/tesstrain.git
cd tesstrain
mkdir data
unzip /home/ubuntu/tesstrain/ocrd-testset.zip -d data/foo-ground-truth
nohup make training &
I followed the same steps you did:
git clone https://github.com/tesseract-ocr/tesstrain.git
cd tesstrain
mkdir data
Downloaded the zip file and unzipped it into data/foo-ground-truth
make training
@royudev I cannot reproduce your error. Could you please run the exact steps which @Shreeshrii proposed and post the file nohup.out.txt here? In addition, please run ls -l on data and on data/foo-ground-truth.
Hi @wrznr, I was able to make it work. I've attached the nohup.out.txt.
Here's the ls -l of data:
total 256
drwxrwxrwx 1 vagrant vagrant 262144 Mar 13 18:01 foo-ground-truth
The weird thing is that when I use only one .tif file and its matching .gt.txt in foo-ground-truth (for example, alexis_ruhe01_1852_0018_022.gt.txt and alexis_ruhe01_1852_0018_022.tif), I always get "Error: missing ground truth for training".
You need lines for training as well as for evaluation. The default ratio is 9:1 (I think), so use at least 10 pairs of line images and ground truth text.
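The advice above can be turned into a quick pre-flight check before running make training. The following is only a sketch: count_gt_pairs and has_enough_gt are hypothetical helper names that are not part of tesstrain, the default threshold of 10 is simply the number suggested above for a 9:1 ratio, and the check assumes that every line image is a .tif with a .gt.txt transcription of the same base name, as in the example file names in this thread. It does not reproduce how tesstrain itself splits the data.

```shell
#!/bin/sh
# Hypothetical helper (not part of tesstrain): count the .tif images in a
# directory that have a matching .gt.txt transcription next to them.
count_gt_pairs() {
    dir=$1
    n=0
    for img in "$dir"/*.tif; do
        # Skip the unexpanded pattern when the directory has no .tif files.
        [ -e "$img" ] || continue
        if [ -e "${img%.tif}.gt.txt" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: succeed only if the directory holds at least
# min_pairs pairs. The default of 10 is an assumption taken from the
# advice in this thread for the default 9:1 ratio.
has_enough_gt() {
    min_pairs=${2:-10}
    [ "$(count_gt_pairs "$1")" -ge "$min_pairs" ]
}
```

For example, has_enough_gt data/foo-ground-truth || echo "too few ground truth pairs" would print the warning for a directory that contains a single .tif/.gt.txt pair.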
Hi @royudev, I have the same error. How did you fix this?
@atuanbk58 Please see the answer by @Shreeshrii: it is not fixable. You will need at least two lines of GT when setting the ratio to 1:1, or ten lines with the default ratio of 9:1 between training and test set. However, neither scenario will result in a usable model. For training OCR, you will need several hundred GT lines.
I've followed the instructions on how to train with images, but I keep getting this error:
Error: missing ground truth for training
The command I used was make training.
The image and ground truth text are from the same repo:
ocrd-testset.zip
What could be the solution to fix this?