wincentbalin / pytesstrain

Python tools for Tesseract OCR training
https://pypi.org/project/pytesstrain/
Apache License 2.0

create_ground_truth: does not show error #4

Closed: bertsky closed this issue 2 years ago

bertsky commented 2 years ago

When running with font selectors that cannot be resolved, I just get:

2022-03-29 18:41:29,424 INFO     Processing .txt files
2022-03-29 18:41:33,352 INFO     Generating .tif files
2022-03-29 18:41:34,831 INFO     Done

Unfortunately, there is no log-level switch. Even after setting the level to DEBUG in the code, I only see the Generating messages and nothing further.

By instrumenting some more, I can see that the problem is https://github.com/wincentbalin/pytesstrain/blob/b6a85dec3a02b878f8cee7d8170a75e7dabaeca6/pytesstrain/text2image/pytext2image.py#L57

I can make a log message show me the problem (Could not find font named 'LOV.UnicodeFraktur'. Pango suggested font 'FreeMono'. Please correct --font arg).

But really, what happened to the exception?

wincentbalin commented 2 years ago

The exception does not propagate from the forked process to the main one, or so it seems. I will look into it in a couple of days.
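
For illustration, here is a minimal sketch of one way a parent process can surface a failure from a child process. It is not the pytesstrain code: it assumes the child is an external command started with `subprocess.run`, and the helper name `run_and_log` and the stand-in failing command are hypothetical.

```python
import logging
import subprocess
import sys

logging.basicConfig(format="%(asctime)s %(levelname)-8s %(message)s",
                    level=logging.INFO)


def run_and_log(cmd):
    """Run cmd and log its stderr at ERROR level if it exits non-zero.

    Hypothetical helper for illustration. Returns True on success.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Without this branch the failure stays inside the child process
        # and the parent carries on as if nothing had happened.
        logging.error("subprocess error: %s", result.stderr.strip())
        return False
    return True


# Stand-in for an external tool that fails (assumption, not text2image itself).
failing_cmd = [sys.executable, "-c",
               "import sys; print('Could not find font', file=sys.stderr); sys.exit(1)"]
succeeding_cmd = [sys.executable, "-c", "pass"]

failed_ok = run_and_log(failing_cmd)
passed_ok = run_and_log(succeeding_cmd)
```

The key point is that a child's error only becomes visible if the parent inspects the return code (or re-raises); otherwise the run ends with a plain "Done".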

wincentbalin commented 2 years ago

I've added the error logging in the recent commits. Could you reinstall the module pytesstrain from this repository and rerun your task, please?

bertsky commented 2 years ago

I just did – and :tada: it does work now:

2022-04-04 12:13:52,548 ERROR    subprocess error: Could not find font named 'LOV.UnicodeFraktur'. Pango suggested font 'FreeMono'. Please correct --font arg.

(The overall return value is still 0, but that's okay with me; it's hard to miss the error messages now.)
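
As an aside, a rough sketch of how a command-line tool could turn logged failures into a non-zero exit status. This is an assumption about one possible design, not what pytesstrain does; `process_all` and `fake_worker` are hypothetical names.

```python
import logging


def process_all(jobs, worker):
    """Run worker on every job, log failures, and return an exit status.

    Hypothetical helper: returns 0 if every job succeeded, 1 otherwise.
    """
    failures = 0
    for job in jobs:
        try:
            worker(job)
        except Exception as exc:
            # Keep going so that all errors are reported, but remember the failure.
            logging.error("job %r failed: %s", job, exc)
            failures += 1
    return 1 if failures else 0


def fake_worker(font):
    """Stand-in worker that fails for one font name (illustration only)."""
    if font == "LOV.UnicodeFraktur":
        raise RuntimeError(f"Could not find font named {font!r}")


status = process_all(["FreeMono", "LOV.UnicodeFraktur"], fake_worker)
# A CLI entry point would typically finish with sys.exit(status).
```

Counting failures instead of aborting on the first one keeps the current behavior of reporting every error while still letting scripts detect that something went wrong.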

Thanks a lot @wincentbalin!

(You might want to make a new release, too...)