wincentbalin / pytesstrain

Python tools for Tesseract OCR training
https://pypi.org/project/pytesstrain/
Apache License 2.0

could not find font, despite installed #5

Closed (bertsky closed 2 years ago)

bertsky commented 2 years ago

I have…

fc-list | grep Fraktur
/usr/share/fonts/opentype/mathjax/MathJax_Fraktur-Bold.otf: MathJax_Fraktur:style=Bold
/usr/share/fonts/opentype/mathjax/MathJax_Fraktur-Regular.otf: MathJax_Fraktur:style=Regular
~/.local/share/fonts/Unknown Vendor/TrueType/LOV.UngerFraktur/LOV.UngerFraktur_Regular.ttf: LOV.UngerFraktur:style=Regular
~/.local/share/fonts/Unknown Vendor/OpenType/LOV.UnicodeFraktur/LOV.UnicodeFraktur_Regular.otf: LOV.UnicodeFraktur:style=Regular

…and…

pango-list
MathJax_Fraktur 
  Regular:      MathJax_Fraktur
  Bold:         MathJax_Fraktur Bold
  *Italic:      MathJax_Fraktur Italic
  *Bold Italic: MathJax_Fraktur Bold Italic
LOV.UngerFraktur 
  Regular:      LOV.UngerFraktur
  *Italic:      LOV.UngerFraktur Italic
  *Bold:        LOV.UngerFraktur Bold
  *Bold Italic: LOV.UngerFraktur Bold Italic
LOV.UnicodeFraktur 
  Regular:      LOV.UnicodeFraktur
  *Italic:      LOV.UnicodeFraktur Italic
  *Bold:        LOV.UnicodeFraktur Bold
  *Bold Italic: LOV.UnicodeFraktur Bold Italic

However, when running create_ground_truth -f LOV.UnicodeFraktur,LOV.UngerFraktur I get no image output, with error messages like

2022-04-04 13:35:55,851 ERROR    subprocess error: Could not find font named 'LOV.UngerFraktur'. Pango suggested font 'FreeMono'. Please correct --font arg.
2022-04-04 13:35:55,852 ERROR    subprocess error: Could not find font named 'LOV.UnicodeFraktur'. Pango suggested font 'FreeMono'. Please correct --font arg.

What am I doing wrong?

(Python 3.8.10, Pango 1.0)

wincentbalin commented 2 years ago

This error comes from the text2image utility, so it would help to print the complete command line that gets executed. Then run that command line in a terminal and check where text2image looks for installed fonts.

On second thought, you might try running the create_ground_truth command with the additional option --fonts_dir /usr/share/fonts.
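For anyone unsure which directory to pass, here is a minimal sketch that derives candidate directories from fc-list output like the listing earlier in this thread. It assumes the default fc-list line layout (file path, then ": ", then family and style); the function name font_dirs_from_fc_list is hypothetical and not part of pytesstrain.

```python
import posixpath


def font_dirs_from_fc_list(fc_list_output, family_substring):
    """Return sorted directories of font files whose family name matches.

    Assumes each line looks like "<file path>: <family>:style=<style>".
    """
    dirs = set()
    for line in fc_list_output.splitlines():
        # Split the file path from the description at the first ": "
        path, sep, description = line.partition(": ")
        if not sep:
            continue  # skip lines that do not follow the assumed layout
        family = description.split(":", 1)[0]
        if family_substring in family:
            dirs.add(posixpath.dirname(path.strip()))
    return sorted(dirs)


# Two lines copied from the fc-list output in this thread
sample = (
    "/usr/share/fonts/opentype/mathjax/MathJax_Fraktur-Bold.otf: "
    "MathJax_Fraktur:style=Bold\n"
    "~/.local/share/fonts/Unknown Vendor/OpenType/LOV.UnicodeFraktur/"
    "LOV.UnicodeFraktur_Regular.otf: LOV.UnicodeFraktur:style=Regular\n"
)

for directory in font_dirs_from_fc_list(sample, "LOV."):
    print(directory)
```

A leading ~ is kept as-is here, so it may need to be expanded (for example with os.path.expanduser) before the path is handed to another program.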

bertsky commented 2 years ago

Ah, interesting. If I run just

text2image --list_available_fonts

it does not show anything. But if I pass --fonts_dir ~/.local/share/fonts/ or --fonts_dir /usr/share/fonts/, then it does show the respective fonts.

Hence, using create_ground_truth with -d does work.
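The check above can be scripted. This is only a sketch: it uses the two text2image flags already mentioned in this thread (--list_available_fonts and --fonts_dir), and the helper names list_fonts_command and list_fonts are hypothetical.

```python
import shutil
import subprocess


def list_fonts_command(fonts_dir=None):
    """Build the text2image command line that lists available fonts."""
    cmd = ["text2image", "--list_available_fonts"]
    if fonts_dir is not None:
        # Point text2image at an explicit font directory
        cmd += ["--fonts_dir", fonts_dir]
    return cmd


def list_fonts(fonts_dir=None):
    """Run text2image and return its combined output, or None if it is not installed."""
    if shutil.which("text2image") is None:
        return None
    result = subprocess.run(
        list_fonts_command(fonts_dir), capture_output=True, text=True
    )
    # Return both streams, since the listing and any errors may go to either
    return result.stdout + result.stderr


if __name__ == "__main__":
    for directory in (None, "/usr/share/fonts"):
        print(" ".join(list_fonts_command(directory)))
        print(list_fonts(directory))
```

Comparing the output with and without a directory shows whether text2image needs --fonts_dir on a given system.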

I thought that Tesseract would respect the X font server directly. Perhaps this could be made clear in the README.

Anyway, thanks!

wincentbalin commented 2 years ago

Added the --fonts_dir hint in 507646a.