Surnet / docker-wkhtmltopdf

wkhtmltopdf for multiple base images
https://hub.docker.com/u/surnet/
MIT License

font-family ignored #1

Closed mikehaertl closed 5 years ago

mikehaertl commented 6 years ago

I've included your wkhtmltopdf binary into one of my Alpine 3.7 based images like this:

COPY --from=surnet/alpine-wkhtmltopdf:3.7-0.12.4-small /bin/wkhtmltopdf /bin/wkhtmltopdf

It works fine so far, except for one nasty issue: It seems to ignore any font-family settings in the generated PDF. Is this a known problem? If not, any idea what I'm doing wrong or what else I could try?

More background:

I'm migrating an older Debian based application with wkhtmltopdf 0.11 to an Alpine based image. The old app works just fine, whereas the new one fails to render the correct font. The relevant part of the CSS is:

font-family: serif;
font-size:      24pt;
font-weight:    bold;
line-height:    20pt;

As I understand it, wkhtmltopdf asks fontconfig to deliver a matching font file. So far I've found out that the old app ends up using the n021004l.pfb font file, which is the font fontconfig recommends. The font is also available in the new app and shows up in fc-list there. So I used FC_DEBUG=1 to see what wkhtmltopdf asks for and what fontconfig sends back in return.

The output is listed below. The top block seems to be the "query part" (i.e. what wkhtmltopdf asks for), and the lower block (Best score ...) is what fontconfig thinks is the best match for this query.

The weird thing is that with 0.12.4 it always queries the same list of font families (family: "Helvetica"(s) "TeX Gyre Heros"(s) "Arial"(w) ...), no matter what I set in CSS, whereas in the old app the query is much more in line with my CSS (family: "Times New Roman"(s) "Tinos"(s) ...).

Any feedback is appreciated.
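
To compare the two queries without reading the full family lists, the FC_DEBUG output can be reduced to the first few requested families. The following is a minimal sketch, not part of wkhtmltopdf or fontconfig: the function name `requested_families` is made up for illustration, and it assumes each `family:` line is on a single line (a log that was hard-wrapped by the terminal would have to be rejoined first).

```shell
#!/bin/sh
# Print the first N family names from each "Match Pattern" block of
# FC_DEBUG=1 output read on stdin (N defaults to 5).
# Assumption: every "family:" line is unwrapped, i.e. on one line.
requested_families() {
    awk -v n="${1:-5}" '
        /^Match Pattern/ { in_query = 1; next }
        /^Best score/    { in_query = 0 }
        in_query && /^[ \t]*family:/ {
            out = ""
            count = 0
            line = $0
            # Family names are the double-quoted strings on the line.
            while (count < n && match(line, /"[^"]*"/)) {
                name = substr(line, RSTART + 1, RLENGTH - 2)
                out = out (count ? ", " : "") name
                line = substr(line, RSTART + RLENGTH)
                count++
            }
            print out
            in_query = 0
        }
    '
}

# Usage (hypothetical file names):
#   FC_DEBUG=1 wkhtmltopdf -q test.html test.pdf 2>&1 | requested_families 3
```

Only the query blocks are read; the family reported after "Best score" is skipped, so the result shows what was asked for rather than what fontconfig returned.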

Fontconfig output from the old app

Match Pattern has 21 elts (size 32)
        family: "Times New Roman"(s) "Tinos"(s) "Liberation Serif"(s) "Thorndale"(s) "Thorndale AMT"(s) "Tinos"(s) "Liberation Ser
if"(s) "Thorndale"(s) "Thorndale AMT"(s) "Times"(w) "TeX Gyre Termes"(w) "Nimbus Roman No9 L"(w) "TeX Gyre Termes"(w) "Nimbus Roma
n No9 L"(w) "DejaVu Serif"(w) "DejaVu LGC Serif"(w) "DejaVu LGC Serif"(w) "DejaVu Serif"(w) "DejaVu LGC Serif"(w) "Bitstream Vera 
Serif"(w) "DejaVu LGC Serif"(w) "DejaVu Serif"(w) "DejaVu Serif"(w) "Times New Roman"(w) "Thorndale AMT"(w) "Luxi Serif"(w) "Nimbu
s Roman No9 L"(w) "Times"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Kinnari"(w) "Norasi"(w) "Frank Ruehl"(w) "Dror"(w) "JG LaoTimes"(w) "S
aysettha Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narr
ow"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Mincho"(w) "SimSun"(w) "PMingLiu"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "
AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "ZYSong18030"(w) "HanyiSong"(w) "MgOpen Canonica"(w) "Sazanami Mincho"(w) "IPAMonaMin
cho"(w) "IPAMincho"(w) "Kochi Mincho"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "AR PL Zenkai Uni"(w) "MS 明朝"(w) "ZYSo
ng18030"(w) "NanumMyeongjo"(w) "UnBatang"(w) "Baekmuk Batang"(w) "KacstQura"(w) "Frank Ruehl CLM"(w) "Lohit Bengali"(w) "Lohit Guj
arati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit 
Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Rachana"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya
"(w) "LKLUG"(w) "FreeSerif"(w) "Code2000"(w) "Code2001"(w) "DejaVu Serif"(w) "DejaVu LGC Serif"(w) "Bitstream Vera Serif"(w) "Deja
Vu Serif"(w) "Times New Roman"(w) "Thorndale AMT"(w) "Luxi Serif"(w) "Nimbus Roman No9 L"(w) "Times"(w) "Artsounk"(w) "BPG UTF8 M"
(w) "Kinnari"(w) "Norasi"(w) "Frank Ruehl"(w) "Dror"(w) "JG LaoTimes"(w) "Saysettha Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Com
pset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Mincho"(w) "S
imSun"(w) "PMingLiu"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "ZYSong180
30"(w) "HanyiSong"(w) "MgOpen Canonica"(w) "Sazanami Mincho"(w) "IPAMonaMincho"(w) "IPAMincho"(w) "Kochi Mincho"(w) "AR PL SungtiL
 GB"(w) "AR PL Mingti2L Big5"(w) "AR PL Zenkai Uni"(w) "MS 明朝"(w) "ZYSong18030"(w) "NanumMyeongjo"(w) "UnBatang"(w) "Baekmuk B
atang"(w) "KacstQura"(w) "Frank Ruehl CLM"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Ma
ithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Rachana
"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "FreeSerif"(w) "Code2000"(w) "Code2001"
(w) "serif"(w) "Nazli"(w) "Lotoos"(w) "Mitra"(w) "Ferdosi"(w) "Badr"(w) "Zar"(w) "Nazli"(w) "Lotoos"(w) "Mitra"(w) "Ferdosi"(w) "B
adr"(w) "Zar"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "Helvetica"(w) "TeX Gyre Heros"(w) "Nimbus Sans
 L"(w) "Helvetica"(w) "Times"(w) "Times"(w) "Times New Roman"(w) "Times New Roman"(w) "Times New Roman"(w) "Times New Roman"(w) "A
rial"(w) "Arimo"(w) "Liberation Sans"(w) "Albany"(w) "Albany AMT"(w) "Times New Roman"(w) "Helvetica"(w) "Times"(w) "serif"(w) "se
rif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "
serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "DejaVu Sans"(w) "DejaVu LGC Sans"(w) "DejaVu LGC Sans"(w) "Bitst
ream Vera Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Helvetica"(w) "Luci
da Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w
) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao 
Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"
(w) "Mukti Narrow"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "SimSun"(w) "PMingLiu"(w) "WenQuanYi Zen
 Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Moderna"(w) "MgOpen Modata"(w) "MgOpen C
osmetica"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL K
aitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "Nanum
Gothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(
w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjab
i"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "FreeSans"
(w) "Arial Unicode MS"(w) "Arial Unicode"(w) "Code2000"(w) "Code2001"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w) "sans-
serif"(w) "DejaVu Sans Mono"(w) "DejaVu LGC Sans Mono"(w) "DejaVu LGC Sans Mono"(w) "Bitstream Vera Sans Mono"(w) "DejaVu Sans Mon
o"(w) "Inconsolata"(w) "Andale Mono"(w) "Courier New"(w) "Cumberland AMT"(w) "Luxi Mono"(w) "Nimbus Mono L"(w) "Courier"(w) "Miria
m Mono"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "MS Gothic
"(w) "UmePlus Gothic"(w) "NSimSun"(w) "MingLiu"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung Mono"(w) "HanyiSong"(w) "AR PL Sungti
L GB"(w) "AR PL Mingti2L Big5"(w) "ZYSong18030"(w) "NanumGothicCoding"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmu
k Gulim"(w) "TlwgTypo"(w) "TlwgTypist"(w) "TlwgTypewriter"(w) "TlwgMono"(w) "Hasida"(w) "Mitra Mono"(w) "GF Zemen Unicode"(w) "Hap
ax Berbère"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) 
"Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit 
Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "FreeMono"(w) "monospace"(w) "Terafik"(w) "serif"(w) "serif"(w) "serif"(
w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "serif"(w) "sans-serif"(w) "sans-serif"(w) "sans-serif"(w) "sans-serif"(
w) "sans-serif"(w) "sans-serif"(w) "serif"(w) "monospace"(w) "sans-serif"(w) "serif"(w)
        familylang: "en"(s) "en-us"(w)
        stylelang: "en"(s) "en-us"(w)
        fullnamelang: "en"(s) "en-us"(w)
        slant: 0(i)(s)
        weight: 200(i)(s)
        width: 100(i)(s)
        pixelsize: 32(f)(s)
        hintstyle: 3(i)(s)
        hinting: True(s)
        verticallayout: False(s)
        autohint: False(s)
        globaladvance: True(s)
        outline: True(s)
        lang: "en"(w) "en"(w)
        fontversion: 2147483647(i)(s)
        embeddedbitmap: True(s)
        decorative: False(s)
        lcdfilter: 1(i)(w) 1(i)(w)
        namelang: "en"(s)
        prgname: "wkhtmltopdf-amd64"(s)

Best score 0 0 0 0 0 0 1001 0 1 12 0 0 0 0 1 1 1 1 0 0 1 2.14748e+12
Pattern has 17 elts (size 17)
        family: "Nimbus Roman No9 L"(w)
        style: "Medium"(w)
        slant: 0(i)(w)
        weight: 200(i)(w)
        width: 100(i)(w)
        foundry: "urw"(w)
        file: "/usr/share/fonts/type1/gsfonts/n021004l.pfb"(w)
        index: 0(i)(w)
        outline: True(w)
        scalable: True(w)
        charset: 
        0000: 00000000 ffffffff ffffffff 7fffffff 00000000 ffffffff ffffffff ffffffff
        0001: ffffffff ffffffff ffffffff ffffffff 00040000 00000000 00000000 00000000
        0002: 0f000000 00000000 00000000 00000000 00000000 00000000 3f0002c0 00000000
        0003: 00000000 00000000 00000000 00000000 00100000 10000000 00000000 00000000
        0004: ffffffff ffffffff ffffffff 00000000 fffff000 ffffffff ffff199f 033fffff
        0020: 77180000 06010047 00000010 00000000 00000000 00001000 00000000 00000000
        0021: 00400000 00000004 00000000 00000000 00000000 00000000 00000000 00000000
        0022: 46260044 00000000 00000000 00000031 00000000 00000000 00000000 00000000
        0025: 00000000 00000000 00000000 00000000 00000000 00000000 00000400 00000000
        00f6: 00000000 00000000 00000000 00000000 00000000 00000000 000001f8 00000000
        00fb: 00000006 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(w)
        lang: aa|ab|af|av|ay|ba|be|bg|bi|br|bs|bua|ca|ce|ch|chm|co|cs|cv|da|de|en|eo|es|et|eu|fi|fj|fo|fr|fur|fy|gd|gl|gv|ho|hr|hu
|ia|id|ie|ik|io|is|it|kaa|ki|kk|kl|kum|kv|ky|la|lb|lez|lt|lv|mg|mh|mk|mo|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ro|ru|sah|
se|sel|sh|sk|sl|sma|smj|smn|so|sq|sr|ss|st|sv|sw|tg|tk|tl|tn|tr|ts|tt|tyv|uk|uz|vo|vot|wa|wen|wo|xh|yap|zu|an|crh|csb|fil|hsb|ht|j
v|kj|ku-tr|kwm|lg|li|mn-mn|ms|na|ng|pap-an|pap-aw|rn|rw|sc|sg|sn|su|za(w)
        fontversion: 0(i)(w)
        fontformat: "Type 1"(w)
        decorative: False(w)
        hash: "sha256:372b66a5816f2c31323ef4b56166a6d89356bda6cc6276fa41532fc2f970807d"(w)
        postscriptname: "NimbusRomNo9L-Medi"(w)

Fontconfig output from the new app

Match Pattern has 22 elts (size 32)
        family: "Helvetica"(s) "TeX Gyre Heros"(s) "Arial"(w) "Arimo"(w) "Liberation Sans"(w) "Albany"(w) "Albany AMT"(w) "Helvetica"(w) "Bitstream Vera Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) "Helvetica"(w) "Lucida Sans Unicode"(w) "BPG Glaho International"(w) "Tahoma"(w) "Nachlieli"(w) "Lucida Sans Unicode"(w) "Yudit Unicode"(w) "Kerkis"(w) "ArmNet Helvetica"(w) "Artsounk"(w) "BPG UTF8 M"(w) "Waree"(w) "Loma"(w) "Garuda"(w) "Umpush"(w) "Saysettha Unicode"(w) "JG Lao Old Arial"(w) "GF Zemen Unicode"(w) "Pigiarniq"(w) "B Davat"(w) "B Compset"(w) "Kacst-Qr"(w) "Urdu Nastaliq Unicode"(w) "Raghindi"(w) "Mukti Narrow"(w) "malayalam"(w) "Sampige"(w) "padmaa"(w) "Hapax Berbère"(w) "MS Gothic"(w) "UmePlus P Gothic"(w) "Microsoft YaHei"(w) "Microsoft JhengHei"(w) "WenQuanYi Zen Hei"(w) "WenQuanYi Bitmap Song"(w) "AR PL ShanHeiSun Uni"(w) "AR PL New Sung"(w) "MgOpen Modata"(w) "VL Gothic"(w) "IPAMonaGothic"(w) "IPAGothic"(w) "Sazanami Gothic"(w) "Kochi Gothic"(w) "AR PL KaitiM GB"(w) "AR PL KaitiM Big5"(w) "AR PL ShanHeiSun Uni"(w) "AR PL SungtiL GB"(w) "AR PL Mingti2L Big5"(w) "MS ゴシック"(w) "ZYSong18030"(w) "TSCu_Paranar"(w) "NanumGothic"(w) "UnDotum"(w) "Baekmuk Dotum"(w) "Baekmuk Gulim"(w) "KacstQura"(w) "Lohit Bengali"(w) "Lohit Gujarati"(w) "Lohit Hindi"(w) "Lohit Marathi"(w) "Lohit Maithili"(w) "Lohit Kashmiri"(w) "Lohit Konkani"(w) "Lohit Nepali"(w) "Lohit Sindhi"(w) "Lohit Punjabi"(w) "Lohit Tamil"(w) "Meera"(w) "Lohit Malayalam"(w) "Lohit Kannada"(w) "Lohit Telugu"(w) "Lohit Oriya"(w) "LKLUG"(w) "FreeSans"(w) "Arial Unicode MS"(w) "Arial Unicode"(w) "Code2000"(w) "Code2001"(w) "sans-serif"(w) "Roya"(w) "Koodak"(w) "Terafik"(w) "sans-serif"(w) "sans-serif"(w) "sans-serif"(w) "sans-serif"(w) "Helvetica"(w) "Helvetica"(w)
        familylang: "en"(s) "en-us"(w)
        stylelang: "en"(s) "en-us"(w)
        fullnamelang: "en"(s) "en-us"(w)
        slant: 0(i)(s)
        weight: 200(i)(s)
        width: 100(i)(s)
        size: 30.72(f)(s)
        pixelsize: 32(f)(s)
        hintstyle: 1(i)(w)
        hinting: True(s)
        verticallayout: False(s)
        autohint: False(s)
        globaladvance: True(s)
        outline: True(s)
        lang: "en"(w)
        fontversion: 2147483647(i)(s)
        embeddedbitmap: True(s)
        decorative: False(s)
        namelang: "en"(s)
        prgname: "wkhtmltopdf"(s)
        symbol: False(s)

Best score 0 0 0 0 0 0 1000 0 0 14 0 0 0 0 0 0 0 0 0 0 0 0 0 2.14748e+12
Pattern has 19 elts (size 19)
        family: "Nimbus Sans L"(w)
        style: "Bold"(w)
        stylelang: "en"(w) "en"(w)
        slant: 0(i)(w)
        weight: 200(i)(w)
        width: 100(i)(w)
        foundry: "urw"(w)
        file: "/usr/share/fonts/Type1/n019004l.pfb"(w)
        index: 0(i)(w)
        outline: True(w)
        scalable: True(w)
        charset: 
        0000: 00000000 ffffffff ffffffff 7fffffff 00000000 ffffffff ffffffff ffffffff
        0001: ffffffff ffffffff ffffffff ffffffff 00040000 00000000 00000000 00000000
        0002: 0f000000 00000000 00000000 00000000 00000000 00000000 3f0002c0 00000000
        0003: 00000000 00000000 00000000 00000000 00100000 10000000 00000000 00000000
        0004: ffffffff ffffffff ffffffff 00000000 fffff000 ffffffff ffff199f 033fffff
        0020: 77180000 06010047 00000010 00000000 00000000 00001000 00000000 00000000
        0021: 00000000 00000004 00000000 00000000 00000000 00000000 00000000 00000000
        0022: 06260044 00000000 00000000 00000031 00000000 00000000 00000000 00000000
        0025: 00000000 00000000 00000000 00000000 00000000 00000000 00000400 00000000
        00f6: 00000000 00000000 00000000 00000000 00000000 00000000 00000008 00000000
        00fb: 00000006 00000000 00000000 00000000 00000000 00000000 00000000 00000000
(w)
        lang: aa|ab|af|av|ay|ba|be|bg|bi|br|bs|bua|ca|ce|ch|chm|co|cs|cv|da|de|en|eo|es|et|eu|fi|fj|fo|fr|fur|fy|gd|gl|gv|ho|hr|hu|ia|id|ie|ik|io|is|it|kaa|ki|kk|kl|kum|kv|ky|la|lb|lez|lt|lv|mg|mh|mk|mo|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ro|ru|sah|se|sel|sh|sk|sl|sma|smj|smn|so|sq|sr|ss|st|sv|sw|tg|tk|tl|tn|tr|ts|tt|tyv|uk|uz|vo|vot|wa|wen|wo|xh|yap|zu|an|crh|csb|fil|hsb|ht|jv|kj|ku-tr|kwm|lg|li|mn-mn|ms|na|ng|pap-an|pap-aw|rn|rw|sc|sg|sn|su|za(w)
        fontversion: 0(i)(w)
        fontformat: "Type 1"(w)
        decorative: False(w)
        postscriptname: "NimbusSanL-Bold"(w)
        color: False(w)
        symbol: False(w)
mikehaertl commented 6 years ago

I now also tried with your base image and the following HTML. The font is rendered as sans-serif. There are serif fonts installed and fontconfig finds them. So I'd say it must have to do with wkhtmltopdf:

<!DOCTYPE html>
<html>
<head>
<style>
p {
    font-family: "Times New Roman", Times, serif;
}
</style>
</head>
<body>
    <div style="font-face: serif; font-size: 12px">
        A little test.
    </div>
    <p>Another paragraph.</p>
</body>
</html>
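
For comparing several font-family values one at a time, a small helper can generate one test page per value. This is only a sketch: the function name `write_font_test_page`, the file paths and the 24pt size are illustrative choices, not something taken from the image or from wkhtmltopdf.

```shell
#!/bin/sh
# Write a minimal test page whose only paragraph uses the given
# font-family value.
#   $1 = CSS font-family value (illustrative, e.g. serif)
#   $2 = output file
write_font_test_page() {
    cat > "$2" <<EOF
<!DOCTYPE html>
<html>
<head>
<style>
p { font-family: $1; font-size: 24pt; }
</style>
</head>
<body>
<p>A little test ($1).</p>
</body>
</html>
EOF
}

# Usage (hypothetical paths):
#   write_font_test_page 'serif' /tmp/serif.html
#   write_font_test_page 'sans-serif' /tmp/sans.html
#   FC_DEBUG=1 wkhtmltopdf -q /tmp/serif.html /tmp/serif.pdf
```

If the requested family list in the FC_DEBUG output is identical for both pages, the CSS value is not reaching the fontconfig query.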
mikehaertl commented 6 years ago

@chdanielmueller Any idea here? I think this bug is quite essential. If font rendering does not work correctly most generated PDFs will look pretty weird.

chdanielmueller commented 6 years ago

Hi @mikehaertl,

Thank you for bringing this issue to my attention. I currently do not have much time to address it. If you want to dive deeper into it, please do so and open a pull request afterwards.

I am sorry that I cannot help at the moment.

Regards, Daniel

mikehaertl commented 6 years ago

Ok, thanks for the update. I don't really have time either - but as I'll need a solution soon, maybe I'll be forced to take the time. :smile:

If you have any suggestion or if anything comes to mind where I could start looking, please let me know. I'm completely new to compiling wkhtmltopdf from scratch.

chdanielmueller commented 6 years ago

I took a look at your posted fontconfig outputs. The serif fonts (e.g. Times New Roman) are not listed in the new app.

chdanielmueller commented 6 years ago

fontconfig seems to be working fine

fc-match --verbose "Times New Roman"
Pattern has 35 elts (size 48)
    family: "Liberation Serif"(s)
    familylang: "en"(s)
    style: "Regular"(s)
    stylelang: "en"(s)
    fullname: "Liberation Serif"(s)
    fullnamelang: "en"(s)
    slant: 0(i)(s)
    weight: 80(i)(s)
    width: 100(i)(s)
    size: 12(f)(s)
    pixelsize: 12.5(f)(s)
    foundry: "1ASC"(w)
    hintstyle: 1(i)(w)
    hinting: True(s)
    verticallayout: False(s)
    autohint: False(s)
    globaladvance: True(s)
    file: "/usr/share/fonts/ttf-liberation/LiberationSerif-Regular.ttf"(w)
    index: 0(i)(w)
    outline: True(w)
    scalable: True(w)
    dpi: 75(f)(s)
    scale: 1(f)(s)
    charset: 
    0000: 00000000 ffffffff ffffffff 7fffffff 00000000 ffffffff ffffffff ffffffff
    0001: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
    0002: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
    0003: ffffffff ffffffff ffffffff 7c30ffff ffffd7f0 fffffffb ffff7fff ffffffff
    0004: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
    0005: 3c0fffff 00000000 00000000 00000000 fffe0000 ffffffff ffff00ff 001f07ff
    001d: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff 000007ff c0000000
    001e: ffffffff ffffffff ffffffff ffffffff 4fffffff ffffffff ffffffff 03ffffff
    001f: 3f3fffff ffffffff aaff3f3f 3fffffff ffffffff ffdfffff efcfffdf 7fdcffff
    0020: fffcffff 561dfc47 40000010 81b0fc00 001f0000 003fffff 00000000 00010000
    0021: 00c80020 00004044 78186000 00000000 003f0010 00000100 00000000 00000000
    0022: c6268044 00000a00 00000100 00000033 00000000 00000000 00000000 00000000
    0023: 00010004 00000003 00000000 00000000 00000000 00000000 00000000 00000000
    0025: 11111005 10101010 ffff0000 00001fff 000f1111 14041c03 03009c10 00000040
    0026: 00000000 1c000000 00000005 00008c69 00000000 00000000 00000000 00000000
    002c: 00000000 00000000 00000000 00fe3fff 00000000 00000000 00000000 00000000
    002e: 00800000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    00a7: ff800000 00000003 00000000 00000000 00001f00 00000000 00000000 00000000
    00f0: 00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    00fb: e0000006 5f7fffff 0000ffdb 00000000 00000000 00000000 00000000 00000000
    00fe: 00000000 0000000f 00000000 00000000 00000000 00000000 00000000 00000000
    00ff: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000
(w)
    lang: aa|ab|af|ast|av|ay|az-az|ba|bm|be|bg|bi|bin|br|bs|bua|ca|ce|ch|chm|co|cs|cu|cv|cy|da|de|el|en|eo|es|et|eu|fi|fj|fo|fr|ff|fur|fy|ga|gd|gl|gn|gv|ha|haw|he|ho|hr|hu|ia|ig|id|ie|ik|io|is|it|kaa|ki|kk|kl|ku-am|kum|kv|kw|ky|la|lb|lez|ln|lt|lv|mg|mh|mi|mk|mo|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ro|ru|sah|sco|se|sel|sh|shs|sk|sl|sm|sma|smj|smn|sms|so|sq|sr|ss|st|sv|sw|tg|tk|tl|tn|to|tr|ts|tt|tw|tyv|uk|uz|ve|vi|vo|vot|wa|wen|wo|xh|yap|yi|yo|zu|ak|an|ber-dz|crh|csb|ee|fat|fil|hsb|ht|hz|jv|kab|kj|kr|ku-tr|kwm|lg|li|mn-mn|ms|na|ng|nv|pap-an|pap-aw|qu|quz|rn|rw|sc|sg|sn|su|ty|za(s)
    fontversion: 131072(i)(s)
    capability: "otlayout:cyrl otlayout:grek otlayout:hebr otlayout:latn"(w)
    fontformat: "TrueType"(w)
    embeddedbitmap: True(s)
    decorative: False(s)
    namelang: "en"(s)
    prgname: "fc-match"(s)
    postscriptname: "LiberationSerif"(w)
    color: False(w)
    symbol: False(s)
mikehaertl commented 6 years ago

The serif fonts (e.g. Times New Roman) are not listed in the new app.

Yeah, well, that's exactly the problem. So somehow wkhtmltopdf does not ask fontconfig for "Times New Roman" (or any other serif font) even though I've used a serif font in font-family.

The question is: why? It should do so, as you see from the output of the old app (from wkhtmltopdf on Debian).

chdanielmueller commented 6 years ago

Yeah, why is a good question. At least I was able to confirm it is a wkhtmltopdf (probably build-related) issue. fontconfig itself and the installed fonts are working as expected within the container.

mikehaertl commented 6 years ago

Opened an issue with wkhtmltopdf. Maybe they have some input. I'm not sure if the project is actively maintained lately, though.

chdanielmueller commented 6 years ago

@mikehaertl Please try the command docker run surnet/alpine-wkhtmltopdf:3.7-0.12.4-small nzz.ch - > test.pdf. Somehow the serif fonts from the page nzz.ch are rendered properly.

mikehaertl commented 6 years ago

@chdanielmueller Did you try to set FC_DEBUG=1 before your call? This should give you the same debugging output from fontconfig. Would be interesting to see the differences.

chdanielmueller commented 6 years ago

@mikehaertl I did not manage to set FC_DEBUG=1 and get any output while using wkhtmltopdf. I did get the output while using fc-match. How did you set the environment variable?

mikehaertl commented 6 years ago

I think nzz.ch uses embedded fonts or something. Maybe this bypasses fontconfig matching.

To get FC_DEBUG output I did this:

$ docker run --rm -ti --entrypoint '' surnet/alpine-wkhtmltopdf:3.7-0.12.4-small sh
/ # FC_DEBUG=1 wkhtmltopdf -q nzz.ch x.pdf
mikehaertl commented 6 years ago

What's striking: The fontconfig match pattern from wkhtmltopdf for nzz.ch is almost always the same as in my top post. Only some patterns differ:

GT America"(s) "Bitstream Vera Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) ...

I don't know how wkhtmltopdf creates these match patterns. But what we know:

If I had to reverse-engineer this, I would say: wkhtmltopdf has an internal list of all the font names it knows. If it finds a CSS font-family, it picks all the font names that somehow match the font-family and creates a match pattern for fontconfig.

And this list is missing some very common font names like "Times New Roman", etc.

Could it be, that it uses whatever fonts are available on the system during compile time?

chdanielmueller commented 6 years ago

Maybe this bypasses fontconfig matching.

This seems to be true. When looking at the CSS I can see font-family: nzz-serif,Georgia;. nzz-serif is probably one of their own fonts, which will be downloaded by the browser or in this case by wkhtmltopdf.

wkhtmltopdf has an internal list of all the font names it knows.

This should not be true since I have a buildconfig for qt specifying -fontconfig

Could it be, that it uses whatever fonts are available on the system during compile time?

I suppose not since all fonts are installed together with the build dependencies and before building qt and wkhtmltopdf

I did check the -fontconfig option again on https://doc.qt.io/archives/qtextended4.4/buildsystem/over-configure-options-qt-1.html. It says the following:

Requires fontconfig/fontconfig.h, libfontconfig, freetype.h and libfreetype.

This was not fulfilled at build time... I am now running a Qt build with the packages fontconfig-dev and freetype-dev installed. I will keep you updated.

chdanielmueller commented 6 years ago

Had no luck either... Still the same issue. Tested with FC_DEBUG=1 wkhtmltopdf -q https://www.w3schools.com/cssref/css_websafe_fonts.asp /tmp/test.pdf

test.pdf

Again fonts included on the page (fontawesome) were rendered correctly.

chdanielmueller commented 6 years ago

Confirmed the correct rendering of web loaded fonts by using: FC_DEBUG=1 wkhtmltopdf -q https://www.w3schools.com/howto/howto_google_fonts.asp /tmp/test.pdf

test.pdf

mikehaertl commented 6 years ago

@chdanielmueller On a side note, maybe you could help create an "official" Alpine build? They now have a repository to collect the distribution-specific build/packaging scripts: https://github.com/wkhtmltopdf/packaging

chdanielmueller commented 6 years ago

@ashkulz Do you have an idea on this?

ashkulz commented 6 years ago

Did you push the version you used with fontconfig-dev present when building? That's the only thing I can think of, to be honest. I'm open to merging in the required Qt patches and creating an official variation for alpine, but would appreciate any help ... I'm kind of stuck in implementing macOS and VS2015 support at the moment.

chdanielmueller commented 6 years ago

@ashkulz No I did not push it because it did not work.

What I found out is that it does not use the installed fontconfig. I moved the fontconfig executables to another directory and wkhtmltopdf was still able to function properly. Do you know whether, and where, the fontconfig used by wkhtmltopdf can be reached, e.g. to refresh its cache?

That currently seems like the most likely cause to me: the Qt/wkhtmltopdf internal fontconfig having a wrong cache/config.
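
Moving the fc-* executables does not by itself show which fontconfig the binary uses, since Qt talks to fontconfig through the libfontconfig library rather than through the command-line tools. A more direct check is to look at the shared-library dependencies of the binary. A minimal sketch, assuming `ldd` is available in the image; the function name `links_fontconfig` is made up for illustration:

```shell
#!/bin/sh
# Read `ldd` output on stdin and report whether libfontconfig is listed
# as a shared-library dependency.
links_fontconfig() {
    if grep -q 'libfontconfig\.so'; then
        echo "dynamic: libfontconfig is loaded from the system at run time"
    else
        echo "not listed: fontconfig is either linked statically or not used"
    fi
}

# Usage (binary path taken from the COPY line in the first comment):
#   ldd /bin/wkhtmltopdf | links_fontconfig
```

If the library is listed, the system configuration and cache apply and `fc-cache -f` refreshes them. If it is not listed, a separate cache or config location is at least plausible.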

chdanielmueller commented 5 years ago

@mikehaertl Don't know if this is still an issue for you. Found a solution which works:

apk add --no-cache --virtual .build-deps msttcorefonts-installer \
&& update-ms-fonts \
&& fc-cache -f \
&& apk del .build-deps

Currently building docker images
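
To confirm that the fonts are visible to fontconfig after these steps, the `fc-list` output can be searched for a family name. This is a minimal sketch; the helper name `has_font_family` is illustrative and "Times New Roman" is just the family discussed in this thread.

```shell
#!/bin/sh
# Read `fc-list` output on stdin and succeed if any entry mentions the
# given family name (case-insensitive, fixed-string match).
has_font_family() {
    grep -qiF -- "$1"
}

# Usage, after the apk / update-ms-fonts / fc-cache steps:
#   fc-list | has_font_family "Times New Roman" && echo "installed"
```

This only shows that fontconfig can see the font; it does not show that wkhtmltopdf asks for it, which still requires the FC_DEBUG=1 output.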

mikehaertl commented 5 years ago

@chdanielmueller Very interesting. But did you also see my issue report above? I found that font-family did not work at all for any 0.12.x versions. I wonder why this works here. Maybe you have some input / ideas for the other issue, too?

chdanielmueller commented 5 years ago

@mikehaertl I did see your issue report. I am still a bit confused why my fix works at all. I will post a comment on the issue report as well, as soon as I have deployed the versions with the fix.