Closed mikehaertl closed 5 years ago
I now also tried with your base image and the following HTML. The font is rendered as sans-serif. There are serif fonts installed and fontconfig finds them. So I'd say it must have to do with wkhtmltopdf:
<!DOCTYPE html>
<html>
<head>
<style>
p {
font-family: "Times New Roman", Times, serif;
}
</style>
</head>
<body>
<div style="font-face: serif; font-size: 12px">
A little test.
</div>
<p>Another paragraph.</p>
</body>
</html>
@chdanielmueller Any idea here? I think this bug is quite essential. If font rendering does not work correctly most generated PDFs will look pretty weird.
Hi @mikehaertl,
Thank you for bringing this issue to my attention. I currently do not have much time to address it. If you want to dive deeper into it, please do so and open a pull request afterwards.
I am sorry that I cannot help at the moment.
Regards, Daniel
Ok, thanks for the update. I don't really have time either - but as I'll need a solution soon, maybe I'll be forced to take the time. :smile:
If you have any suggestion or if anything comes to mind where I could start looking, please let me know. I'm completely new to compiling wkhtmltopdf from scratch.
I took a look at your posted fontconfig outputs. The serif fonts (e.g. Times New Roman) are not listed in the new app.
fontconfig seems to be working fine
fc-match --verbose "Times New Roman"
Pattern has 35 elts (size 48)
family: "Liberation Serif"(s)
familylang: "en"(s)
style: "Regular"(s)
stylelang: "en"(s)
fullname: "Liberation Serif"(s)
fullnamelang: "en"(s)
slant: 0(i)(s)
weight: 80(i)(s)
width: 100(i)(s)
size: 12(f)(s)
pixelsize: 12.5(f)(s)
foundry: "1ASC"(w)
hintstyle: 1(i)(w)
hinting: True(s)
verticallayout: False(s)
autohint: False(s)
globaladvance: True(s)
file: "/usr/share/fonts/ttf-liberation/LiberationSerif-Regular.ttf"(w)
index: 0(i)(w)
outline: True(w)
scalable: True(w)
dpi: 75(f)(s)
scale: 1(f)(s)
charset:
0000: 00000000 ffffffff ffffffff 7fffffff 00000000 ffffffff ffffffff ffffffff
0001: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
0002: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
0003: ffffffff ffffffff ffffffff 7c30ffff ffffd7f0 fffffffb ffff7fff ffffffff
0004: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
0005: 3c0fffff 00000000 00000000 00000000 fffe0000 ffffffff ffff00ff 001f07ff
001d: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff 000007ff c0000000
001e: ffffffff ffffffff ffffffff ffffffff 4fffffff ffffffff ffffffff 03ffffff
001f: 3f3fffff ffffffff aaff3f3f 3fffffff ffffffff ffdfffff efcfffdf 7fdcffff
0020: fffcffff 561dfc47 40000010 81b0fc00 001f0000 003fffff 00000000 00010000
0021: 00c80020 00004044 78186000 00000000 003f0010 00000100 00000000 00000000
0022: c6268044 00000a00 00000100 00000033 00000000 00000000 00000000 00000000
0023: 00010004 00000003 00000000 00000000 00000000 00000000 00000000 00000000
0025: 11111005 10101010 ffff0000 00001fff 000f1111 14041c03 03009c10 00000040
0026: 00000000 1c000000 00000005 00008c69 00000000 00000000 00000000 00000000
002c: 00000000 00000000 00000000 00fe3fff 00000000 00000000 00000000 00000000
002e: 00800000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00a7: ff800000 00000003 00000000 00000000 00001f00 00000000 00000000 00000000
00f0: 00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00fb: e0000006 5f7fffff 0000ffdb 00000000 00000000 00000000 00000000 00000000
00fe: 00000000 0000000f 00000000 00000000 00000000 00000000 00000000 00000000
00ff: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000
(w)
lang: aa|ab|af|ast|av|ay|az-az|ba|bm|be|bg|bi|bin|br|bs|bua|ca|ce|ch|chm|co|cs|cu|cv|cy|da|de|el|en|eo|es|et|eu|fi|fj|fo|fr|ff|fur|fy|ga|gd|gl|gn|gv|ha|haw|he|ho|hr|hu|ia|ig|id|ie|ik|io|is|it|kaa|ki|kk|kl|ku-am|kum|kv|kw|ky|la|lb|lez|ln|lt|lv|mg|mh|mi|mk|mo|mt|nb|nds|nl|nn|no|nr|nso|ny|oc|om|os|pl|pt|rm|ro|ru|sah|sco|se|sel|sh|shs|sk|sl|sm|sma|smj|smn|sms|so|sq|sr|ss|st|sv|sw|tg|tk|tl|tn|to|tr|ts|tt|tw|tyv|uk|uz|ve|vi|vo|vot|wa|wen|wo|xh|yap|yi|yo|zu|ak|an|ber-dz|crh|csb|ee|fat|fil|hsb|ht|hz|jv|kab|kj|kr|ku-tr|kwm|lg|li|mn-mn|ms|na|ng|nv|pap-an|pap-aw|qu|quz|rn|rw|sc|sg|sn|su|ty|za(s)
fontversion: 131072(i)(s)
capability: "otlayout:cyrl otlayout:grek otlayout:hebr otlayout:latn"(w)
fontformat: "TrueType"(w)
embeddedbitmap: True(s)
decorative: False(s)
namelang: "en"(s)
prgname: "fc-match"(s)
postscriptname: "LiberationSerif"(w)
color: False(w)
symbol: False(s)
The serif fonts (e.g. Times New Roman) are not listed in the new app.
Yeah, well, that's exactly the problem. So somehow wkhtmltopdf does not ask fontconfig for "Times New Roman" (or any other serif font) even though I've used a serif font in font-family.
The question is: why? It should do so, as you see from the output of the old app (from wkhtmltopdf on Debian).
Yeah, why is a good question. At least I was able to confirm it is a wkhtmltopdf (probably build-related) issue. fontconfig itself and the installed fonts are working as expected within the container.
Opened an issue with wkhtmltopdf. Maybe they have some input. I'm not sure if the project is much maintained lately, though.
@mikehaertl Please try the command docker run surnet/alpine-wkhtmltopdf:3.7-0.12.4-small nzz.ch - > test.pdf
Somehow the serif fonts from the page nzz.ch are rendered properly.
@chdanielmueller Did you try to set FC_DEBUG=1 before your call? This should give you the same debugging output from fontconfig. Would be interesting to see the differences.
@mikehaertl I did not manage to set FC_DEBUG=1 and get any output while using wkhtmltopdf. I did get the output while using fc-match. How did you set the environment variable?
I think, nzz.ch uses embedded fonts or something. Maybe this bypasses fontconfig matching.
To get FC_DEBUG output I did this:
$ docker run --rm -ti --entrypoint '' surnet/alpine-wkhtmltopdf:3.7-0.12.4-small sh
/ # FC_DEBUG=1 wkhtmltopdf -q nzz.ch x.pdf
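To compare what wkhtmltopdf asks for across runs, it can help to reduce the FC_DEBUG output to just the family lists. A minimal sketch, assuming the debug output uses the same `family: ...` line shape as the fc-match dump above; `fc_families` is a hypothetical helper name, not part of fontconfig:

```shell
# Hypothetical helper: keep only the distinct "family:" lines from
# fontconfig debug output read on stdin ("familylang:" lines are skipped).
fc_families() {
  grep -E '^[[:space:]]*family:' | sed -E 's/^[[:space:]]+//' | sort -u
}

# Usage inside the container (assumes the debug text arrives on stdout or stderr):
#   FC_DEBUG=1 wkhtmltopdf -q nzz.ch x.pdf 2>&1 | fc_families
```

If every page produces the same single family line, the CSS font-family is probably not reaching the fontconfig query at all.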
What's striking: The fontconfig match pattern from wkhtmltopdf for nzz.ch is almost always the same as in my top post. Only some patterns differ:
"GT America"(s) "Bitstream Vera Sans"(w) "DejaVu Sans"(w) "Verdana"(w) "Arial"(w) "Albany AMT"(w) "Luxi Sans"(w) "Nimbus Sans L"(w) "Nimbus Sans"(w) ...
I don't know, how wkhtmltopdf creates these match patterns. But what we know:
If you should reverse-engineer this, I would say: wkhtmltopdf has an internal list of all the font names it knows. If it finds a CSS font-family, it picks all the font names that somehow match the font-family and creates a match pattern for fontconfig.
And this list is missing some very common font names like "Times New Roman", etc.
Could it be, that it uses whatever fonts are available on the system during compile time?
Maybe this bypasses fontconfig matching.
This seems to be true.
When looking at the CSS I can see font-family: nzz-serif,Georgia;
nzz-serif is probably one of their own fonts, which will be downloaded by the browser or, in this case, by wkhtmltopdf.
wkhtmltopdf has an internal list of all the font names it knows.
This should not be true since I have a buildconfig for qt specifying -fontconfig
Could it be, that it uses whatever fonts are available on the system during compile time?
I suppose not since all fonts are installed together with the build dependencies and before building qt and wkhtmltopdf
I did check the -fontconfig option again on https://doc.qt.io/archives/qtextended4.4/buildsystem/over-configure-options-qt-1.html.
It says the following:
Requires fontconfig/fontconfig.h, libfontconfig, freetype.h and libfreetype.
This was not fulfilled during build time...
I am now running a qt build with the packages fontconfig-dev and freetype-dev installed.
I will keep you updated.
Had no luck either... Still the same issue.
Tested with FC_DEBUG=1 wkhtmltopdf -q https://www.w3schools.com/cssref/css_websafe_fonts.asp /tmp/test.pdf
Again fonts included on the page (fontawesome) were rendered correctly.
Confirmed the correct rendering of web loaded fonts by using: FC_DEBUG=1 wkhtmltopdf -q https://www.w3schools.com/howto/howto_google_fonts.asp /tmp/test.pdf
@chdanielmueller On a sidenote, maybe you could help creating an "official" Alpine build? They now have a repository to collect the distribution specific build/packaging scripts: https://github.com/wkhtmltopdf/packaging
@ashkulz Do you have an idea on this?
Did you push the version you used with fontconfig-dev present when building? That's the only thing I can think of, to be honest. I'm open to merging in the required Qt patches and creating an official variation for alpine, but would appreciate any help ... I'm kind of stuck in implementing macOS and VS2015 support at the moment.
@ashkulz No I did not push it because it did not work.
What I found out is that it does not use the installed fontconfig. I moved the fontconfig executables to another directory and wkhtmltopdf was still able to function properly. Do you know where/if the fontconfig used by wkhtmltopdf is reachable? -> for a refresh of the cache.
That currently seems like the most likely issue to me: the qt/wkhtmltopdf internal fontconfig having a wrong cache/config.
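Moving the fc-* executables mainly shows that wkhtmltopdf does not shell out to them; a Qt build configured with -fontconfig would normally call the library directly. One way to check which case applies is to look at the dynamic dependencies of the binary. A rough sketch, assuming `ldd` is available in the image; `links_fontconfig` is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the given binary is dynamically linked
# against libfontconfig, fails otherwise (including when ldd cannot read it).
links_fontconfig() {
  ldd "$1" 2>/dev/null | grep -q 'libfontconfig'
}

# Usage inside the container:
#   if links_fontconfig "$(command -v wkhtmltopdf)"; then
#     echo "uses the shared libfontconfig"
#   else
#     echo "no dynamic libfontconfig dependency found"
#   fi
```

No match would suggest fontconfig is either linked statically or not used at all, which would be worth knowing before chasing cache problems.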
@mikehaertl Don't know if this is still an issue for you. Found a solution which works:
apk add --no-cache --virtual .build-deps msttcorefonts-installer \
&& update-ms-fonts \
&& fc-cache -f \
&& apk del .build-deps
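To confirm that the step above actually registered the fonts, the family list from fc-list can be checked afterwards. A small sketch, not part of the original fix; `has_family` is a hypothetical helper name:

```shell
# Hypothetical helper: succeeds if the family name given as $1 appears
# (case-insensitively, as a fixed string) in the text read on stdin.
has_family() {
  grep -qiF -- "$1"
}

# Usage inside the container:
#   fc-list : family | has_family "Times New Roman" && echo "Times New Roman is installed"
```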
Currently building docker images
@chdanielmueller Very interesting. But did you also see my issue report above? I found that font-family did not work at all for any 0.12.x versions. I wonder why this works here. Maybe you have some input / ideas for the other issue, too?
@mikehaertl I did see your issue report. I am still a bit confused why my fix works at all. I will post a comment on the issue report as well, as soon as I have deployed the versions with the fix.
I've included your wkhtmltopdf binary into one of my Alpine 3.7 based images like this:
It works fine so far, except for one nasty issue: It seems to ignore any font-family settings in the generated PDF. Is this a known problem? If not, any idea what I'm doing wrong or what else I could try?
More background:
I'm migrating an older Debian based application with wkhtmltopdf 0.11 to an Alpine based image. The old app works just fine, whereas the new one fails to render the correct font. The relevant part of the CSS is:
As I understand it, wkhtmltopdf asks fontconfig to deliver a matching font file. So far I found out that in the end the old app uses the n021004l.pfb font file. This is the font recommended by fontconfig. The font is also available under the new app and listed there under fc-list. So I used FC_DEBUG=1 to see what wkhtmltopdf asks for and what fontconfig sends back in return.
The output is listed below. The top block seems to be the "query part" (AKA what wkhtmltopdf asks for). And the lower block (Best score ...) is what fontconfig thinks is the best match for this query.
The weird thing is, that with 0.12.4 it always queries the same list of font families (family: "Helvetica"(s) "TeX Gyre Heros"(s) "Arial"(w)...) , no matter what I set in CSS. Whereas in the old app the query is much more in line with my CSS (family: "Times New Roman"(s) "Tinos"(s)..).
Any feedback is appreciated.
Fontconfig output from the old app
Fontconfig output from the new app