Open Snuffleupagus opened 2 years ago
Would this reduce the size of the built-in PDF.js viewer?
Would this reduce the size of the built-in PDF.js viewer?
Difficult to tell without trying, and it might also be possible to compact the Liberation fonts. However, please note that as mentioned above we ship some "duplicated" font-data and the Foxit fonts are also "incomplete".
OK, thanks. It seems worth investigating since on GeckoView size is important.
I need to make some tests to be sure that everything is fine, but my plan is to convert the different Liberations fonts we need (except LiberationSans because we want to be able to rescale it on the fly for XFA) into WOFF2 format. I hope we could limit the total size increase to something between 200 and 300Kb which should acceptable on Android.
Please keep the Node.js, or browsers with disableFontFace = true
set, use-case in mind since we then load and parse the standard-font completely on the worker-thread (including using the FontRenderer
).
@Snuffleupagus, do you have any idea on how disableFontFace = true
is useful:
At least in the Firefox context could you imagine a case where this option is really useful ?
Especially if we remove it what kind of complains could we have ?
I tend to think that almost nobody change this kind of option in about:config
so it should false
for most of the user.
I want to keep the case where we need to fallback on the FontRenderer to draw the glyphs but do you think it's possible to fail when binding a WOFF2 font ?
If it's really necessary to keep the option, in case it's set to true then we could just keep the fontface stuff for the woff2 standard fonts and use the FontRenderer for the other ones, wdyt ? I'd prefer not having such exceptions it's why I'd prefer removing the option.
do you have any idea on how
disableFontFace = true
is useful:
- in the Firefox context ?
- in the browser context ?
As previously mentioned, that option is mostly used in Node.js environments. Hence if you remove that you've essentially made PDF.js significantly less usable in Node.js, which is bound to lead to a never ending stream of user complaints (and it'd probably be an api-major change).
I want to keep the case where we need to fallback on the FontRenderer to draw the glyphs but do you think it's possible to fail when binding a WOFF2 font ?
Please note that all of this requires the font-data to be parse-able by the code in src/core/fonts.js
, possibly with the aid of our Type1/CFF parsers, since I really cannot imagine how it'd work otherwise.
Edit: Note also the "FontFallback"-code, on both the main- and worker-threads, when a font for some reason fails to load (e.g. if it's rejected by the sanitizer).
If the WOFF2 format is sufficiently compatible with TrueType and/or OpenType, you might be able to "extract" usable parts from it such that our existing font-parsing works; see e.g. how the TrueTypeCollection-support is implemented.
Edit: Also, if the user has disabled web-fonts in the browser we fallback to disableFontFace=true
to actually be able to render the document properly; note this code.
Afaik WOFF2 format is different of the usual formats (at least because data are compressed with brotli), so we won't be able to pass a woff2 file to any parsers we've. My idea is just to use it as a possible fallback like here: https://github.com/mozilla/pdf.js/pull/16363/files#diff-84365a5fa783e74d4c83766a21305541dedb5d8ccefa51df9f95fa3a38bf0cfcR159 Can you imagine any drawbacks for this approach ?
Edit: Also, if the user has disabled web-fonts in the browser we fallback to
disableFontFace=true
to actually be able to render the document properly; note this code.
Interesting, I need to think about that.
My idea is just to use it as a possible fallback like here: https://github.com/mozilla/pdf.js/pull/16363/files#diff-84365a5fa783e74d4c83766a21305541dedb5d8ccefa51df9f95fa3a38bf0cfcR159 Can you imagine any drawbacks for this approach ?
As mentioned that simply won't work in either Node.js environments or when we fallback to disableFontFace=true
in the viewer when web-fonts are disabled in the browser. In both of those cases we need to fetch the standard font-data and then parse it just like we'd do with a "normal" embedded PDF font before finally using the FontRenderer
.
Basically, if you want either of those cases to still work then the standard font-data files need to still be parse-able by the "regular" PDF.js font-code as far as I can tell.
Another idea/question: Could we perhaps convert the Liberation fonts to CFF instead? That should be more compact than TrueType, while still retaining the ability to pass them to the regular PDF.js font-code; please also see https://github.com/mozilla/pdf.js/pull/12726#issuecomment-854841277.
There are few tools we can play with to convert the font:
For example with the LiberationSans-Regular.ttf we've in the repo:
makeotf -f LiberationSans-Regular.ttf -o out.otf
fails because of FATAL: cmap{plat=3,script=1,lang=0}: multiple glyphs (uni00A0,dagger) mapped to code <A0>
pyftsubset LiberationSans-Regular.ttf --unicodes="0000-FFFF" --ignore-missing-glyphs --drop-tables-="GPOS","name","GDEF" --layout-scripts="" --no-subset-tables+="FFTM"
Just doing that reduces the size from 139K to126K... it isn't exciting and it's because of hinting code.
So it's possible to strip out hinting in using ttfautohint --dehint LiberationSans-Regular.ttf out.ttf
.
At the end we've something (after ttfautohint, pyftsubset and makeotf) we've a file around 69K.
But now if I run on the file generated by the subsetter (so no makeotf) the woff2 convert (I managed to run it from python shell and fonttools package) I get a file around 30K but if I run the compressor on the otf file then it's around 68K.
If now I run ttfautohint without -dehint (to keep some hinting informations), then run the subsetter and finally compress into woff2 I get a 36K files (and we've hinting data).
So about node.js we can keep ttf fonts (and strip out whatever we want). About Firefox, at least for android (which only have Droid fonts as system fallback) I'd really prefer using woff2 (and if we need, just add a specific check in the Firefox code to allow web-fonts with a pdfjs principal). Potentially we could do the same with desktop. In both cases I need to know if there are any potential drawbacks for adding this specific check. For other browsers we can do the same as in node.js.
I'll ask tuesday (monday is a bank holiday for a lot of people) if we have some telemetry about disabled web-fonts.
(and if we need, just add a specific check in the Firefox code to allow web-fonts with a pdfjs principal). Potentially we could do the same with desktop. In both cases I need to know if there are any potential drawbacks for adding this specific check.
If I were to guess, I'd say that users probably disable web-fonts for security reasons (and I cannot imagine it being common).
There's obviously a question if the fonts embedded in PDF documents should actually be considered "web-fonts" as such, and it might indeed be fine to just allow the built-in Firefox PDF Viewer to always use PDF fonts (regardless of user-settings). Please note: Given the font-parsing code in the PDF.js library we're not actually passing the PDF fonts as-is to the browser, and the fonts are also sanitized (with OTS) in the browser before being passed on to the OS.
As seen in https://github.com/mozilla/pdf.js/pull/12726#issuecomment-853375348 the Foxit fonts are not "complete" enough, which leads to broken rendering in some PDF documents. (This is one reason that we're not unconditionally always loading the standard font data-files, another being that doing so affects performance according to the discussion in that PR.)
Furthermore, as suggested and agreed upon in https://github.com/mozilla/pdf.js/pull/12726#issuecomment-854841277, https://github.com/mozilla/pdf.js/pull/12726#issuecomment-854916134, and https://github.com/mozilla/pdf.js/pull/12726#issuecomment-854975634, the idea was to replace the Foxit fonts (except
FoxitSymbol.pfb
andFoxitDingbats.pfb
) with the more complete Liberation fonts instead.Note that we already ship some Liberation fonts, for use with XFA-documents, and that we thus currently have some duplication in the standard font files. Hence we should replace the
FoxitFixed...
andFoxitSerif...
fonts with their Liberation equivalents./cc @calixteman