jsreport / jsreport

javascript based business reporting platform :rocket:
https://jsreport.net
GNU Lesser General Public License v3.0

Fonts merge #649

Open pofider opened 5 years ago

pofider commented 5 years ago

Chrome embeds a font subset based on the characters used. If we merge multiple PDFs together, the font typically gets duplicated. The merge with "merge whole document" even duplicates the font on every page. The size increase with "merge whole document" enabled is marginal, so users with PDF size concerns should always go with it.

The solution is to merge fonts during the PDF merge. I'll share some thoughts here and we'll see if we get to it later.

The font in a PDF consists of 3 parts: the font definition, the cmap, and the font stream.

I've tested that we can keep the cmap and font definition as they are. We just need to merge the font streams and produce one font which contains all the characters. Unfortunately, the font created by Chrome crashes opentype.js and some other font libraries.

I found this code to work, and it more or less describes the font structure. The missing piece is taking two fonts and putting the glyphs from both into the merged font. http://stevehanov.ca/blog/?id=143
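
To make the missing step more concrete, here is a minimal sketch of unioning the glyph data of two subsets and rebuilding the TrueType `glyf` and long-format `loca` tables. It rests on assumptions not confirmed in this thread: both subsets come from the same original font, they keep the original glyph ids, and each subset has already been parsed into an array of glyph buffers indexed by glyph id. `mergeGlyphs` and `buildGlyfAndLoca` are hypothetical helper names, not part of pdfjs or any font library.

```javascript
// Hypothetical helper: merge the glyph data of two subsets of the SAME font.
// Each subset is an array indexed by glyph id; glyphs missing from a subset
// are represented by empty Buffers.
function mergeGlyphs (glyphsA, glyphsB) {
  const count = Math.max(glyphsA.length, glyphsB.length)
  const merged = []
  for (let gid = 0; gid < count; gid++) {
    const a = glyphsA[gid] || Buffer.alloc(0)
    const b = glyphsB[gid] || Buffer.alloc(0)
    // Assumption: if both subsets contain a glyph, the outlines are identical.
    merged.push(a.length > 0 ? a : b)
  }
  return merged
}

// Hypothetical helper: build 'glyf' and long-format 'loca' tables.
// 'loca' holds numGlyphs + 1 big-endian uint32 offsets into 'glyf'.
function buildGlyfAndLoca (glyphs) {
  const loca = Buffer.alloc((glyphs.length + 1) * 4)
  const parts = []
  let offset = 0
  glyphs.forEach((glyph, gid) => {
    loca.writeUInt32BE(offset, gid * 4)
    // Pad every glyph to a 4-byte boundary.
    const padding = (4 - (glyph.length % 4)) % 4
    parts.push(glyph, Buffer.alloc(padding))
    offset += glyph.length + padding
  })
  // The final entry marks the end of the last glyph.
  loca.writeUInt32BE(offset, glyphs.length * 4)
  return { glyf: Buffer.concat(parts), loca }
}

// Usage with made-up glyph bytes (not real outlines).
const fontA = [Buffer.from([1, 2, 3, 4]), Buffer.alloc(0), Buffer.from([5, 6])]
const fontB = [Buffer.from([1, 2, 3, 4]), Buffer.from([7, 8, 9]), Buffer.alloc(0)]
const { glyf, loca } = buildGlyfAndLoca(mergeGlyphs(fontA, fontB))
console.log(glyf.length, loca.length)
```

A real merge would also have to update the other tables that depend on the glyph set (for example `maxp`, `hmtx`, and the `indexToLocFormat` field in `head`) and recompute the table checksums; the sketch leaves all of that out.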

Here is some code for pdfjs which can be used to put the merged font into the PDF.

const zlib = require('zlib')
// `resources` (the page's resources dictionary), `xobj` (the merged page) and
// pdfjs's PDFReference class are assumed to be in scope already

// get the font from the page
const descendantFont = resources.get('Font').get('F4').object.properties.get('DescendantFonts')[0].object
const fontStream = descendantFont.properties.get('FontDescriptor').object.properties.get('FontFile2').object.content
// decompress the font stream (assuming it is FlateDecode/zlib compressed)
const decompressedFont = zlib.inflateSync(fontStream.content)

// set the font in the xobject (merged page) to the same one as the original page font
const font = descendantFont.properties.get('FontDescriptor').object.properties.get('FontFile2')
const reference = new PDFReference(font.object)
xobj.page.get('Resources').get('Font').get('F4').object.properties.get('DescendantFonts')[0].object.properties.get('FontDescriptor').object.properties.set('FontFile2', reference)

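
For reference, a small sketch of the decompress/recompress round trip that swapping a font stream would involve, assuming the stream uses the FlateDecode filter (zlib format). The bytes below are stand-ins, not a real font program.

```javascript
const zlib = require('zlib')

// Stand-in bytes; in a real PDF this would be the FontFile2 stream content.
const originalFont = Buffer.from('not a real font, just sample bytes')

// How a FlateDecode stream is stored in the PDF.
const compressed = zlib.deflateSync(originalFont)

// Inflate to get the raw font program back before parsing or merging it.
const decompressed = zlib.inflateSync(compressed)

// ...the merged font would be produced here...

// Deflate again before writing the stream back into the PDF.
const recompressed = zlib.deflateSync(decompressed)
console.log(decompressed.equals(originalFont), recompressed.length)
```

When the stream content changes, the stream dictionary's `Length` (compressed size) and, for `FontFile2`, `Length1` (uncompressed size) need to be updated to match.
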
cotillion commented 5 years ago

We've been encountering this problem while evaluating jsreport.

And it's not just fonts that need to be merged but all identical objects (like logos). I did a simple proof of concept a few months ago which just merged all identical objects, and that had a lot of success in cutting down PDF size.
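
The idea of merging identical objects can be sketched roughly as follows. This is not the proof of concept mentioned above, only an illustration under assumptions: objects are modelled as a hypothetical `{ id, content }` shape with `content` as a Buffer, and `dedupeObjects` is a made-up helper name. Objects with the same content hash are collapsed into the first one seen, and a remap table records which references need to be rewritten.

```javascript
const crypto = require('crypto')

// Hypothetical object shape: { id: number, content: Buffer }
function dedupeObjects (objects) {
  const firstByHash = new Map() // content hash -> id of the first object seen
  const remap = new Map() // duplicate id -> id to reference instead
  const unique = []
  for (const obj of objects) {
    const hash = crypto.createHash('sha256').update(obj.content).digest('hex')
    if (firstByHash.has(hash)) {
      // Same bytes as an earlier object: drop it and point to the original.
      remap.set(obj.id, firstByHash.get(hash))
    } else {
      firstByHash.set(hash, obj.id)
      unique.push(obj)
    }
  }
  return { unique, remap }
}

// Usage with made-up contents: objects 1 and 3 hold the same "logo" bytes.
const logo = Buffer.from('logo bytes')
const { unique, remap } = dedupeObjects([
  { id: 1, content: logo },
  { id: 2, content: Buffer.from('page text') },
  { id: 3, content: Buffer.from(logo) }
])
console.log(unique.map(o => o.id), remap.get(3))
```

In an actual PDF, every reference to a removed object would then have to be rewritten using `remap` before the file is serialized.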

pofider commented 4 years ago

@cotillion I believe the biggest problem is already fixed in the pdf utils master branch. See here.