jsreport / jsreport-pdf-utils

jsreport extension providing pdf operations like merge or concatenation
MIT License

explore the idea of pdf compression #9

Closed bjrmatos closed 5 years ago

bjrmatos commented 6 years ago

seems like in some cases there is a big reduction in size when running pdf compression

BrandonCopley commented 6 years ago

We need PDF compression. We are seeing a 10x reduction (20MB to 200KB) when we compress the PDF in Adobe.

colinhemphill commented 6 years ago

Thanks for your response and for tracking the request. Unfortunately the exact HTML that I'm using for this report is sensitive internal company material, but I'll think about whether there's a way I can create a similar sample.

The largest part of the template is probably the images. There is a relatively small logo image in the PDF, and two JPGs that are user-submitted photos. Each photo is about 2 MB, so it's definitely not the size of the images themselves that adds up to a 25 MB output. Like you mentioned, however, it could be metadata embedded in the report.

bjrmatos commented 6 years ago

thanks for the details. if you have some html without sensitive information that you can share, that would be great just for comparison. if not, that is ok too. we will discuss this task internally and see how it can be implemented.

pofider commented 6 years ago

It would really help if you could provide a sample report we can look at. I tried testing with some big images, but the output PDF size is still fine (the same as the image).

colinhemphill commented 6 years ago

I'll see if I can put something together - thanks.

Matzu89 commented 5 years ago

Unfortunately I ran into the same issue. Some PDFs are up to 40MB, which is a little bit too large!

So what is the status of this issue? If I understand correctly we are waiting for some example data with images from @colinhemphill?

bjrmatos commented 5 years ago

@Matzu89 i think it would be nice to see a real PDF example (produced by jsreport) in which pdf compression reduces the size a lot. as @BrandonCopley said, they saw a 10x reduction in size after compression, however there is no pdf attached that we can use to test a future implementation. so having a real PDF will help with this.

Matzu89 commented 5 years ago

@bjrmatos I'm certain it's not a jsreport issue. It comes from chrome-pdf and has been a known bug since 2014 (https://bugs.chromium.org/p/chromium/issues/detail?id=414976).

With phantom PDF I get a 3MB file for the same data. It's crazy. I moved away from phantom PDF because of phantom :)

I don't think JSReport should implement a compression solution because of a Chromium bug. Maybe @colinhemphill has found a different solution, at which point I think it's better if documentation points to this solution.

Meanwhile, thinking about moving back to phantom.

bjrmatos commented 5 years ago

@Matzu89 true, it is something chrome should fix to improve the final size. however, we still think there is room for a solution on our side if we see real gains: the chrome issue has been open since 2014, so i'm sure it will take a long time until it gets resolved (if it ever gets resolved). maybe pdf compression is still good to have in some other scenarios, but let's see if someone wants to share a good example. in any case, this topic looks like something good to explore in the future.

msageryd commented 5 years ago

Any progress on this? I'm seeing a big increase in the PDF size for every merged report using pdf-utils.

Example: A 30-page PDF with some small images is about 2 MB in size. When I merge in a 30-page header report containing only the text "test", the PDF grows to about 3 MB.
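To put those figures in perspective, here is a rough back-of-the-envelope sketch of the per-page overhead. The sizes are the approximate ones quoted above; the function name `overheadPerPageKB` is only illustrative.

```javascript
// Approximate figures from the example above ("about 2 MB", "about 3 MB").
const baseSizeMB = 2;   // 30-page report before the header is merged
const mergedSizeMB = 3; // same report after merging the header
const pages = 30;

// Average growth per merged page in KB, using 1 MB = 1024 KB.
function overheadPerPageKB(beforeMB, afterMB, pageCount) {
  return ((afterMB - beforeMB) * 1024) / pageCount;
}

const perPage = overheadPerPageKB(baseSizeMB, mergedSizeMB, pages);
console.log(perPage.toFixed(1) + ' KB added per merged page');
```

That works out to roughly 34 KB per page for a header that contains only a single word, which is far more than the text itself would need.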

I think I saw a post from Jan about all the fonts being included once for every merged report. I can't find this post. Maybe removing all unused fonts would be a big saving?
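One way to check the font theory might be to count font dictionaries in the PDF before and after the merge. The sketch below is only a heuristic, not part of jsreport or pdf-utils: it scans the raw bytes for `/Type /Font` entries and assumes the PDF does not hide its objects inside compressed object streams. The function name `countFontObjects` is hypothetical.

```javascript
// Heuristic: count font dictionaries by scanning the raw PDF bytes.
// Assumption: objects are stored uncompressed (no object streams),
// otherwise the count will be too low.
function countFontObjects(pdfBuffer) {
  // latin1 maps every byte to exactly one character, so nothing is lost.
  const text = pdfBuffer.toString('latin1');
  // Match "/Type /Font" or "/Type/Font", but not "/Type /FontDescriptor".
  const matches = text.match(/\/Type\s*\/Font(?![A-Za-z])/g);
  return matches ? matches.length : 0;
}

// Usage (file names are placeholders):
//   const fs = require('fs');
//   console.log(countFontObjects(fs.readFileSync('before-merge.pdf')));
//   console.log(countFontObjects(fs.readFileSync('after-merge.pdf')));
```

If the count grows with every merged page, duplicated font objects are a likely contributor to the size increase.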

Here is the header that I'm adding. header-footer.css contains only four CSS classes. If I remove the string "test" from the code below, the report size goes down to 2 MB again.

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <style>
            {#asset header-footer.css @encoding=utf8}
        </style>
    </head>
    <body>
        {{#each $pdf.pages}}
          {{#if @index}}
            <div style="page-break-before: always;"></div>
          {{/if}}
          test
        {{/each}}
    </body>
</html>

pofider commented 5 years ago

I believe we solved the biggest problem which was pdf object duplication here.

This issue is also a duplicate, so I'm closing it. Please refer to https://github.com/jsreport/jsreport/issues/613