Append completes with empty pages on some existing pdf files

jsreport / jsreport-pdf-utils

jsreport extension providing pdf operations like merge or concatenation

MIT License

Append completes with empty pages on some existing pdf files #32

Open lichutin-st opened 3 years ago

lichutin-st commented 3 years ago

I use pdf-utils to merge ChromePdf templates with some existing pdf files using the StaticPdf engine.

On some pdf files (e.g. https://palicfilmfestival.com/uploads/documents/20200709/document_487851581.pdf), Merge-Append produces empty pages instead of the real content (only the StaticPdf template produces empty pages; ChromePdf works fine). The behavior is the same for both ChromePdf + StaticPdf and StaticPdf + StaticPdf merging.

I thought it was an issue with the StaticPdf engine, but rendering a single StaticPdf template without any pdf utils operations works fine, so I decided to describe the situation here.

Also, most existing pdf files don't cause this effect.

lichutin-st commented 3 years ago

@pofider sorry for bothering you, I'm not sure whether anybody receives notifications about issues in this repo.

pofider commented 3 years ago

Sorry for the delay. The pdf you linked is protected with a password and that is likely the problem.

Note that the pdf utils don't work with all external pdfs. The specification is too broad, and the extension doesn't handle all the glitches and features that can be found there. However, we appreciate it when you open an issue with a pdf that is not working, so we can eventually fix it.

lichutin-st commented 3 years ago

Thanks for the answer! Yeah, I understand that PDF is a very complex thing, but I was curious what the problem with that file is and what we can do to avoid such behavior.

Also, maybe we could have an option to throw an error when we can't extract content from the external file.

BTW, is there an ability to use JsReport to check whether a file is protected with a password?

pofider commented 3 years ago

> Also, maybe we could have an option to throw an error when we can't extract content from the external file.

Yes, sure. I can't tell what's happening there without debugging it.

> BTW, is there an ability to use JsReport to check whether a file is protected with a password?

You can see that the file is password-protected in Acrobat Reader.
I personally use this tool to analyze pdfs: https://github.com/itext/i7j-rups/releases

lichutin-st commented 3 years ago

> You can see that the file is password-protected in Acrobat Reader.

That can be a bit tricky when the pdf file comes from user input 😄

pofider commented 3 years ago

Ah I see, that is bad.

Primarily, jsreport doesn't support linearized pdfs in the pdf utils operations. I believe that accounts for most of the cases that go wrong, and it is also the case for the pdf you linked. It should be easy to signal this with code like this:

if (pdfBuffer.toString().includes("/Linearized")) {
  throw new Error('Not supported pdf')
}

Maybe you can apply such a check in your code where you accept the pdf from the user?
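For pdfs accepted from users, the check above could be wrapped in a small guard. This is only a minimal sketch, not jsreport code: `isLikelyLinearized` and `assertSupportedPdf` are hypothetical names, and the scan is a heuristic over the raw bytes rather than a real parse. It uses `Buffer.includes` so the whole file doesn't need to be converted to a string first.

```javascript
// Hypothetical helpers (not part of jsreport or pdf-utils).

// Heuristic: a linearized pdf carries a dictionary with a /Linearized key.
// Buffer.includes searches the raw bytes, so no string conversion is needed.
function isLikelyLinearized (pdfBuffer) {
  return pdfBuffer.includes('/Linearized')
}

// Throws before the buffer is handed to any pdf utils operation.
function assertSupportedPdf (pdfBuffer) {
  if (!Buffer.isBuffer(pdfBuffer)) {
    throw new TypeError('Expected the pdf content as a Buffer')
  }
  if (isLikelyLinearized(pdfBuffer)) {
    throw new Error('Not supported pdf: the file looks linearized')
  }
  return pdfBuffer
}

// Usage with a tiny in-memory sample instead of a real upload
const sample = Buffer.from('%PDF-1.4\n1 0 obj\n<< /Linearized 1 >>\nendobj\n', 'latin1')
try {
  assertSupportedPdf(sample)
} catch (e) {
  console.log(e.message)
}
```

A false positive is possible if the same byte sequence happens to appear elsewhere in the file, so the result is better treated as a hint than as a verdict.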

We want to re-implement the underlying pdf library to make it more robust and support the specs "fully". But it is a big task, which we plan for next year.

lichutin-st commented 3 years ago

> pdfBuffer.toString().includes("/Linearized")

Nice idea, that could help, thanks!
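The same kind of byte scan might also answer the earlier question about password protection when the file comes from user input. Below is a rough sketch, assuming that an encrypted pdf references its encryption dictionary through an `/Encrypt` key in the trailer. `isLikelyEncrypted` is a hypothetical helper, not a jsreport API, and a real pdf parser would be more reliable.

```javascript
// Hypothetical helper (not a jsreport API): guesses whether a pdf is encrypted.
// Assumption: encrypted files expose an /Encrypt key in plain bytes,
// so a raw byte search is enough for a first-pass check.
function isLikelyEncrypted (pdfBuffer) {
  if (!Buffer.isBuffer(pdfBuffer)) {
    throw new TypeError('Expected the pdf content as a Buffer')
  }
  return pdfBuffer.includes('/Encrypt')
}

// Usage with small in-memory samples instead of real uploads
const protectedSample = Buffer.from(
  '%PDF-1.4\ntrailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n%%EOF\n', 'latin1')
const openSample = Buffer.from(
  '%PDF-1.4\ntrailer\n<< /Root 1 0 R >>\n%%EOF\n', 'latin1')

console.log(isLikelyEncrypted(protectedSample))
console.log(isLikelyEncrypted(openSample))
```

Rejecting such files up front would at least turn silent empty pages into an explicit error on the caller's side.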

lichutin-st commented 2 years ago

Hello @pofider, is there any news about plans to support linearized pdf documents?