boazsegev / combine_pdf

A Pure ruby library to merge PDF files, number pages and maybe more...
MIT License

Zlib::DataError Exception: incorrect header check #155

Open dskim opened 5 years ago

dskim commented 5 years ago

Hi,

We've just encountered this issue with a PDF that we're attaching in our PDF generation process.

I'm not sure of the cause, other than that Zlib doesn't seem to like the format of this content.

Here is the stack trace.

File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/filter.rb line 69 in inflate
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/filter.rb line 69 in inflate_object
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/parser.rb line 114 in block in parse
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/parser.rb line 110 in times
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/parser.rb line 110 in parse
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/pdf_public.rb line 98 in initialize
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/api.rb line 11 in new
File lib/ruby/gems/2.2.0/gems/combine_pdf-1.0.16/lib/combine_pdf/api.rb line 11 in load

And here is the inspected value of object at line 69 of filter.rb.

{:Filter=>[:FlateDecode], :First=>143, :Length=>1064, :N=>17, :Type=>:ObjStm, :raw_stream_content=>"\x87\xAE\xAB;\t4\xA8\xBCLbP\x17\xF1g9#\xAE[\xEBJ\x11$\x8F\xEF\xCC\tB\x13\xF4t\x8A\xE6\x89p V\xDF\x885\xEBB;\x13\x05\xC1\xEDf3qL\xB5H\xC5\x8C\xBC\xE0\x04\xBC\xF9\xC9p`O&\x88\t3|\xFD\xAC1\xDE\x16@\xEC\xA4\x1DYs\x90\xB1\x13\xB8\xC3\xA9\xA0\xB4\xB9\xE3\xCE\"\x81\xBA\xDC'\x0E\xFF99\x87I\xAD\xEB}\x95\xEF\x85%uk\xC1}l;F\x83\xBC\xF0<\f\xDA\xF8!y\xD1\xB0\xF4\x93\x03v\x1E\xE0\xAF\xFA%\x97e\x03\x93\xED\x16\xBE\x8A?\x01\x82\r\x91\x1DE(@\xFE\xE5\xBF9\xBFp\xA5\x1F\x159\x91_\x88\b\x19\x8E\x03\x97\xA2\x14c\xAC\xE9\xE2x\\\xE1\xA1\x85\x0F{#3\x83\x8C\x88O\xD3\"X9\xE6\xD3\xAC\x1F\x7FD\xBE\xE7Fm\x81\xC0*\xBF\xD8\xDD)\\T_\xE2\xE7v\xC8v\xE4Z\x9229\x14\fG\x06\xDF\xB8=\xCD\xDF\xFBars\x82Y\xEC\xE3O\e\xF8\xEE\x14\x8AU\xA9\xE9\xDCI\xE0\x9Cc\x02b\xC1\xA6 \xA7G\x05e\xF4\xD0a\x15\xCEE\x85\x84\xCD\xE0\xA4\xB4\x9A\xC3\xDB8 4\\5u\x02U0L\x04\xB8\xDB\xDDg\x90\x9D\x9A>\x17\xBA]\xA9\x04\xBB\x06\e\xE3\xCA\xDE$\xCF\xD6\xA1\xDE\xF7B?Gm\xE2]\xC8D)\xEB\x15h\xBA{\xD0\xC1\xB3\xD0F\xB4fn\xC4\xC3\x02p\xEE\xB6\xFAjD\xE3N\x96\x87\xD8\xB4\xFA]\xB2\xC4\xAAj\xE4\e\xF5\x1F\"<h%*\xD5\xCF eW\xBB\xCE\x12\xC5\x8FfpX\x98Eq\xC7V\x19\xBB\x1D\x80A\x87S+\x8E\xE0\xC1O\xDCK\xB3L}\x89\xD5\x91d\x9E\xFCM\xB8\x17!\xA9\xC3 \xBC\r\xA5\x84\x9C\xAC\xC73\xD7c_\x96\xA9 \x7F\xBA\x97\xB2\x1A\xA5b4\x00RK}Q\xD4\x0F\xFD\xB7\xEAL\xF8/\x84gZE\x90\xAD\x8ETWU\xE6\x81\xC5\xACJ\x02\xE7\x86Z\xF6\xDF\xF7#[0\xB9G\xEB?\xEC\xBF\xA2\x91\x03\xBE_\xCFO\xFA\xCC8\xA3\xF9\x9F\x8B\x06\xB0\xEE\xA8\x00\e\xDBe\x11\xF7\xB5f\xAE\xD2\xBE\xE6\"[\xC3\x8F\xB1\x00ks\xCE\x8D\xE3 
\xC2\xDD\xAFh\xB3\xCB\xBDz\xE6@\xC1\\\xCA\x7F\xEB7O\xB3x\x98i\xB1\x85\xC5+G#\e\xAD\x03\xD8\r\xE7\xCBv\x80\xA7\xF9\xECj\x17>\xF7\x99\xA9\x84\xA8\x05&g\xD70l\a\xD4~\x91%\xF5X;\xE3\xA2\xF9C\xE9\x19\xFEH\x83\x83\xB7\xD1\b\x86\x05\xCB\x0E\x04\xDB\x98!\x84\xFE\x7F\xF6&A\x00A\x1E\xD7\xCD\x1E\x0F7O\xCE\x91\xFE~\xFF\xE8\x82]\x18\x82Y\xEB'b\xAC\x87\xB3)\x1DV\x93\xBC\x8B\xDCx\xFA\xFF(\xDAI.&\xA3\xF8\x10\x9B\xB4\xCB\xD1a<\xB28\xA0O\xF9\x12D\xE49\xEF\x05\x13s\xE3\x06N\xB0\xD8\xDD\x19\x12\x99/\xCD\x9F%~3\xB2K\x8CY\x1D_Y\x89\xC2e\xBCZ\x7F\x8AG\xE5F\x0F\xB0\xD0\xFF\xC8\x98\t\xCB\x94\x85\xE7\x8Ck\x83\xF3J\xE7\x95o\xE0l\xF4P\v\x8Ee2\t\xE1'\x83\xB5BJ\x01]x\xEE5\xB7F\xDA\xA1\x80\xE7=\xD0\xCD\xBB\xF8\xCA\xF8\x893\xD6.\xBC\xD1\x14\x1FW\x9E\n4\x93O\xC6\"\x10\x13\x8D<p\xF0\xE4\xCB\xED\xFB\xF5 S\xE5\x91\xA6\x87^\e\x15F\xCAL\xABh\xF80\x94\r\xF9\xBEA\xBC\xDF\x02\x06\xFF\xEE\xD7\xB9\xAD\\\x82f\xF8\xE6},A\xBC\xDEw\x95z\xC0\xEB\b\xB4\xEB\xC5\x1E9\xE0e\v\x91{\x9A)\xBE\x9D+\x16Ld\xB3\x80\b\xD7\xE0\xA8\xE0N\a\x15{\xCB6$z\xC0\xED/\x96?\x10\x14:\xE2t\xCFL!b3\xBA\xD2\xA3@m7\xDD\xDD4\x91\xB82\x04\x8C\xB3\x1F@\xBDf.$<R\xE9\xF9\xA4s}\xE0\x19\xA8]\xE4\v\xDB\xA0\x1D\xE8s/\x10\xFC\xA8<(\xA5\x8D\xB6\xD9q\x98k\n\x1D\x92O6\x1E\x82\x90f\x81\xEEx}\\\xEC\e>|\xFA\xE9\x03\xDC:~\xAF\xB1\x9B\xF2\xE2\x89\xE9\xD6\xC4+\e+\xD8J\xBD\x90\xAE\xD7Ei\f2\xBFW@9\x0E2`\xD3\xF1\xDC\x15H\xD8\x9A\xBF\t\xD3~|7\xAA7y2\x96\xF3\xA1;$\xF1\xCDP\x14\xE1\x00\xE9\xC6\xE6\xA0\xDE\xE4\xFF\x8D\x9C\xD5}\x0E_\xB7\xF9\r\xA8cyB\x85\xFB\xE2\x99\x92\x8A\x8E\xBEI\x00V\xD2\x98\x820\x8B\xBA\x01\xD9\xF4\x8Ax\xB5\xD4\xA7\xEC';\xC8#\xB7\xA3\x81", :indirect_generation_number=>0, :indirect_reference_id=>109, :DecodeParms=>[nil]}
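
For readers wondering what "incorrect header check" refers to: a zlib stream starts with a two-byte header (RFC 1950) whose low nibble must be 8 (deflate) and whose two bytes, read as a 16-bit big-endian number, must be divisible by 31. The stream above starts with \x87\xAE, which fails both tests, so Zlib rejects it before decompressing anything. The sketch below only illustrates that check; zlib_header_ok? is a hypothetical helper, not part of combine_pdf, and it does not explain why this PDF contains such a stream.

```ruby
require "zlib"

# Hypothetical helper (not part of combine_pdf): tests the two-byte zlib
# header described in RFC 1950.
def zlib_header_ok?(data)
  return false if data.nil? || data.bytesize < 2

  cmf, flg = data.byteslice(0, 2).bytes
  deflate_method = (cmf & 0x0F) == 8           # compression method must be 8
  checksum_ok    = (((cmf << 8) | flg) % 31).zero? # FCHECK: multiple of 31
  deflate_method && checksum_ok
end

# First four bytes of :raw_stream_content from the report above.
reported = "\x87\xAE\xAB;".b
# A stream that really was produced by zlib, for comparison.
valid = Zlib::Deflate.deflate("hello")

reported_ok = zlib_header_ok?(reported)
valid_ok    = zlib_header_ok?(valid)
puts "reported stream header ok? #{reported_ok}"
puts "freshly deflated header ok? #{valid_ok}"

# Inflating the reported bytes raises the same exception class as in the trace.
inflate_error = nil
begin
  Zlib::Inflate.inflate(reported)
rescue Zlib::DataError => e
  inflate_error = e
  puts "Zlib::DataError: #{e.message}"
end
```

A check like this could be used to log which object in a file is affected before handing it to the inflater, but it is a diagnostic aid only and not a fix.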

Please let me know if you need any more info from me. Thanks a lot for your help.

boazsegev commented 5 years ago

Hi @dskim ,

Could you send me a PDF, so I could reproduce the issue and test solutions?

Thanks, Bo.

P.S.

Workaround: for now, it should be possible to work around PDF issues by opening the PDF and re-saving it or printing it to a new PDF. Usually, OS services (like macOS Preview and PDF printers on Windows) produce friendlier PDF files than most PDF authoring applications (like Adobe).

dskim commented 5 years ago

Hi @boazsegev,

Sorry for the late reply. It's got some possibly sensitive information so I can't share it for now.

We've avoided using that PDF to get around the problem for now, but we'll try your workaround next time. Please feel free to close this if you can't do much without the original PDF. I'll try to send you a copy if we encounter another one down the track.

Thanks a lot for your help.

Regards, David

Defoncesko commented 3 years ago

Hey there,

Thanks for your awesome gem.

I'm having the same error message. Here is the file: https://subclic.s3-eu-west-1.amazonaws.com/github/test-zlib.pdf

abbaing commented 2 years ago

Hello, any update on this? I'm having the same error.

jfr commented 1 year ago

Don't know if this helps in your case, but here's our workaround: we rescue the error in the combine process, generate an error page with Prawn, and add it to the combined result so the user can figure it out or manually download the malformed PDF.

rescue Zlib::DataError
  error_page = Prawn::Document.new do
    text "ERROR: Document skipped (Zlib::DataError)"
    stroke_horizontal_rule
    pad(10) { text "The following document was skipped due to an error: #{File.basename(file.path)}", size: 10 }
  end
  pdf = CombinePDF.parse(error_page.render)
end
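
The same rescue-and-continue idea can be pulled into a small loop that records which inputs failed instead of aborting the whole merge. This is only a sketch of the pattern: load_each_skipping_zlib_errors is a hypothetical helper, and the loader is passed in as a block so the example runs with the standard library alone. With combine_pdf, the block would presumably be something like { |path| CombinePDF.load(path) }, the call that appears at the bottom of the stack trace in this issue.

```ruby
require "zlib"

# Hypothetical helper: calls the given block for each path and collects the
# results. A path whose load raises Zlib::DataError is recorded and skipped
# rather than aborting the whole run.
def load_each_skipping_zlib_errors(paths)
  loaded  = []
  skipped = []
  paths.each do |path|
    begin
      loaded << yield(path)
    rescue Zlib::DataError => e
      skipped << { path: path, message: e.message }
    end
  end
  [loaded, skipped]
end

# Stand-in data so the sketch runs without any gems: one real zlib stream and
# one byte string that starts like the stream reported in this issue.
samples = {
  "good.bin" => Zlib::Deflate.deflate("ok"),
  "bad.bin"  => "\x87\xAE\xAB;".b
}

loaded, skipped = load_each_skipping_zlib_errors(samples.keys) do |name|
  Zlib::Inflate.inflate(samples[name])
end

puts "loaded:  #{loaded.inspect}"
skipped.each { |s| puts "skipped: #{s[:path]} (#{s[:message]})" }
```

The skipped list gives the caller enough to build an error page per failed file, as in the workaround above, or to report the file names some other way.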