danfickle / openhtmltopdf

An HTML to PDF library for the JVM. Based on Flying Saucer and Apache PDF-BOX 2. With SVG image support. Now also with accessible PDF support (WCAG, Section 508, PDF/UA)!
https://danfickle.github.io/pdf-templates/index.html

IllegalArgumentException when rendering corrupted image #938

Open florinco opened 1 year ago

florinco commented 1 year ago

In our use case, the HTML used for rendering might contain corrupted base64 images. Even when this is the case, we still want to render the PDF, as the business case requires it. Right now the following exception is thrown and PDF rendering is interrupted:

Caused by: java.lang.IllegalArgumentException: Last unit does not have enough valid bits
    at java.base/java.util.Base64$Decoder.decode0(Base64.java:766)
    at java.base/java.util.Base64$Decoder.decode(Base64.java:538)
    at java.base/java.util.Base64$Decoder.decode(Base64.java:561)
    at com.openhtmltopdf.util.ImageUtil.fromBase64Encoded(ImageUtil.java:210)
    at com.openhtmltopdf.swing.NaiveUserAgent$DataUriFactory.getUrl(NaiveUserAgent.java:173)
    at com.openhtmltopdf.swing.NaiveUserAgent.openStream(NaiveUserAgent.java:237)
    at com.openhtmltopdf.pdfboxout.PdfBoxUserAgent.getImageResource(PdfBoxUserAgent.java:74)
    at com.openhtmltopdf.pdfboxout.PdfBoxReplacedElementFactory.createReplacedElement(PdfBoxReplacedElementFactory.java:90)
    at com.openhtmltopdf.render.BlockBox.createReplaced(BlockBox.java:781)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1046)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1028)
    at com.openhtmltopdf.layout.InlineBoxing.layoutInlineBlockContent(InlineBoxing.java:596)
    at com.openhtmltopdf.layout.InlineBoxing.startInlineBlock(InlineBoxing.java:279)
    at com.openhtmltopdf.layout.InlineBoxing.layoutContent(InlineBoxing.java:242)
    at com.openhtmltopdf.render.BlockBox.layoutInlineChildren(BlockBox.java:1308)
    at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1281)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1085)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1028)
    at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild0(BlockBoxing.java:388)
    at com.openhtmltopdf.layout.BlockBoxing.layoutBlockChild(BlockBoxing.java:366)
    at com.openhtmltopdf.layout.BlockBoxing.layoutContent(BlockBoxing.java:106)
    at com.openhtmltopdf.render.BlockBox.layoutChildren(BlockBox.java:1284)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1085)
    at com.openhtmltopdf.render.BlockBox.layout(BlockBox.java:1028)
    at com.openhtmltopdf.pdfboxout.PdfBoxRenderer.layout(PdfBoxRenderer.java:353)
    at com.openhtmltopdf.pdfboxout.PdfRendererBuilder.run(PdfRendererBuilder.java:45)
    at com.sobis.jaf.services.output.pdf.PDFGenerator.generate(PDFGenerator.java:203)
    ... 123 more

Is there a way to avoid this?

One solution I see is to replace the line

    InputStream is = openStream(uriResolved);

with a configurable version of the code below:

    InputStream is = null;
    try {
        is = openStream(uriResolved);
    } catch (Exception e) {
        // Log the failure and continue rendering without the image.
        System.err.println("(1) A corrupted base64 image was found. The image will be ignored.\n" + uriResolved);
    }
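To make the "configurable" part of this proposal concrete, here is a minimal standalone sketch. It is not code from openhtmltopdf: `TolerantOpener`, `open`, `decodeDataUri` and the `skipBroken` flag are hypothetical names, and `decodeDataUri` is only a stand-in for the library's stream opener, assuming the payload is decoded with `java.util.Base64.getDecoder()` as the stack trace suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Base64;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of openhtmltopdf.
final class TolerantOpener {
    private TolerantOpener() {}

    /**
     * Calls the given opener. When skipBroken is true, a runtime failure is
     * logged and reported as an empty Optional; otherwise it is rethrown.
     */
    static Optional<InputStream> open(Function<String, InputStream> opener,
                                      String uri, boolean skipBroken) {
        try {
            return Optional.ofNullable(opener.apply(uri));
        } catch (RuntimeException e) {
            if (!skipBroken) {
                throw e;
            }
            System.err.println("Skipping unreadable image: " + e.getMessage());
            return Optional.empty();
        }
    }

    // Stand-in for the real opener (assumption): decodes whatever follows
    // "base64," in a data URI. Assumes the marker is present in the URI.
    static InputStream decodeDataUri(String uri) {
        String marker = "base64,";
        String payload = uri.substring(uri.indexOf(marker) + marker.length());
        return new ByteArrayInputStream(Base64.getDecoder().decode(payload));
    }

    public static void main(String[] args) {
        // "QUJDR" ends in an incomplete base64 unit, so decoding fails.
        String broken = "data:image/png;base64,QUJDR";
        Optional<InputStream> result = open(TolerantOpener::decodeDataUri, broken, true);
        System.out.println("image available: " + result.isPresent());
    }
}
```

With `skipBroken` set to `false` the same call rethrows the exception, which would preserve the current behaviour for callers that prefer to fail fast.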

Sample of a corrupt image: <img border="0" id="_x0000_i1025" src="" alt="SomeCorruptImage">

Thanks for your feedback and work!

erickjhorman commented 9 months ago

Awesome! Try catching IllegalArgumentException instead of the parent class Exception.
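A rough sketch of what the narrower catch could look like, outside the library. `NarrowCatch` and `decodeOrEmpty` are hypothetical names, and the use of `Base64.getDecoder()` is an assumption based on the stack trace. Only malformed base64 is swallowed here; unrelated failures, such as a null input, still propagate.

```java
import java.util.Base64;
import java.util.Optional;

// Hypothetical helper for illustration; not part of openhtmltopdf.
final class NarrowCatch {
    private NarrowCatch() {}

    /** Returns the decoded bytes, or an empty Optional for malformed base64. */
    static Optional<byte[]> decodeOrEmpty(String base64) {
        try {
            return Optional.of(Base64.getDecoder().decode(base64));
        } catch (IllegalArgumentException e) {
            // Thrown by the decoder for invalid input, e.g. an incomplete last unit.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeOrEmpty("QUJD").isPresent());
        System.out.println(decodeOrEmpty("QUJDR").isPresent());
    }
}
```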

Kaushik-P-007 commented 4 weeks ago

What is uriResolved in this case?