addImage adds garbled data to PDF

wesionaire commented 3 years ago

We try to add the following image to a pdf file:

We need to use a base64 encoding because of CORS issues.

When I add it using addImage, the resulting PDF can't be opened in Adobe ("Insufficient data for an image"), the image looks garbled in Chrome, and in PDF.js the image does not show up at all.

Minimal example:

<!DOCTYPE html>
<html style="height: 100%;">
<head>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/jspdf/2.3.1/jspdf.umd.min.js" integrity="sha512-VKjwFVu/mmKGk0Z0BxgDzmn10e590qk3ou/jkmRugAkSTMSIRkd4nEnk+n7r5WBbJquusQEQjBidrBD3IQQISQ==" crossorigin="anonymous"></script>
</head>
<body style="height: 100%">
    <iframe style="width: 100%; height: 100%" id="preview"></iframe>
    <script type="text/javascript">
        const imageBase64 = "data:image/png;base64,...."; // truncated too long for github, find the image here: https://pastebin.com/RA4TFqsz

        const doc = new jspdf.jsPDF();
        doc.addImage(imageBase64, "JPEG", 15, 40, 37, 40.9);

        var iframeElementContainer = document.getElementById('preview');
        iframeElementContainer.src=doc.output("bloburl");
    </script>
</body>
</html>

https://jsfiddle.net/7s3kr6ae/

Of course I could just open the image with Gimp and save it again (it works then), but I would rather have this issue fixed in this library.

HackbrettXXX commented 3 years ago

Does it work if you pass the data as Uint8Array instead?

wesionaire commented 3 years ago

Hi @HackbrettXXX ,

thanks for your reply. No, it still does not work. I converted the base64 string to a Uint8Array and the result is the same:

https://jsfiddle.net/4do73gvm/
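
For readers following along, here is a minimal sketch of that kind of conversion. The helper names `dataUrlToUint8Array` and `looksLikePng` are hypothetical and are not part of jsPDF; the sketch only assumes a global `atob`, which browsers and recent Node versions provide. The signature check is included because the format string passed to addImage should match the actual bytes.

```javascript
// Hypothetical helper: decode a base64 data URL into raw bytes.
function dataUrlToUint8Array(dataUrl) {
  // Everything after the first comma is the base64 payload.
  const base64 = dataUrl.slice(dataUrl.indexOf(",") + 1);
  const binary = atob(base64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Every PNG file starts with this fixed eight-byte signature.
const PNG_SIGNATURE = [137, 80, 78, 71, 13, 10, 26, 10];

// Hypothetical helper: true if the bytes begin with the PNG signature.
function looksLikePng(bytes) {
  return PNG_SIGNATURE.every((value, i) => bytes[i] === value);
}
```

The resulting array can then be passed to addImage in place of the data URL, with "PNG" as the format when `looksLikePng` returns true.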

Chrome (garbled image):

Firefox (no image at all):

HackbrettXXX commented 3 years ago

Alright, then this seems to be a bug. Maybe when en/de-coding JPEG. Could you try to dig into that?

wesionaire commented 3 years ago

I'm sorry, I wrote addImage(xxx, "JPEG"), but it's actually a PNG image. It was correct in my project, but I screwed up when creating this example code. I have switched to addImage(xxx, "PNG") now, but the result is the same (both with a Uint8Array and directly with the base64 string).

//...
        doc.addImage(imageBase64, "PNG", 15, 10, 37, 40.9);
        doc.addImage(_base64ToArrayBuffer(imageBase64), "PNG", 15, 80, 37, 40.9);
//...

https://jsfiddle.net/4do73gvm/2/

When I analyse the image with pngchunks I get the following:

$ pngchunks image.png 
Chunk: Data Length 13 (max 2147483647), Type 1380206665 [IHDR]
  Critical, public, PNG 1.2 compliant, unsafe to copy
  IHDR Width: 370
  IHDR Height: 409
  IHDR Bitdepth: 8
  IHDR Colortype: 2
  IHDR Compression: 0
  IHDR Filter: 0
  IHDR Interlace: 1
  IHDR Compression algorithm is Deflate
  IHDR Filter method is type zero (None, Sub, Up, Average, Paeth)
  IHDR Interlacing method is unknown
  Chunk CRC: 1824660873

I noticed this line: IHDR Interlacing method is unknown.
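
For context, the PNG specification defines interlace method 1 as Adam7, so the value itself is valid even though pngchunks labels it as unknown. A minimal sketch for checking this field in the browser follows; `readPngHeader` is a hypothetical helper, and it assumes the input is a valid PNG, where IHDR is always the first chunk and its fields sit at fixed offsets.

```javascript
// Hypothetical helper: read the IHDR fields from the start of a PNG.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" type,
// then width, height (big-endian uint32) and five single-byte fields.
function readPngHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return {
    width: view.getUint32(16), // DataView reads big-endian by default
    height: view.getUint32(20),
    bitDepth: bytes[24],
    colorType: bytes[25],
    compressionMethod: bytes[26],
    filterMethod: bytes[27],
    interlaceMethod: bytes[28], // 0 = none, 1 = Adam7
  };
}

// Leading bytes of a 370 x 409, 8-bit RGB, Adam7-interlaced PNG.
const sampleHeader = new Uint8Array([
  137, 80, 78, 71, 13, 10, 26, 10, // signature
  0, 0, 0, 13, 73, 72, 68, 82,     // chunk length 13, type "IHDR"
  0, 0, 1, 114, 0, 0, 1, 153,      // width 370, height 409
  8, 2, 0, 0, 1,                   // depth, color type, compression, filter, interlace
]);
console.log(readPngHeader(sampleHeader).interlaceMethod); // → 1
```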

When I open the generated PDF with PDF.js in Firefox, I get the following error in the console:

Warning: Unable to decode image: FormatError: Unsupported predictor: 98

Not sure if this helps... If I have time I might try to dig deeper :)

Uzlopak commented 3 years ago

The PNG parser we use also seems to parse it correctly.

PNG {data: Uint8Array(295274), pos: 295270, palette: Array(0), imgData: Uint8Array(294193), transparency: {…}, …}
animation: null
bits: 8
colorSpace: "DeviceRGB"
colorType: 2
colors: 3
compressionMethod: 0
data: Uint8Array(295274) [137, 80, 78, 71, 13, 10, 26, 10, 0, 0, 0, 13, 73, 72, 68, 82, 0, 0, 1, 114, 0, 0, 1, 153, 8, 2, 0, 0, 1, 108, 194, 29, 137, 0, 0, 0, 9, 112, 72, 89, 115, 0, 0, 11, 18, 0, 0, 11, 18, 1, 210, 221, 126, 252, 0, 0, 2, 13, 122, 84, 88, 116, 88, 77, 76, 58, 99, 111, 109, 46, 97, 100, 111, 98, 101, 46, 120, 109, 112, 0, 0, 40, 145, 125, 83, 205, 142, 155, 48, 16, 126, 21, 203, 123, 173, 177, 13, 152, 63, 45, …]
filterMethod: 0
hasAlphaChannel: false
height: 409
imgData: Uint8Array(294193) [120, 218, 236, 188, 117, 120, 27, 215, 246, 54, 234, 148, 78, 123, 10, 167, 61, 208, 246, 64, 219, 80, 195, 105, 131, 77, 27, 108, 152, 153, 57, 177, 157, 56, 118, 204, 204, 50, 219, 178, 45, 203, 146, 197, 204, 210, 144, 24, 13, 50, 201, 36, 219, 50, 115, 192, 73, 19, 99, 28, 78, 218, 30, 248, 125, 119, 143, 148, 228, 212, 233, 185, 249, 238, 31, 247, 143, 251, 61, 247, 248, 217, 153, 108, 205, 140, 102, 222, 121, 247, 90, 107, 214, 210, 90, 123, 123, 253, 175, 255, …]
interlaceMethod: 1
palette: []
pixelBitlength: 24
pos: 295270
text: {}
transparency: {}
width: 370

Uzlopak commented 3 years ago

It seems that in png_support we miss the deinterlacing when the PNG has colorType === 2.

I made multiple attempts to fix it, but I am not sure what I should do.
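
To illustrate what deinterlacing has to account for, here is a sketch of the Adam7 layout from the PNG specification. It is not jsPDF's code; `adam7PassSizes` and `interlacedDataLength` are hypothetical helpers, and the byte count assumes a whole number of bytes per pixel (bit depth 8 or 16). An interlaced image is stored as seven reduced images, each with its own scanlines and its own leading filter-type byte per scanline. If that data is treated as plain full-width rows, the filter bytes are read at the wrong offsets, which is one possible reading of the "Unsupported predictor: 98" warning.

```javascript
// Adam7 pass layout from the PNG specification:
// [xStart, yStart, xStep, yStep] for passes 1 to 7.
const ADAM7_PASSES = [
  [0, 0, 8, 8],
  [4, 0, 8, 8],
  [0, 4, 4, 8],
  [2, 0, 4, 4],
  [0, 2, 2, 4],
  [1, 0, 2, 2],
  [0, 1, 1, 2],
];

// Hypothetical helper: dimensions of each reduced image.
function adam7PassSizes(width, height) {
  return ADAM7_PASSES.map(([xStart, yStart, xStep, yStep]) => ({
    width: Math.max(0, Math.ceil((width - xStart) / xStep)),
    height: Math.max(0, Math.ceil((height - yStart) / yStep)),
  }));
}

// Hypothetical helper: length of the inflated (still filtered) image data.
// Each scanline of a non-empty pass carries one extra filter-type byte.
function interlacedDataLength(width, height, bytesPerPixel) {
  return adam7PassSizes(width, height).reduce(
    (total, pass) =>
      pass.width === 0
        ? total
        : total + pass.height * (1 + pass.width * bytesPerPixel),
    0
  );
}
```

For an 8 x 8 RGB image this gives 207 bytes across 15 scanlines, compared with 200 bytes across 8 scanlines without interlacing, so the two layouts cannot be decoded interchangeably.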

Pantura commented 3 years ago

The core problem isn't a missing deinterlace step for type 2: deinterlacing does run, but it fails (for everything?). There are no tests for any interlaced images, and I couldn't find the original source of the PNG lib anymore. The author, Devon Govett, still has a GitHub profile, but the lib no longer exists there (copyright from 2011).

I tried changing the PNG library to https://github.com/photopea/UPNG.js and the results look good. There would still be some work to do; this was more of a proof of concept. My current running version has smask, filter and compression stripped out.

HackbrettXXX commented 3 years ago

I think replacing the integrated PNG lib with a 3rd party lib is a good idea. UPNG looks good. pdf-lib uses it as well, and they have a fork on npm.

Uzlopak commented 3 years ago

We could replace the PNG handling with the approach we use for WebP or GIF, but this would make the PDF much bigger. We have to make sure that we still have the PNG-specific transformation.

Pantura commented 3 years ago

We would need to study further whether another library could handle both interlacing and compression, as there would be quite an increase in file size, as Uzlopak mentions. I couldn't dig deeper yet.

rpep commented 3 years ago

Also ran into the same problem

parallax / jsPDF

addImage adds garbled data to PDF #3126