mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

Unable to decode image: InvalidStateError #17190

Open srshupe opened 8 months ago

srshupe commented 8 months ago

Attach (recommended) or Link to PDF file here: org_AVA89V01U0$ (Black).pdf

Configuration:

Steps to reproduce the problem:

  1. Open the file in Acrobat. Main area is blacked out but you can see the edges of a CAD drawing along the borders that includes various characters.
  2. Open the file in PDF.js. Only the black rectangle in the center and the blue box in the upper right corner are rendered.

What is the expected behavior? (add screenshot) This is the upper left corner at 200% zoom in Acrobat: image

What went wrong? (add screenshot) The console on Chrome:

    PDF a8c52134c310cca513061a5a2ce6ca91 [1.3 - / -] (PDF.js: 4.0.132 [34781121c])
    util.js:367 Warning: Unable to decode image "img_p0_1": "InvalidStateError: The source image could not be decoded.".
    util.js:367 Warning: Dependent image isn't ready yet

The console on Firefox:

    PDF a8c52134c310cca513061a5a2ce6ca91 [1.3 - / -] (PDF.js: 4.0.132 [34781121c]) app.js:1561:12
    Warning: Unable to decode image "img_p0_1": "InvalidStateError: An attempt was made to use an object that is not, or is no longer, usable". pdf.worker.mjs:339:13
    Warning: Dependent image isn't ready yet 3 util.js:367:12

The console on Edge:

    PDF a8c52134c310cca513061a5a2ce6ca91 [1.3 - / -] (PDF.js: 4.0.132 [34781121c])
    util.js:367 Warning: Unable to decode image "img_p0_1": "InvalidStateError: The source image could not be decoded.".
    2 util.js:367 Warning: Dependent image isn't ready yet
    util.js:367 Warning: Dependent image isn't ready yet

This is the upper left corner at 200% zoom in Chrome: image

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension): https://mozilla.github.io/pdf.js/web/viewer.html

Snuffleupagus commented 8 months ago

Most likely the huge image dimensions are the problem, given the following excerpt (taken from http://brendandahl.github.io/pdf.js.utils/browser/):

Resources (dict)
    ProcSet (array) [id: 5, gen: 0]
    XObject (dict)
        Im0 (stream) [id: 6, gen: 0]
            BitsPerComponent = 1
            ColorSpace = /DeviceGray [id: 7, gen: 0]
            DecodeParms (dict)
            Filter = /CCITTFaxDecode
            Height = 19866
            Length = 1110922
            Name = /Im0
            Subtype = /Image
            Type = /XObject
            Width = 28087
        wspe_X1 (stream) [id: 11, gen: 0]
            BitsPerComponent = 1
            ColorSpace = /DeviceGray
            Decode (array)
            Filter = /JBIG2Decode
            Height = 19866
            Length = 76004
            Subtype = /Image
            Type = /XObject
            Width = 28087
        wspe_X2 (stream) [id: 10, gen: 0]
        wspe_X3 (stream) [id: 13, gen: 0]
        wspe_X4 (stream) [id: 17, gen: 0]
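
As a rough check of the numbers in this excerpt, the sketch below compares the size of the 28087 x 19866 image as a packed 1-bit bitmap with its size once expanded to RGBA. The helper names `packedBytes` and `rgbaBytes` are hypothetical and not part of PDF.js, and the figure of 4 bytes per pixel is an assumption about how the decoded image is held for drawing.

```javascript
// Rough memory estimate for the image streams listed in the excerpt.
// Width, Height and BitsPerComponent are copied from the excerpt; the
// RGBA figure assumes 4 bytes per pixel. Helper names are hypothetical.
function packedBytes(width, height, bitsPerPixel) {
  // Each row is padded to a whole number of bytes.
  const bytesPerRow = Math.ceil((width * bitsPerPixel) / 8);
  return bytesPerRow * height;
}

function rgbaBytes(width, height) {
  return width * height * 4;
}

const width = 28087;
const height = 19866;

const packed = packedBytes(width, height, 1);
const rgba = rgbaBytes(width, height);

console.log(`1-bit packed: ${packed} bytes`); // 69749526
console.log(`RGBA:         ${rgba} bytes`); // 2231905368
console.log(`ratio:        ${(rgba / packed).toFixed(1)}x`);
```

Under these assumptions the compressed stream (about 1.1 MB for Im0) and even the packed bitmap (about 70 MB) are modest, while the RGBA form is roughly 32 times larger than the packed bitmap.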

srshupe commented 7 months ago

So what (roughly) is the limiting factor that keeps the large image from being downloaded and processed? It looks like the promise tasked with fetching the image just quits at some point.

poerlang commented 6 months ago

pdfjs: v4.0.269 or Latest

Microsoft Edge: 120.0.2210.61

abc.pdf

Same problem. When execution reaches await bitmapPromise and createImageBitmap, these error messages are shown:

  1. Unable to decode image
  2. The source image could not be decoded

usagibear commented 4 months ago

I also get the same error but only for PDFs that are very large (130+ x 50+ in) and may or may not contain multiple page sizes (each page has different dimensions). My PDFs are the default outputs of using the utility TIFF2PDF.

What should we do with files that are too large? Is there a setting we can configure in PDF.js? Thanks!

Specifically, I'm using PDF.js in https://github.com/wojtekmaj/react-pdf v7.7.0.

srshupe commented 4 months ago

Yes, this is our issue as well. Some further investigation has shown that both the image and the PDF are not very large in terms of memory, only in physical dimensions. Is it a matter of allocating memory for the bitmap within PDF.js?

calixteman commented 4 months ago

As mentioned in https://github.com/mozilla/pdf.js/issues/17190#issuecomment-1781872862, the image size is the issue. Once decoded, the image requires 19866 x 28087 x 4 (RGBA) bytes, which is greater than 2**31 - 1. In Firefox, for example, the maximum size of an image is set by gfx.max-alloc-size (see about:config or https://searchfox.org/mozilla-central/source/modules/libpref/init/StaticPrefList.yaml#6174). We could probably split the image into several smaller ones, resize them, and draw the smaller versions on the same resized canvas.
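
To make the arithmetic and the splitting idea concrete, here is a minimal sketch of the planning step only. It is not how PDF.js implements (or will implement) this: `planTiles` is a hypothetical helper, and the 256 MiB per-band budget is an assumption chosen purely for illustration. Resizing the bands and drawing them onto a canvas is left out.

```javascript
// Hypothetical helper (not a PDF.js API): split a width x height image into
// horizontal bands so that each band, decoded to RGBA, fits within maxBytes.
function planTiles(width, height, maxBytes) {
  const bytesPerRow = width * 4; // RGBA, 4 bytes per pixel
  const rowsPerTile = Math.floor(maxBytes / bytesPerRow);
  if (rowsPerTile < 1) {
    throw new RangeError("A single row exceeds the byte budget");
  }
  const tiles = [];
  for (let y = 0; y < height; y += rowsPerTile) {
    const h = Math.min(rowsPerTile, height - y);
    tiles.push({ x: 0, y, width, height: h, bytes: bytesPerRow * h });
  }
  return tiles;
}

const MAX_INT32 = 2 ** 31 - 1;
const width = 28087;
const height = 19866;

// Size of the whole image once decoded to RGBA.
const total = width * height * 4;
console.log(total, total > MAX_INT32); // 2231905368 true

// Assumed budget of 256 MiB per band, for illustration only.
const budget = 256 * 1024 * 1024;
const tiles = planTiles(width, height, budget);
console.log(`${tiles.length} bands, first band is ${tiles[0].height} rows tall`);
```

Horizontal bands are used here only because they keep the row layout simple; a real implementation might prefer a grid of tiles or downscale each band before drawing.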