mozilla / pdf.js

PDF Reader in JavaScript
https://mozilla.github.io/pdf.js/
Apache License 2.0

blank pages in pdf [corrupted pdf] #15423

Open 2387skju opened 2 years ago

2387skju commented 2 years ago

Attach (recommended) or Link to PDF file here: https://cdn-reichelt.de/documents/datenblatt/A300/EE24LC01_EE24LC02%23MIC.pdf

Configuration:

What went wrong? (add screenshot) Blank pages from the bottom of page 5 through page 10.

Compare with other pdf viewer: [screenshot: pdfjs_corrupted-pdf_compare_6c_blackl]

Log:

PDF 78dffb9cb7777aafee03c1b199802d3e [1.3 Acrobat Distiller 4.05 for Windows / FrameMaker 6.0] (PDF.js: 3.0.56) app.js:1519:12
Warning: getHexString - ignoring invalid character: 6 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 47 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 84 util.js:425:12
Warning: getHexString - ignoring invalid character: 84 util.js:425:12
Warning: getHexString - ignoring invalid character: 121 util.js:425:12
Warning: getHexString - ignoring invalid character: 121 util.js:425:12
Warning: getHexString - ignoring invalid character: 27 util.js:425:12
Warning: getHexString - ignoring invalid character: 27 util.js:425:12
Warning: getHexString - ignoring additional invalid characters. util.js:425:12
Warning: getHexString - ignoring additional invalid characters. util.js:425:12
Unable to get page for page view 
Object { message: "Bad (uncompressed) XRef entry: 32R", name: "UnknownErrorException", details: "XRefEntryException: Bad (uncompressed) XRef entry: 32R", stack: "BaseExceptionClosure@https://mozilla.github.io/pdf.js/build/pdf.js:540:29\n__webpack_modules__<@https://mozilla.github.io/pdf.js/build/pdf.js:543:2\n__w_pdfjs_require__@https://mozilla.github.io/pdf.js/build/pdf.js:18936:41\n@https://mozilla.github.io/pdf.js/build/pdf.js:19175:32\n@https://mozilla.github.io/pdf.js/build/pdf.js:19226:3\n@https://mozilla.github.io/pdf.js/build/pdf.js:19229:12\nwebpackUniversalModuleDefinition@https://mozilla.github.io/pdf.js/build/pdf.js:31:50\n@https://mozilla.github.io/pdf.js/build/pdf.js:32:3\n" }
pdf_viewer.js:1564:14
Unable to get page for page view 
Object { message: "Bad (uncompressed) XRef entry: 32R", name: "UnknownErrorException", details: "XRefEntryException: Bad (uncompressed) XRef entry: 32R", stack: "BaseExceptionClosure@https://mozilla.github.io/pdf.js/build/pdf.js:540:29\n__webpack_modules__<@https://mozilla.github.io/pdf.js/build/pdf.js:543:2\n__w_pdfjs_require__@https://mozilla.github.io/pdf.js/build/pdf.js:18936:41\n@https://mozilla.github.io/pdf.js/build/pdf.js:19175:32\n@https://mozilla.github.io/pdf.js/build/pdf.js:19226:3\n@https://mozilla.github.io/pdf.js/build/pdf.js:19229:12\nwebpackUniversalModuleDefinition@https://mozilla.github.io/pdf.js/build/pdf.js:31:50\n@https://mozilla.github.io/pdf.js/build/pdf.js:32:3\n" }
pdf_viewer.js:1564:14
Unable to get page for page view 
Object { message: "Bad (uncompressed) XRef entry: 32R", name: "UnknownErrorException", details: "XRefEntryException: Bad (uncompressed) XRef entry: 32R", stack: "BaseExceptionClosure@https://mozilla.github.io/pdf.js/build/pdf.js:540:29\n__webpack_modules__<@https://mozilla.github.io/pdf.js/build/pdf.js:543:2\n__w_pdfjs_require__@https://mozilla.github.io/pdf.js/build/pdf.js:18936:41\n@https://mozilla.github.io/pdf.js/build/pdf.js:19175:32\n@https://mozilla.github.io/pdf.js/build/pdf.js:19226:3\n@https://mozilla.github.io/pdf.js/build/pdf.js:19229:12\nwebpackUniversalModuleDefinition@https://mozilla.github.io/pdf.js/build/pdf.js:31:50\n@https://mozilla.github.io/pdf.js/build/pdf.js:32:3\n" }
pdf_viewer.js:1564:14
renderView: "Error: pdfPage is not loaded" pdf_rendering_queue.js:205:20
Warning: getHexString - ignoring invalid character: 6 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 47 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 84 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 121 2 util.js:425:12
Warning: getHexString - ignoring invalid character: 27 2 util.js:425:12
Warning: getHexString - ignoring additional invalid characters. 2 util.js:425:12
Unable to get page for page view 
Object { message: "Bad (uncompressed) XRef entry: 32R", name: "UnknownErrorException", details: "XRefEntryException: Bad (uncompressed) XRef entry: 32R", stack: "BaseExceptionClosure@https://mozilla.github.io/pdf.js/build/pdf.js:540:29\n__webpack_modules__<@https://mozilla.github.io/pdf.js/build/pdf.js:543:2\n__w_pdfjs_require__@https://mozilla.github.io/pdf.js/build/pdf.js:18936:41\n@https://mozilla.github.io/pdf.js/build/pdf.js:19175:32\n@https://mozilla.github.io/pdf.js/build/pdf.js:19226:3\n@https://mozilla.github.io/pdf.js/build/pdf.js:19229:12\nwebpackUniversalModuleDefinition@https://mozilla.github.io/pdf.js/build/pdf.js:31:50\n@https://mozilla.github.io/pdf.js/build/pdf.js:32:3\n" }
pdf_viewer.js:1564:14
renderView: "Error: pdfPage is not loaded" pdf_rendering_queue.js:205:20
Unable to get page 9 to initialize viewer 
Object { message: "Bad (uncompressed) XRef entry: 32R", name: "UnknownErrorException", details: "XRefEntryException: Bad (uncompressed) XRef entry: 32R", stack: "BaseExceptionClosure@https://mozilla.github.io/pdf.js/build/pdf.js:540:29\n__webpack_modules__<@https://mozilla.github.io/pdf.js/build/pdf.js:543:2\n__w_pdfjs_require__@https://mozilla.github.io/pdf.js/build/pdf.js:18936:41\n@https://mozilla.github.io/pdf.js/build/pdf.js:19175:32\n@https://mozilla.github.io/pdf.js/build/pdf.js:19226:3\n@https://mozilla.github.io/pdf.js/build/pdf.js:19229:12\nwebpackUniversalModuleDefinition@https://mozilla.github.io/pdf.js/build/pdf.js:31:50\n@https://mozilla.github.io/pdf.js/build/pdf.js:32:3\n" }...
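
The `getHexString` warnings in the log above report values such as 6, 47, 84, 121 and 27, which look like character codes found where hex digits were expected. The following is a minimal sketch of that kind of lenient decoding, not the PDF.js implementation; the function name `decodeHexStringLenient` and its return shape are assumptions made for illustration. It follows the PDF rule that whitespace inside a hex string is ignored and that an odd trailing digit is padded with `0`.

```javascript
// Hypothetical helper (not PDF.js code): decodes the body of a PDF hex string
// such as the text between "<" and ">", skipping anything that is not a hex digit.
function decodeHexStringLenient(body) {
  const digits = [];
  const ignored = []; // character codes of the invalid characters that were skipped
  for (const ch of body) {
    if (/[0-9A-Fa-f]/.test(ch)) {
      digits.push(ch);
    } else if (!/\s/.test(ch)) {
      // Simplification: JavaScript's \s is used as the whitespace test here.
      ignored.push(ch.charCodeAt(0));
    }
  }
  // An odd number of digits is treated as if a final "0" were present.
  if (digits.length % 2 === 1) {
    digits.push("0");
  }
  const bytes = [];
  for (let i = 0; i < digits.length; i += 2) {
    bytes.push(parseInt(digits[i] + digits[i + 1], 16));
  }
  return { bytes, ignored };
}

// Synthetic input, not taken from the attached PDF.
const result = decodeHexStringLenient("48 65zT6C");
console.log(result.bytes);   // bytes 72, 101, 108
console.log(result.ignored); // codes 122 ("z") and 84 ("T")
```

A decoder like this keeps going on damaged input, which is why the log shows warnings for the hex strings but hard errors only for the broken XRef entry.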

Snuffleupagus commented 2 years ago

It's equally broken in Adobe Reader, i.e. the PDF reference implementation; hence the document itself is clearly at fault here, not the PDF.js library.

calixteman commented 2 years ago

Here's the qpdf output:

checking MIC.pdf
PDF Version: 1.3
File is not encrypted
File is not linearized
WARNING: MIC.pdf: file is damaged
WARNING: MIC.pdf (object 32 0, offset 71229): expected n n obj
WARNING: MIC.pdf: Attempting to reconstruct cross-reference table
WARNING: MIC.pdf: object 32 0 not found in file after regenerating cross reference table
WARNING: object 32 0: operation for dictionary attempted on object of type null: returning false for a key containment request
WARNING: object 32 0: operation for dictionary attempted on object of type null: returning null for attempted key retrieval
WARNING: MIC.pdf (page tree node, offset 345094): /Type key should be /Page but is not; overriding
WARNING: object 32 0: operation for dictionary attempted on object of type null: ignoring key replacement request
WARNING: object 32 0: operation for dictionary attempted on object of type null: returning null for attempted key retrieval
WARNING: MIC.pdf (object 28 0, offset 70725): expected endstream
WARNING: MIC.pdf (object 28 0, offset 68153): attempting to recover stream length
WARNING: MIC.pdf (object 28 0, offset 68153): recovered stream length: 2621
WARNING: MIC.pdf (object 31 0, offset 70940): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 70959): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 70987): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71000): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71015): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71031): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71063): treating unexpected brace token as null
WARNING: MIC.pdf (object 31 0, offset 71064): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71072): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71097): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 71116): unknown token while reading object; treating as string
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake1
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake2
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake3
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake4
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake5
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake6
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake7
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake8
WARNING: MIC.pdf (object 31 0, offset 70933): expected dictionary key but found non-name object; inserting key /QPDFFake9
WARNING: MIC.pdf (object 31 0, offset 71133): expected endobj
WARNING: page object 23 0 stream 26 0 (content, offset 31506): unexpected )
WARNING: page object 23 0 stream 26 0 (content, offset 31557): unexpected )
WARNING: page object 23 0 stream 26 0 (content, offset 31560): unexpected )
WARNING: page object 23 0 stream 26 0 (content, offset 31590): treating unexpected array close token as null
WARNING: page object 23 0 stream 26 0 (content, offset 31637): unexpected )
WARNING: page object 23 0 stream 26 0 (content, offset 31642): EOF while reading token
WARNING: object 32 0: operation for dictionary attempted on object of type null: returning null for attempted key retrieval
WARNING: page object 48 0 stream 50 0 (content, offset 43982): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 44013): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 44041): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 44141): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 44421): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 45331): unexpected )
WARNING: page object 48 0 stream 50 0 (content, offset 46027): EOF while reading token
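
The `expected n n obj` warning for object 32 0 and the PDF.js error `Bad (uncompressed) XRef entry: 32R` both point at the same problem: the cross-reference entry for object 32 gives a byte offset where no `32 0 obj` header is found. A minimal sketch of that check follows, assuming the file is available as a `Uint8Array`; `startsObjectAt` is a hypothetical helper, not a qpdf or PDF.js function, and the sample buffer is synthetic.

```javascript
// Hypothetical helper (not qpdf or PDF.js code): returns true when the bytes at
// `offset` begin an indirect object header of the form "<num> <gen> obj".
function startsObjectAt(bytes, offset, num, gen) {
  // An object header is short, so a small window is enough.
  const window = bytes.subarray(offset, offset + 32);
  const text = String.fromCharCode(...window);
  const match = /^(\d+)\s+(\d+)\s+obj\b/.exec(text);
  return match !== null && Number(match[1]) === num && Number(match[2]) === gen;
}

// Tiny synthetic buffer standing in for a PDF file (not the attached file).
const sample = new TextEncoder().encode("%PDF-1.3\n1 0 obj\n<< >>\nendobj\n");
const goodOffset = 9; // "1 0 obj" begins right after the 9-byte header line

console.log(startsObjectAt(sample, goodOffset, 1, 0));     // true
console.log(startsObjectAt(sample, goodOffset + 3, 1, 0)); // false
```

For the attached file, the equivalent call would be `startsObjectAt(fileBytes, 71229, 32, 0)`, using the offset from the qpdf warning above. Since qpdf reports the object as not found even after rebuilding the cross-reference table, the object data itself appears to be missing or damaged, not merely mis-indexed.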

suhaibmujahid commented 2 years ago

I tested the file on multiple viewers:

[Three screenshots, taken 2022-09-12 at 1:50:27 PM, 1:59:48 PM, and 1:51:03 PM]

GitHubRulesOK commented 2 years ago

Unless there is a historical need for the 20-year-old data sheet, use the current ones:

https://ww1.microchip.com/downloads/aemDocuments/documents/MPD/ProductDocuments/DataSheets/24AA01-24LC01B-24FC01-1K-I2C-Serial-EEPROM-20001711N.pdf
https://ww1.microchip.com/downloads/aemDocuments/documents/MPD/ProductDocuments/DataSheets/24AA02-24LC02B-24FC02-2K-I2C-Serial-EEPROM-20001709N.pdf