Hopding / pdf-lib

Create and modify PDF documents in any JavaScript environment
https://pdf-lib.js.org
MIT License

Some pages (PageLeaf) missing once the pdf is loaded #1520

Open jfvanin opened 11 months ago

jfvanin commented 11 months ago

What were you trying to do?

Trying to copy pages from a file with 8 pages, given a list of specific indices.

How did you attempt to do it?

Using the .copyPages function, passing the last 7 positions of the file, [1,2,3,4,5,6,7], as the indices parameter.

What actually happened?

The copyPages function failed because the last index in the indices parameter could not be found in the loaded document, even though the page exists in the file. The first page of the document was somehow not loaded, so the second page ended up at position 0 of the pages array.
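
The failure described above can be caught before calling copyPages by comparing the requested indices with the number of pages that were actually loaded. The following is a minimal sketch, not part of pdf-lib: `partitionPageIndices` is a hypothetical helper, and the page count of 7 is the `pageCount` value reported in this issue.

```javascript
// Hypothetical helper (not part of pdf-lib): split requested page indices
// into those that exist in a document with `pageCount` pages and those that do not.
function partitionPageIndices(pageCount, requested) {
  const valid = [];
  const missing = [];
  for (const index of requested) {
    const inRange = Number.isInteger(index) && index >= 0 && index < pageCount;
    (inRange ? valid : missing).push(index);
  }
  return { valid, missing };
}

// The case from this report: 8 pages expected, only 7 loaded.
const { valid, missing } = partitionPageIndices(7, [1, 2, 3, 4, 5, 6, 7]);
console.log(valid);   // [ 1, 2, 3, 4, 5, 6 ]
console.log(missing); // [ 7 ]
```

With pdf-lib, the first argument would come from `getPageCount()` on the loaded source document. A non-empty `missing` list shows that fewer pages were loaded than expected, before `copyPages` is ever called.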

What did you expect to happen?

I expected PDFDocument.load to load all 8 pages of the file, but it loads only the last 7. The PDFPageLeaf for the first page is missing.

How can we reproduce the issue?

Unfortunately I can't share the PDF due to privacy rules, and if I modify the file to anonymise it, the problem goes away. Judging by its structure, the PDF appears to have been assembled from 2 distinct files: the 1st page from one document and the remaining pages from another.
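
One possible but unconfirmed explanation for a merged file losing a page is an inconsistent page tree. In a PDF, each intermediate page tree node declares a /Count of the leaf pages beneath it and lists its children in /Kids, and the two can disagree. The sketch below illustrates that check on a simplified, hypothetical model made of plain objects. The `type`, `count`, and `kids` fields are assumptions for illustration and are not pdf-lib objects, and the example tree is made up rather than taken from the file in question.

```javascript
// Simplified, hypothetical model of a PDF page tree (not pdf-lib objects):
// a 'Pages' node has a declared `count` and a `kids` array; a 'Page' node is a leaf.
function countReachableLeaves(node) {
  if (node.type === 'Page') return 1;
  return node.kids.reduce((sum, kid) => sum + countReachableLeaves(kid), 0);
}

// Return every intermediate node whose declared count differs from the
// number of leaves actually reachable through its kids.
function findCountMismatches(node, path = 'root') {
  if (node.type === 'Page') return [];
  const mismatches = [];
  const reachable = countReachableLeaves(node);
  if (node.count !== reachable) {
    mismatches.push({ path, declared: node.count, reachable });
  }
  node.kids.forEach((kid, i) => {
    mismatches.push(...findCountMismatches(kid, `${path}.kids[${i}]`));
  });
  return mismatches;
}

// Made-up example: the root declares 8 pages but only 7 leaves are reachable.
const page = () => ({ type: 'Page' });
const tree = {
  type: 'Pages',
  count: 8,
  kids: [{ type: 'Pages', count: 7, kids: Array.from({ length: 7 }, page) }],
};
console.log(findCountMismatches(tree));
// [ { path: 'root', declared: 8, reachable: 7 } ]
```

A mismatch at the root, as in this made-up tree, would match the symptom of a viewer reporting 8 pages while only 7 leaves are found. Whether that is what happens in the affected file cannot be determined from this report.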

I'm sorry if that's not enough for a bug report, but I couldn't find this issue anywhere else on the internet, and I'm looking for help.

The loaded pages array, which should contain 8 pages:

[
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 6, generationNumber: 0, tag: '6 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 9, generationNumber: 0, tag: '9 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 12, generationNumber: 0, tag: '12 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 15, generationNumber: 0, tag: '15 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 18, generationNumber: 0, tag: '18 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 21, generationNumber: 0, tag: '21 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  },
  PDFPage {
    fontSize: 24,
    fontColor: { type: 'RGB', red: 0, green: 0, blue: 0 },
    lineHeight: 24,
    x: 0,
    y: 0,
    node: PDFPageLeaf {
      dict: [Map],
      context: [PDFContext],
      normalized: false,
      autoNormalizeCTM: true
    },
    ref: PDFRef { objectNumber: 24, generationNumber: 0, tag: '24 0 R' },
    doc: PDFDocument {
      defaultWordBreaks: [Array],
      computePages: [Function (anonymous)],
      getOrCreateForm: [Function (anonymous)],
      context: [PDFContext],
      catalog: [PDFCatalog],
      isEncrypted: false,
      pageCache: [Cache],
      pageMap: [Map],
      formCache: [Cache],
      fonts: [],
      images: [],
      embeddedPages: [],
      embeddedFiles: [],
      javaScripts: [],
      pageCount: 7
    }
  }
]
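
Since the entries in the dump above differ only in their `ref`, a compact listing of index and reference tag may be easier to compare against the file. This is a minimal sketch that relies only on the `ref.tag` property visible in the dump. `summarizePages` is a hypothetical helper, and the sample objects are stand-ins for the first two entries, not real PDFPage instances.

```javascript
// Hypothetical helper: reduce page-like objects to "index: ref tag" strings.
// Relies only on the `ref.tag` property visible in the dump.
function summarizePages(pages) {
  return pages.map((page, index) => `${index}: ${page.ref.tag}`);
}

// Stand-in objects that mimic the first two entries of the dump.
const samplePages = [
  { ref: { objectNumber: 6, generationNumber: 0, tag: '6 0 R' } },
  { ref: { objectNumber: 9, generationNumber: 0, tag: '9 0 R' } },
];
console.log(summarizePages(samplePages)); // [ '0: 6 0 R', '1: 9 0 R' ]
```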

Version

1.17.1

What environment are you running pdf-lib in?

Node

Additional Notes

No response