galkahana / HummusJS

Node.js module for high performance creation, modification and parsing of PDF files and streams
http://www.pdfhummus.com

Rotate a single page, using streams (for reading and writing) #283

Open kilsen opened 6 years ago

kilsen commented 6 years ago

Hello. I'm using hummus to rotate a single page in an existing PDF. The PDF comes from an in-memory object, so I'm using streams for both reading and writing, along with the "CopyingContext" approach. Something seems to be amiss, though: the source document is over 500,000 bytes, but my end result is only 388 bytes. My code is below (it borrows HEAVILY from other examples that I found here). Can you tell me what I'm doing wrong? Thank you.

          console.log('doc length', document.content.length);

          // Create a reader to read the PDF from a stream
          const pdfReader = hummus.createReader(new hummus.PDFRStreamForBuffer(document.content));

          // Create a writer for the modified PDF
          const outputStream = new streams.WritableStream();
          const pdfWriter = hummus.createWriter(new hummus.PDFStreamForResponse(outputStream));

          // Create a copying context to modify the PDF
          const copyingContext = pdfWriter.createPDFCopyingContext(pdfReader);
          const parser = copyingContext.getSourceDocumentParser();

          // Get the page that must be modified
          const pageObjectID = parser.getPageObjectID(pageIndexToRotate);
          const page = parser.parsePage(pageIndexToRotate);

          // Get all of the dictionary entries for the page
          const pageJSObject = page.getDictionary().toJSObject();

          // Get the original rotation of the page
          const originalRotation = page.getRotate();

          // Create a new version of the page object
          const objectsContext = pdfWriter.getObjectsContext();
          objectsContext.startModifiedIndirectObject(pageObjectID);
          const modifiedPageObject = pdfWriter.getObjectsContext().startDictionary();

          // Copy all properties except rotation to the new page object
          Object.getOwnPropertyNames(pageJSObject).forEach(element => {
            if (element !== 'Rotate') {
              modifiedPageObject.writeKey(element);
              copyingContext.copyDirectObjectAsIs(pageJSObject[element]);
            }
          });

          // Set the new rotation value
          modifiedPageObject.writeKey('Rotate');
          objectsContext
            .writeNumber((originalRotation + 90) % 360)
            .endLine()
            .endDictionary(modifiedPageObject)
            .endIndirectObject();

          // Done
          pdfWriter.end();
          outputStream.end();
          document.content = outputStream.toBuffer();

          console.log('new doc length', document.content.length);

richard-kurtosys commented 6 years ago

Using your code I get the same output. The contents of the "generated file" are:

%PDF-1.4
%<BD><BE><BC>
12 0 obj
<<
        /Contents 13 0 R
        /MediaBox [ 0 0 595.2756 841.8898 ]
        /Parent 3 0 R
        /Resources 15 0 R
        /Type /Page
        /Rotate 90 
>>
endobj
1 0 obj
<<
        /Type /Catalog
>>
endobj
xref
0 2
0000000000 65535 f
0000000163 00000 n
trailer
<<
        /Size 2
        /Root 1 0 R
        /ID [ <E82CDB8BAA08EAD8568520F1BBF53333> <E82CDB8BAA08EAD8568520F1BBF53333> ]
>>
startxref
205
%%EOF

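From this it looks like the page "content" isn't being copied across: only the modified page object, a bare catalog, the xref table and the trailer are written, so references such as 13 0 R and 3 0 R point to objects that don't exist in the output.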
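
A rough way to check this kind of result without opening the file is to count the "N G obj" headers in the output buffer. The sketch below is only a diagnostic, not a PDF parser: `countIndirectObjects` is a hypothetical helper name (it is not part of hummus), it scans the raw bytes with a regular expression, it will miss objects stored inside compressed object streams, and it could in principle be fooled by binary stream data that happens to look like an object header.

```javascript
// Rough diagnostic (assumption: plain, uncompressed object headers).
// Counts lines that look like "12 0 obj" in the raw PDF bytes.
function countIndirectObjects(pdfBuffer) {
  // latin1 maps every byte to one character, so binary data cannot break decoding
  const text = pdfBuffer.toString('latin1');
  // An object header is "<object number> <generation> obj" at the start of a line
  const matches = text.match(/(^|[\r\n])\s*\d+\s+\d+\s+obj\b/g);
  return matches ? matches.length : 0;
}
```

The generated file quoted above contains only two such headers (12 0 obj and 1 0 obj), which is far fewer than a 500,000-byte source document would be expected to have. Calling `countIndirectObjects` on both the input and output buffers makes that gap visible.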

A more complete example (replace 3page.pdf with any local PDF that has at least two pages):

var hummus = require('hummus');
var streams = require('memory-streams');
var fs = require("fs");

const document = fs.readFileSync('./3page.pdf');

console.log('doc length', document.length);
const pageIndexToRotate = 1;

// Create a reader to read the PDF from a stream
const pdfReader = hummus.createReader(new hummus.PDFRStreamForBuffer(document));

// Create a writer for the modified PDF
const outputStream = new streams.WritableStream();
const pdfWriter = hummus.createWriter(new hummus.PDFStreamForResponse(outputStream));

// Create a copying context to modify the PDF
const copyingContext = pdfWriter.createPDFCopyingContext(pdfReader);
const parser = copyingContext.getSourceDocumentParser();

// Get the page that must be modified
const pageObjectID = parser.getPageObjectID(pageIndexToRotate);
const page = parser.parsePage(pageIndexToRotate);

// Get all of the dictionary entries for the page
const pageJSObject = page.getDictionary().toJSObject();

// Get the original rotation of the page
const originalRotation = page.getRotate();

// Create a new version of the page object
const objectsContext = pdfWriter.getObjectsContext();
objectsContext.startModifiedIndirectObject(pageObjectID);
const modifiedPageObject = pdfWriter.getObjectsContext().startDictionary();

// Copy all properties except rotation to the new page object
Object.getOwnPropertyNames(pageJSObject).forEach(element => {
    if (element !== 'Rotate') {
        modifiedPageObject.writeKey(element);
        copyingContext.copyDirectObjectAsIs(pageJSObject[element]);
    }
});

// Set the new rotation value
modifiedPageObject.writeKey('Rotate');
objectsContext
    .writeNumber((originalRotation + 90) % 360)
    .endLine()
    .endDictionary(modifiedPageObject)
    .endIndirectObject();

// Done
pdfWriter.end();
outputStream.end();
const newDocument = outputStream.toBuffer();

console.log('new doc length', newDocument.length);

fs.writeFileSync("./rotated.pdf", newDocument);

I'll look into this for a bit while something else compiles.

kilsen commented 6 years ago

Thanks for looking into this.

I think I was able to get it to work by changing the way that I set up the streams, and using createWriterToModify():

          // Create a reader to read the PDF from a stream
          const inputStream = new hummus.PDFRStreamForBuffer(document.content);
          const pdfReader = hummus.createReader(inputStream);

          // Create a writer for the modified PDF
          const outputStream = new streams.WritableStream();
          const pdfWriter = hummus.createWriterToModify(new hummus.PDFRStreamForBuffer(document.content), new hummus.PDFStreamForResponse(outputStream));

          // Create a copying context to modify the PDF
          const copyingContext = pdfWriter.createPDFCopyingContext(pdfReader);
          const parser = copyingContext.getSourceDocumentParser();

richard-kurtosys commented 6 years ago

Using that in the code snippet I pasted above does indeed make it work for me.

I'm glad you managed to get this resolved!
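
A side note on the rotation arithmetic used when writing the Rotate key: the PDF /Rotate entry is expected to be a multiple of 90, so it helps to wrap the sum back into the 0 to 359 range after adding the increment. A minimal sketch, where `nextRotation` is a hypothetical helper name and not part of hummus:

```javascript
// Hypothetical helper (not a hummus API): computes the new /Rotate value.
// Parentheses matter here: % binds tighter than +, so the sum is grouped first.
function nextRotation(currentRotation, delta = 90) {
  const sum = currentRotation + delta;
  if (sum % 90 !== 0) {
    throw new RangeError('Rotation must be a multiple of 90 degrees');
  }
  // The double modulo keeps the result in 0..359 even for negative deltas
  return ((sum % 360) + 360) % 360;
}
```

The result could then be passed to `writeNumber` in place of the inline expression, for example `objectsContext.writeNumber(nextRotation(originalRotation))`.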