galkahana / HummusJS

Node.js module for high performance creation, modification and parsing of PDF files and streams
http://www.pdfhummus.com

segfault #227

Open nnnikolay opened 6 years ago

nnnikolay commented 6 years ago

Hi Gal,

First of all, thanks for your package. I'm experiencing an intermittent segfault, though: it does not happen on every run.

I'm using Node v8.9.1 and node-gyp v3.6.2.

This is my code:

var pdfWriter = hummus.createWriter(__dirname + '/../data/output.pdf');
var page = pdfWriter.createPage(0,0,595,842);

pdfWriter.mergePDFPagesToPage(page,
  __dirname + '/../data/input-3.pdf',
  {
    type:hummus.eRangeTypeSpecific,
    specificRanges:[[0,0]]
  });

pdfWriter.mergePDFPagesToPage(page,
  __dirname + '/../data/input-8.pdf',
  {
    type:hummus.eRangeTypeSpecific,
    specificRanges:[[0,0]]
  });

pdfWriter.writePage(page).end();

This is my execution flow:

 node build/index.js
 node build/index.js
 node build/index.js
 node build/index.js
 node build/index.js
 node build/index.js
[2]    4326 segmentation fault  node build/index.js
 node build/index.js
[2]    4354 segmentation fault  node build/index.js
 node build/index.js
 node build/index.js
[2]    4410 segmentation fault  node build/index.js
 node build/index.js

As you can see, I'm running the same code multiple times; sometimes it succeeds and sometimes it segfaults.
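
To put a number on how often the crash occurs, the repeated runs above could be automated. The following is a minimal sketch that uses only Node's built-in child_process module. The helper name tallyRuns is hypothetical, and counting exit status 139 as a segfault assumes the shell-style convention of 128 + 11 (SIGSEGV); when Node spawns the child directly, the signal name is normally reported instead.

```javascript
const { spawnSync } = require('child_process');

// Run `command args...` `runs` times and tally how each run ended.
// (tallyRuns is a hypothetical helper, not part of hummus or Node.)
function tallyRuns(command, args, runs) {
  const tally = { ok: 0, segfault: 0, other: 0 };
  for (let i = 0; i < runs; i++) {
    const result = spawnSync(command, args, { stdio: 'ignore' });
    if (result.signal === 'SIGSEGV' || result.status === 139) {
      // Killed by SIGSEGV, or a wrapper reported 128 + 11.
      tally.segfault++;
    } else if (result.status === 0) {
      tally.ok++;
    } else {
      // Any other non-zero exit, other signal, or failure to spawn.
      tally.other++;
    }
  }
  return tally;
}

// Example, using the script path from the post:
// console.log(tallyRuns(process.execPath, ['build/index.js'], 50));
```

Comparing the tally for different input files should show whether the crash rate follows one particular PDF.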

Meanwhile, running all your tests multiple times does not produce any segfault. Could the problem be in my PDF files?

With the segfault-handler module I was able to capture the log, in case it's useful:

PID 7121 received SIGSEGV for address: 0x0
0   segfault-handler.node               0x00000001037c8168 _ZL16segfault_handleriP9__siginfoPv + 280
1   libsystem_platform.dylib            0x00007fff51381f5a _sigtramp + 26
2   ???                                 0x0000000103751080 0x0 + 4352970880
3   libsystem_malloc.dylib              0x00007fff512a9403 szone_malloc_should_clear + 422
4   libsystem_malloc.dylib              0x00007fff512a9201 malloc_zone_malloc + 103
5   libsystem_malloc.dylib              0x00007fff512a850b malloc + 24
6   libc++abi.dylib                     0x00007fff4f18f628 _Znwm + 40
7   hummus.node                         0x00000001079c2ed8 _ZN15XCryptionCommon12algorithm3_1EmmRKNSt3__14listIhNS0_9allocatorIhEEEEb + 380
8   hummus.node                         0x00000001079c2d01 _ZN15XCryptionCommon13OnObjectStartExx + 45
9   hummus.node                         0x000000010795cf2a _ZN16DecryptionHelper13OnObjectStartExx + 46
10  hummus.node                         0x0000000107996576 _ZN9PDFParser27ParseExistingInDirectObjectEm + 772
11  hummus.node                         0x000000010799477d _ZN9PDFParser19ParsePagesObjectIDsEv + 141
12  hummus.node                         0x0000000107987b3f _ZN18PDFDocumentHandler19StartCopyingContextEP23IByteReaderWithPositionRK17PDFParsingOptions + 95
13  hummus.node                         0x0000000107987f24 _ZN18PDFDocumentHandler19MergePDFPagesToPageEP7PDFPageRKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEERK17PDFParsingOptionsRK12PDFPageRangeRKNS2_4listImNS6_ImEEEE + 34
14  hummus.node                         0x0000000107930c08 _ZN15PDFWriterDriver19MergePDFPagesToPageERKN2v820FunctionCallbackInfoINS0_5ValueEEE + 2148
15  node                                0x00000001001d1bfc _ZN2v88internal25FunctionCallbackArguments4CallEPFvRKNS_20FunctionCallbackInfoINS_5ValueEEEE + 430
16  node                                0x000000010021b1e3 _ZN2v88internal12_GLOBAL__N_119HandleApiCallHelperILb0EEENS0_11MaybeHandleINS0_6ObjectEEEPNS0_7IsolateENS0_6HandleINS0_10HeapObjectEEESA_NS8_INS0_20FunctionTemplateInfoEEENS8_IS4_EENS0_16BuiltinArgumentsE + 775
17  node                                0x000000010021a906 _ZN2v88internalL26Builtin_Impl_HandleApiCallENS0_16BuiltinArgumentsEPNS0_7IsolateE + 259
18  ???                                 0x000032804478463d 0x0 + 55526485935677
19  ???                                 0x0000328044874048 0x0 + 55526486917192
20  ???                                 0x000032804483f9ce 0x0 + 55526486702542

chunyenHuang commented 6 years ago

Would you mind attaching your source PDFs here as well, so we can test with them?

nnnikolay commented 6 years ago

@chunyenHuang Unfortunately, I can't share them here. Is there any way to find out what is wrong with the file? I'm pretty sure the problem is in the file, but I have no idea what exactly is wrong with it. This is what "CMD + I" shows me for that file:

[screenshot: ss]

This is input-3.pdf from my source code,

and this is the Inspector window from Preview:

[screenshot: ss2]

Could the encryption cause this kind of issue?

Thanks

chunyenHuang commented 6 years ago

Well, I actually don't need your input files (sorry). A segmentation fault here is often caused by an unresolved file path, so I highly recommend simplifying your code and file paths and then testing again.

var hummus = require('hummus');
var path = require('path');

// Put the files in one directory next to this script and test that first.
var output = path.join(__dirname, 'data/output.pdf');
var inputA = path.join(__dirname, 'data/input-3.pdf');
var inputB = path.join(__dirname, 'data/input-8.pdf');

var pdfWriter = hummus.createWriter(output);
var page = pdfWriter.createPage(0,0,595,842);

pdfWriter.mergePDFPagesToPage(page, inputA,
  {
    type:hummus.eRangeTypeSpecific,
    specificRanges:[[0,0]]
  });

pdfWriter.mergePDFPagesToPage(page, inputB,
  {
    type:hummus.eRangeTypeSpecific,
    specificRanges:[[0,0]]
  });

pdfWriter.writePage(page).end();

If that doesn't help, you could check isEncrypted() and take a look at Xcryption.js.

nnnikolay commented 6 years ago

Thanks for the suggestions. I've tried all of them and came to the conclusion that the issue is in the file.

If I use another file, everything is fine. But with that file, even the following code crashes 99% of the time:

var pdfReader = hummus.createReader(path.join(__dirname, 'input-3.pdf'));

console.log(pdfReader.getPagesCount());

I'm pretty sure that it is a bad PDF file. I've tried many other files that I was able to find, and all of them have worked fine so far.
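
Because even createReader crashes on this file, a check that does not go through hummus at all may help confirm whether encryption is involved. The sketch below is a rough heuristic, not a PDF parser. It assumes an encrypted PDF exposes an /Encrypt key in plain text in its trailer or cross-reference stream dictionary, and looksEncrypted is a hypothetical helper name.

```javascript
const fs = require('fs');

// Heuristic only: scan the raw bytes for an /Encrypt key.
// This can misfire if "/Encrypt" happens to occur elsewhere in the file,
// so treat the result as a hint, not a verdict.
// (looksEncrypted is a hypothetical helper, not part of hummus.)
function looksEncrypted(pdfPath) {
  // latin1 maps every byte to one character, so binary data is not mangled.
  const text = fs.readFileSync(pdfPath, 'latin1');
  // \b keeps /EncryptMetadata from matching on its own.
  return /\/Encrypt\b/.test(text);
}

// Example:
// console.log(looksEncrypted('input-3.pdf'));
```

If this reports true for the crashing file and false for the files that work, that would be consistent with the decryption frames (XCryptionCommon, DecryptionHelper) in the stack trace.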

If the input file would help you find the problem, I could send it to you via email.

chunyenHuang commented 6 years ago

Well, all I can tell you for now is that hummus.createReader does not handle the permission userProtectionFlag of this PDF well. So the easy fix for now is to run hummus.recrypt and use the newly recrypted PDF file. Fixing the reader issue itself will require a PR.
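
The workaround described above might look roughly like the following. This is a sketch under assumptions, not a confirmed fix: countPagesViaRecrypt is a hypothetical helper, the hummus module is passed in as a parameter so the helper itself has no hard dependency on the native build, and whether hummus.recrypt needs options (for example a password) for this particular file is not known from the thread.

```javascript
// Hypothetical helper: rewrite the PDF with hummus.recrypt, then read the copy.
// `hummus` is the module object returned by require('hummus').
function countPagesViaRecrypt(hummus, inputPath, recryptedPath, recryptOptions) {
  // Options are forwarded unchanged; which ones this file needs is an assumption
  // left to the caller.
  if (recryptOptions) {
    hummus.recrypt(inputPath, recryptedPath, recryptOptions);
  } else {
    hummus.recrypt(inputPath, recryptedPath);
  }
  // Open the rewritten copy instead of the original file.
  return hummus.createReader(recryptedPath).getPagesCount();
}

// Example (file names are placeholders):
// const hummus = require('hummus');
// console.log(countPagesViaRecrypt(hummus, 'input-3.pdf', 'input-3.recrypted.pdf'));
```

The same recrypted copy could then be passed to mergePDFPagesToPage in place of the original input.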

ryanbennettvoid commented 6 years ago

I had an issue where Node exited with code 139 when I invoked pdfWriter.getFontForFile() with a non-existent file. I fixed the path and it worked fine afterwards.
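
Since a bad path can apparently end in a native crash rather than a JavaScript exception, one defensive option is to verify every input path before handing it to hummus. The following is a minimal sketch using only Node's fs module; findUnreadable is a hypothetical helper name, and the variable names in the usage comment are placeholders.

```javascript
const fs = require('fs');

// Return the subset of `paths` that the current process cannot read.
// (findUnreadable is a hypothetical helper, not part of hummus.)
function findUnreadable(paths) {
  return paths.filter(function (p) {
    try {
      fs.accessSync(p, fs.constants.R_OK);
      return false; // readable, so not reported
    } catch (err) {
      return true; // missing or not readable
    }
  });
}

// Example: fail with an ordinary exception before any native call is made.
// const missing = findUnreadable([fontPath, inputA, inputB]);
// if (missing.length > 0) {
//   throw new Error('Unreadable input: ' + missing.join(', '));
// }
```

A failed check then surfaces as a normal, catchable error with the offending path in the message.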