modesty / pdf2json

converts binary PDF to JSON and text, for server-side PDF processing and command-line use.
https://github.com/modesty/pdf2json

"An error occurred while rendering the page" when page contains image. #7

Closed elslooo closed 11 years ago

elslooo commented 11 years ago

I forked the repo in order to inspect the exact error:

+ nodeUtil._logN.call(self, 'Error: ' + require('util').inspect(error, null, null));

The problem is: { message: 'Image is not defined', stack: 'ReferenceError: Image is not defined\n at loadJpegStream (eval at (/Users/Tim/EG Server/Source/Engine/eg-exam/node_modules/pdf2json/pdf.js:46:6))' }

I'm looking into this issue and will open a pull request once I have fixed it. :)

elslooo commented 11 years ago

The problem is quite obvious actually:

function loadJpegStream(id, imageData, objs) {
  var img = new Image();
  img.onload = (function loadJpegStream_onloadClosure() {
    objs.resolve(id, img);
  });
  img.src = 'data:image/jpeg;base64,' + window.btoa(imageData);
}

This only works in browsers, of course: Node does not implement an Image class. Commenting out that block of code also causes problems, because then the callback (objs.resolve(...)) never gets called.

That's why I'm creating a fake "Image" class in pdf2json/pdf.js.
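
A minimal sketch of what such a stand-in might look like, assuming the only behavior loadJpegStream relies on is that onload fires after src is assigned. The names FakeImage and binaryToBase64 are hypothetical and are not taken from pdf2json. The sketch does not decode any image data, and the helper is included only because window (and so window.btoa) is also not defined in Node.

```javascript
// Hypothetical stand-in for the browser's Image class (not pdf2json's actual fix).
// It decodes nothing; it only ensures the onload callback gets invoked.
function FakeImage() {
  this.onload = null;
  this._src = '';
}

Object.defineProperty(FakeImage.prototype, 'src', {
  get: function () {
    return this._src;
  },
  set: function (value) {
    var self = this;
    this._src = value;
    // Browsers fire onload asynchronously after src is set; mimic that here.
    setImmediate(function () {
      if (typeof self.onload === 'function') {
        self.onload();
      }
    });
  }
});

// Hypothetical Node replacement for window.btoa on a binary string.
function binaryToBase64(binaryString) {
  return Buffer.from(binaryString, 'binary').toString('base64');
}
```

With something like this in place, loadJpegStream could construct a FakeImage instead of an Image and build the data URI with binaryToBase64(imageData), so that objs.resolve(id, img) still runs. Anything downstream that expects real pixel data from img would still need separate handling.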