aheckmann / gm

GraphicsMagick for node
http://aheckmann.github.com/gm/

Output text transferred to image stream when transforming PDF #654

Open joadr opened 7 years ago

joadr commented 7 years ago

Hello,

I've been using this library for a while to convert PDF files into images. It works for most PDFs, but some of those exported by Microsoft Word on Mac OS X are slightly corrupted. When I convert them with GraphicsMagick directly on my Linux machine, the conversion succeeds but prints this message once per page: **** Warning: considering '0000000000 XXXXX n' as a free entry. The problem appears when I convert them through this library and stream the result, like this:

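const fs = require('fs');
const gm = require('gm');

// readStream is a readable stream of the source PDF
gm(readStream)
  .resize('200', '200')
  .stream(function (err, stdout, stderr) {
    if (err) throw err;
    const writeStream = fs.createWriteStream('/path/to/my/resized.jpg');
    stdout.pipe(writeStream);
  });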

The stdout stream arrives with the warning message at the beginning, so the image files get corrupted. Any idea how to solve this and keep the warning messages out of the output?

JaimeObregon commented 3 years ago

Happened to me too while converting PDF files such as this one to JPEG. It looks like gm is incorrectly piping stderr to the output buffer.

I managed to work around it by just trimming everything before the output format's signature bytes. Not ideal but it works:

// FF D8 FF are JPEG's signature bytes — they signal the actual start of the file
const thumbnail = buffer.slice(buffer.indexOf('FFD8FF', 0, 'hex'))
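A slightly more defensive take on the same workaround, as a rough sketch. One thing to watch for with the one-liner: if the signature is missing, indexOf returns -1 and slice(-1) quietly returns only the last byte. The helper name stripLeadingNoise and the sample bytes below are made up for illustration; nothing here is part of gm.

```javascript
// Hypothetical helper, not part of gm: drops any bytes that precede the
// JPEG signature (FF D8 FF) in a converted buffer.
const JPEG_SIGNATURE = Buffer.from('ffd8ff', 'hex');

function stripLeadingNoise(buffer, signature = JPEG_SIGNATURE) {
  const start = buffer.indexOf(signature);
  if (start === -1) {
    // No signature found: fail loudly instead of returning a bogus slice.
    throw new Error('Image signature not found in converter output');
  }
  // subarray returns a view on the same memory; no copy is made.
  return buffer.subarray(start);
}

// Simulated converter output: a warning line followed by some JPEG bytes.
const warning = Buffer.from(
  "**** Warning: considering '0000000000 XXXXX n' as a free entry.\n"
);
const jpegBytes = Buffer.from('ffd8ffe000104a464946', 'hex');
const cleaned = stripLeadingNoise(Buffer.concat([warning, jpegBytes]));
console.log(cleaned.equals(jpegBytes)); // true
```

In practice the input would be whatever buffer you collected from gm's output. This still only treats the symptom: it assumes the warning text lands before the image data and that the target format is JPEG, so another output format would need its own signature passed as the second argument.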