traviscooper / node-wkhtml

Convert HTML to PDF or PNG format using the WebKit rendering engine and Qt.

Return the generated pdf as buffer #5

Open mlegenhausen opened 12 years ago

mlegenhausen commented 12 years ago

It would be nice to get only the generated PDF back in a buffer. Currently you return the stdin, which also contains the executed command; this leads to unusable PDFs when storing the data directly in a buffer.

mlegenhausen commented 12 years ago

var fs = require('fs');
var cp = require('child_process');

var buffertools = require('buffertools');

var buffers = [];
var wkhtmltopdf = cp.spawn('/bin/sh', ['-c', 'echo "Hello World" | wkhtmltopdf --margin-top 10 --margin-bottom 10 --margin-left 10 --margin-right 10 - -']);
wkhtmltopdf.stdout.on('data', function(data) {
    buffers.push(data);
});
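// 'close' fires after the stdio streams have ended, so no chunks are missed.
wkhtmltopdf.on('close', function() {
    var buffer = buffertools.concat.apply(buffertools, buffers);
    fs.writeFile('output.pdf', buffer, function(err) {
        if (err) throw err;
    });
});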

An example of how to save the data as a buffer before exec converts it to a string. I am using buffertools for concatenation.

mhemesath commented 12 years ago

Thanks for the feedback. I think what I'm going to do is simplify this module down into just a convenience method for spawning wkhtmltopdf processes. This was recently discussed in https://github.com/mhemesath/node-wkhtml/issues/6.

I'm still on the fence on whether or not generating files should be a part of this module. I think the majority use case will be piping the generated PDF to a response, but I could be wrong?
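For readers following along, piping the generated PDF to a response could look roughly like the sketch below. It is not the module's API: `sendPdf` and `createPdfServer` are hypothetical names, and the example assumes a `wkhtmltopdf` binary on the PATH and reuses the `-q - -` flags that appear elsewhere in this thread.

```javascript
var http = require('http');
var spawn = require('child_process').spawn;

// Hypothetical helper (not part of node-wkhtml): send a readable stream of
// PDF bytes as the body of an HTTP response.
function sendPdf(pdfStream, res) {
    res.setHeader('Content-Type', 'application/pdf');
    pdfStream.pipe(res);
}

// Example wiring; assumes a `wkhtmltopdf` binary on the PATH.
// '-q - -' means: quiet, read HTML from stdin, write the PDF to stdout.
function createPdfServer(html) {
    return http.createServer(function (req, res) {
        var child = spawn('wkhtmltopdf', ['-q', '-', '-']);
        // Fail the request if the process cannot be started or fed.
        function fail() {
            res.statusCode = 500;
            res.end();
        }
        child.on('error', fail);
        child.stdin.on('error', fail);
        sendPdf(child.stdout, res);
        child.stdin.end(html, 'utf8');
    });
}
```

Something like `createPdfServer('<h1>Hello World</h1>').listen(3000)` would start it; the port is an arbitrary choice. Nothing is held in memory beyond the chunks currently in flight.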

mlegenhausen commented 12 years ago

I boiled it down to the following:

require('sugar');

var cp = require('child_process');
var util = require('util');

var buffertools = require('buffertools');

module.exports.spawn = function(options) {
    var commands = [];
    options.pdf = options.pdf || {
        'margin-top': 0,
        'margin-bottom': 0,
        'margin-left': 0,
        'margin-right': 0
    };
    Object.each(options.pdf, function(key, value) {
        commands.add(['--' + key, value]);
    });
    commands.add(['-q', '-', '-']);
    var streams = cp.spawn('wkhtmltopdf', commands, options.process || {});
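    // Convenience helper: collect stdout into a single Buffer.
    streams.stdout.toBuffer = function(callback) {
        var stream = new buffertools.WritableBufferStream();
        util.pump(streams.stdout, stream, function(err) {
            // Return early so the callback is never invoked twice.
            if (err) return callback(err);
            callback(null, stream.getBuffer());
        });
    };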
    return streams;
};

module.exports.stringToStream = function(str, options) {
    var wkhtmltopdf = this.spawn(options);
    wkhtmltopdf.stdin.end(str, 'utf8');
    return wkhtmltopdf.stdout;
};

This gives access to the streams (for piping), lets you convert the output to a buffer, or simply takes a string (HTML) and converts it to a stream or buffer.

mhemesath commented 12 years ago

I went ahead and pushed a major refactor to the code. It's still in "alpha", but I'm hoping to add some tests and more documentation soon so I can push a good stable release. I exposed a method for spawning the process that generates the PDF and provided examples of piping the output to a response or writing it to the filesystem.

I'd like to discourage buffering the output, as the generated PDFs can take up a bit of memory and I think in most cases it makes more sense to just stream it to its end point. Let me know if you have a use case that piping the output to the response/filesystem can't solve and I'll consider adding a buffer method.
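As a rough illustration of streaming to the filesystem instead of buffering, here is a minimal sketch. `savePdf` is a hypothetical helper, not a method of this module, and it assumes a Node.js version that provides `stream.pipeline`.

```javascript
var fs = require('fs');
var pipeline = require('stream').pipeline;

// Hypothetical helper (not part of node-wkhtml): stream PDF bytes to a file
// without holding the whole document in memory.
// `pdfStream` would typically be the stdout of a spawned wkhtmltopdf process.
function savePdf(pdfStream, filePath, callback) {
    // pipeline forwards errors from either stream to the callback and
    // cleans up both streams when one of them fails.
    pipeline(pdfStream, fs.createWriteStream(filePath), callback);
}
```

The callback receives an error if reading or writing fails, and no error argument once the file has been fully written.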

Thanks!

mlegenhausen commented 12 years ago

The use case where I need the buffer is storing the PDF data directly in a MongoDB document, which only supports Buffers.
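A buffer for this use case could be collected along the following lines. This is only a sketch: `collectToBuffer` and `htmlToPdfBuffer` are hypothetical names, `Buffer.concat` from Node core stands in for buffertools, and a `wkhtmltopdf` binary on the PATH is assumed. The exit code of the process is not checked here.

```javascript
var spawn = require('child_process').spawn;

// Hypothetical helper: gather every chunk a readable stream emits into one Buffer.
// Expects the stream to emit Buffers, which child process stdout does by default.
function collectToBuffer(readable, callback) {
    var chunks = [];
    readable.on('data', function (chunk) { chunks.push(chunk); });
    readable.on('error', callback);
    readable.on('end', function () {
        callback(null, Buffer.concat(chunks));
    });
}

// Hypothetical wrapper: render an HTML string and hand back the PDF as a Buffer.
// Assumes `wkhtmltopdf` is on the PATH; '-q - -' reads stdin and writes stdout.
function htmlToPdfBuffer(html, callback) {
    var called = false;
    // Guard so the caller's callback runs at most once.
    function once(err, buffer) {
        if (called) return;
        called = true;
        callback(err, buffer);
    }
    var child = spawn('wkhtmltopdf', ['-q', '-', '-']);
    child.on('error', once);
    child.stdin.on('error', once);
    collectToBuffer(child.stdout, once);
    child.stdin.end(html, 'utf8');
}
```

The resulting Buffer can then be assigned to a binary field of the document being stored. The whole PDF sits in memory at once, which is the trade-off against piping.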

mhemesath commented 12 years ago

Ok, that makes sense. I think I'm going to break this into its own method.