brenden / node-webshot

Easy website screenshots in Node.js

Option for using webshot to retrieve rendered html #99

Open johnaschroeder opened 9 years ago

johnaschroeder commented 9 years ago

Not really an issue, but I wanted to see if there was interest in integrating some functionality I added in my fork of the project. If so, happy to submit a pull request. If it is too far afield from taking webshots, no worries either.

In addition to returning an image, I needed to capture the rendered HTML from a page. It's actually really handy to be able to get back the HTML fully rendered by WebKit, vs. something like JSDOM or Cheerio. There are some Node wrappers available, and one bridge that seemed too fragile/complicated to put in production, but I don't think there is anything else on GitHub that is this simple to use. So I thought I'd put it out there.

The basic changes are minor:

webshot.js:

-      s.emit('data', new Buffer(''+data, 'base64'));
+      s.emit('data', new Buffer(''+data, 'utf8'));

webshot.phantom.js:

     if (args.streaming === 'false') {
       page.render(args.path);
     } else {
-      console.log(page.renderBase64(args.streamType));
+      console.log(page.content);
     }

     page.close();
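
On the consuming side, the stream would then carry HTML text instead of image bytes. A minimal sketch of reading it, assuming the patched stream emits Buffer chunks of UTF-8 text through 'data' events as in the diff above; `collectHtml` and the sample chunks are hypothetical stand-ins, not part of webshot:

```javascript
// Hypothetical helper: join the raw chunks first, then decode once.
// Decoding chunk by chunk can break a multi-byte character that
// happens to straddle two chunks.
function collectHtml(chunks) {
  return Buffer.concat(chunks).toString('utf8');
}

// Stand-in for two 'data' events. The split falls inside the
// two-byte UTF-8 sequence for "é".
const whole = Buffer.from('<p>café</p>', 'utf8');
const chunks = [whole.subarray(0, 7), whole.subarray(7)];

// Naive approach: decode each chunk separately and join the strings.
const naive = chunks.map((c) => c.toString('utf8')).join('');

console.log(collectHtml(chunks)); // the reassembled HTML
console.log(naive === collectHtml(chunks)); // shows whether per-chunk decoding matched
```

With a real stream, the same idea would be to push each chunk into an array in the 'data' handler and call the helper in the 'end' handler.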

LorenzGardner commented 9 years ago

What happens if the page isn't encoded in utf8?

johnaschroeder commented 9 years ago

I believe what we're emitting is the HTML page source, i.e. text. That text is buffered and streamed back to the Node function that called it. So even if the HTML specifies some other wacky encoding in the header or page, isn't that a rendering issue for the browser? Here utf8 just tells the Buffer how to encode the string we stream back to webshot, in place of the base64-encoded image. Let me know if I've got that wrong though...
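
To illustrate the point, here is a small sketch using only Node's built-in Buffer. The `html` string is a made-up stand-in for what `page.content` would hand back, and the assumption (not verified against PhantomJS here) is that it arrives as an already-decoded JavaScript string:

```javascript
// Hypothetical page source. The declared charset is just characters
// inside the string at this point; it does not affect the Buffer.
const html = '<meta charset="iso-8859-1"><p>café ü</p>';

// Patched behaviour: encode the string as UTF-8 bytes for streaming.
const asUtf8 = Buffer.from(html, 'utf8');
const roundTripped = asUtf8.toString('utf8');

// Original behaviour: interpret the string as base64, which only
// makes sense when the payload is a base64-encoded image.
const asBase64 = Buffer.from(html, 'base64');

console.log(roundTripped === html); // UTF-8 round trip preserves the text
console.log(asBase64.toString('utf8') === html); // base64 decoding does not
```

So the 'utf8' argument only controls how the string is turned into bytes on the way back to the caller, which is separate from whatever encoding the original page declared.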

Chazaam commented 7 years ago

Hi johnaschroeder. Does this solution work well, or did you find a better way? I want to get the HTML code and extract a JPEG, possibly using just one library...