sitespeedio / chrome-har

Create HAR files from Chrome Debugging Protocol data
MIT License

Support response bodies in HARs #41

Closed michaelcypher closed 5 years ago

michaelcypher commented 5 years ago

Addresses #8 by setting entry.response.content.text to the response.body field within populateEntryFromResponse. Requires the user of the library to set the response.body themselves once the response has finished loading.

Adds the `includeTextFromResponseBody` option, so users can decide whether they want the response bodies in the HARs, as these could significantly increase the size of the HARs. Defaults to `false`.
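
As a rough sketch of the flow described above (not the library's own code): the caller copies the fetched body onto `params.response.body` of a `Network.responseReceived` event before the events are converted. `attachResponseBody` is a hypothetical helper name and the event values are made up for illustration.

```javascript
// Hypothetical helper: set the fetched body on a Network.responseReceived
// event, at params.response.body, which is where this PR reads it from.
function attachResponseBody(event, bodyText) {
  if (event.method !== 'Network.responseReceived') {
    return event; // only response events carry a body
  }
  event.params.response = { ...event.params.response, body: bodyText };
  return event;
}

// Example event shape (illustrative values only).
const exampleEvent = {
  method: 'Network.responseReceived',
  params: { requestId: '1', response: { status: 200, mimeType: 'text/html' } },
};
const withBody = attachResponseBody(exampleEvent, '<html></html>');
console.log(withBody.params.response.body);
```

The events would then be converted with `harFromMessages(events, { includeTextFromResponseBody: true })`. Since the option defaults to `false`, the body is left out of the HAR unless it is passed.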

soulgalore commented 5 years ago

Hi @mikeecb cool! Can you fix the linting problem and make Travis pass + add a couple of lines of how to use it in the README and I'll merge asap!

Best Peter

michaelcypher commented 5 years ago

@soulgalore thanks for looking so quickly! I've fixed the linting issues and updated the README.md.

soulgalore commented 5 years ago

thank you @mikeecb !!!!

kmicol commented 5 years ago

Thank you @mikeecb ! Have been following this for some time, very excited to implement. I am having a hard time setting this up in my code, is there any way someone could demonstrate how to implement using this example? https://michaljanaszek.com/blog/generate-har-with-puppeteer

It appears to be very similar to the example in the README, but I'm having some trouble. Any help would be greatly appreciated.

michaelcypher commented 5 years ago

@kmicol, I'm sorry about the confusion. It appears the README.md example I added is not quite correct. I opened a pull request (#42) to fix the example (cc @soulgalore). Here's how to implement this using the example linked above.

const fs = require('fs');
const { promisify } = require('util');

const puppeteer = require('puppeteer');  
const { harFromMessages } = require('chrome-har');

// list of events for converting to HAR
const events = [];

// list of promises that get the response body for a given response event
// (Network.responseReceived) and that add it to the event. These must all be
// resolved/rejected before we create the HAR from these events using
// chrome-har.
const addResponseBodyPromises = [];

// event types to observe
const observe = [
  'Page.loadEventFired',
  'Page.domContentEventFired',
  'Page.frameStartedLoading',
  'Page.frameAttached',
  'Network.requestWillBeSent',
  'Network.requestServedFromCache',
  'Network.dataReceived',
  'Network.responseReceived',
  'Network.resourceChangedPriority',
  'Network.loadingFinished',
  'Network.loadingFailed',
];

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // register events listeners
  const client = await page.target().createCDPSession();
  await client.send('Page.enable');
  await client.send('Network.enable');
  observe.forEach(method => {
    client.on(method, params => {
      // push the event onto the array of events first, before potentially
      // blocking while fetching the response body, so the events remain in
      // order. This is required by chrome-har.
      const harEvent = { method, params };
      events.push(harEvent);

      if (method === 'Network.responseReceived') {
        const response = harEvent.params.response;
        const requestId = harEvent.params.requestId;
        // response body is unavailable for redirects, no-content, image, audio
        // and video responses
        if (response.status !== 204 &&
            response.headers.location == null &&
            !response.mimeType.includes('image') &&
            !response.mimeType.includes('audio') &&
            !response.mimeType.includes('video')
        ) {
          const addResponseBodyPromise = client.send(
            'Network.getResponseBody',
            { requestId },
          ).then((responseBody) => {
            // set the response so chrome-har can add it to the HAR
            harEvent.params.response = {
              ...response,
              body: new Buffer(
                responseBody.body,
                responseBody.base64Encoded ? 'base64' : undefined,
              ).toString(),
            };
          }, (reason) => {
            // resources (i.e. response bodies) are flushed after page commits
            // navigation and we are no longer able to retrieve them. In this
            // case, fail soft so we still add the rest of the response to the
            // HAR.
          });
          addResponseBodyPromises.push(addResponseBodyPromise);
        }
      }
    });
  });

  // perform tests
  await page.goto('https://en.wikipedia.org');
  page.click('#n-help > a');
  await page.waitForNavigation({ waitUntil: 'networkidle2' });
  await browser.close();

  // wait for the response body to be added to all of the
  // Network.responseReceived events before passing them to chrome-har to be
  // converted into a HAR.
  await Promise.all(addResponseBodyPromises);
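  // convert events to HAR file; the option is needed because response bodies
  // are left out by default
  const har = harFromMessages(events, { includeTextFromResponseBody: true });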
  await promisify(fs.writeFile)('en.wikipedia.org.har', JSON.stringify(har));
})();

Let me know if you have any questions and please comment on the newly opened pull request if this turns out to not work as expected!

kmicol commented 5 years ago

@mikeecb You are a prince for this one, thanks for the quick reply.

happywhitelake commented 5 years ago

@mikeecb, thanks for sharing the script. I got this warning: "DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead." So I guess the current new Buffer should be changed to Buffer.from(), right?
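
A minimal sketch of that change, assuming the `{ body, base64Encoded }` result shape that `Network.getResponseBody` returns in the script above. `decodeResponseBody` is a hypothetical helper name, not part of chrome-har or Puppeteer.

```javascript
// Decode a Network.getResponseBody result ({ body, base64Encoded }) into a
// string using Buffer.from() instead of the deprecated Buffer constructor.
function decodeResponseBody(responseBody) {
  const encoding = responseBody.base64Encoded ? 'base64' : 'utf8';
  return Buffer.from(responseBody.body, encoding).toString();
}

console.log(decodeResponseBody({ body: 'aGVsbG8=', base64Encoded: true }));
console.log(decodeResponseBody({ body: 'plain text', base64Encoded: false }));
```

Note that `toString()` decodes as UTF-8, so a base64-encoded binary body would not survive this round trip intact. The script above sidesteps part of that by skipping image, audio and video responses.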

Also, I wonder why the HAR generated using the above script is much smaller than the HAR file saved directly from my Chrome browser. The above code is supposed to store all response bodies in the HAR file, right? In addition, the HAR file generated using the above code cannot be read by Chrome when dragged and dropped into the DevTools panel.

happywhitelake commented 5 years ago

Another issue with the above script is that it seems to miss the very first Network.responseReceived event, the one for the HTML document itself. I tried printing out all the events caught by the above script and noticed that the listener seems to be set up too late to catch the first HTML body.
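
One way to check this kind of gap, sketched under the assumption that the events are recorded as `{ method, params }` objects as in the script above, is to list the request IDs that were sent but never got a response event. `findRequestsWithoutResponse` is a hypothetical helper name and the recorded events are made up for illustration.

```javascript
// Return the requestIds that have a Network.requestWillBeSent event but no
// matching Network.responseReceived event in the recorded list.
function findRequestsWithoutResponse(events) {
  const sent = new Set();
  const received = new Set();
  for (const { method, params } of events) {
    if (method === 'Network.requestWillBeSent') sent.add(params.requestId);
    if (method === 'Network.responseReceived') received.add(params.requestId);
  }
  return [...sent].filter(id => !received.has(id));
}

// Illustrative recording: request "A" never gets a response event.
const recorded = [
  { method: 'Network.requestWillBeSent', params: { requestId: 'A' } },
  { method: 'Network.requestWillBeSent', params: { requestId: 'B' } },
  { method: 'Network.responseReceived', params: { requestId: 'B' } },
];
console.log(findRequestsWithoutResponse(recorded));
```

Requests that fail or are still in flight when recording stops would also show up in this list, so an entry here is a hint rather than proof of a late listener.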

AgainPsychoX commented 5 years ago

I ran into the same issue, @phongiswindy. Here is a related issue on the Chromium bug tracker. You can upvote it.

You can still use the softwareishard.com HAR Viewer to view the HAR file.

vroumvroum commented 5 years ago

Found here (https://github.com/ChromeDevTools/devtools-protocol/issues/44) that we should wait for the Network.loadingFinished event before getting the body.

I added an event emitter to the above code so that we get the body every time. I don't know whether my addition has any unwanted side effects, though.


        [...]

        const EventEmitter = require('events');
        const emitter = new EventEmitter();

        observe.forEach(method => {
            console.log(method);
            client.on(method, params => {

           [...]

                if (harEvent.method === 'Network.loadingFinished') {
                    emitter.emit(params.requestId);
                }

               [...]
                        emitter.on(requestId, () => {
                            const addResponseBodyPromise = client.send(
                                'Network.getResponseBody', {
                                    requestId
                                },
                            ).then((responseBody) => {

                           [...]
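
The same idea can be sketched without an EventEmitter. This is an illustrative alternative, not the code from the snippet above. `createLoadingTracker`, `markFinished` and `whenFinished` are hypothetical names. Each callback runs at most once, and a request that has already finished is handled immediately.

```javascript
// Track which requests have finished loading, so a body fetch can be deferred
// until Network.loadingFinished has been seen for that requestId.
function createLoadingTracker() {
  const finished = new Set();
  const waiting = new Map(); // requestId -> array of pending callbacks

  return {
    // Call from the Network.loadingFinished handler.
    markFinished(requestId) {
      finished.add(requestId);
      const callbacks = waiting.get(requestId) || [];
      waiting.delete(requestId);
      callbacks.forEach(cb => cb());
    },
    // Call from the Network.responseReceived handler; cb runs exactly once.
    whenFinished(requestId, cb) {
      if (finished.has(requestId)) {
        cb();
        return;
      }
      const callbacks = waiting.get(requestId) || [];
      callbacks.push(cb);
      waiting.set(requestId, callbacks);
    },
  };
}

// Simulated event order: response first, then loadingFinished.
const tracker = createLoadingTracker();
const order = [];
tracker.whenFinished('42', () => order.push('fetch body for 42'));
order.push('loadingFinished for 42');
tracker.markFinished('42');
console.log(order);
```

In the script above, the callback passed to `whenFinished` would be the place to call `client.send('Network.getResponseBody', { requestId })`. One possible side effect to watch for: promises created inside such a callback only exist once loadingFinished has fired, so a final `Promise.all` over the collected promises may run before some of them have been added.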