cyrus-and / chrome-har-capturer

Capture HAR files from a Chrome instance
MIT License

_initiator.stack.parent properties missing from the HAR output #94

Closed taavikalvi closed 1 year ago

taavikalvi commented 1 year ago

Hey,

I noticed that stack.parent properties are currently missing from the HAR output generated by chrome-har-capturer.

But if I use "Copy all as HAR" from the network tab in Chrome, I see all of them.

My Chrome version: 109.0.5414.119

I guess this is related to the following Chrome update: https://developer.chrome.com/blog/new-in-devtools-107/#har

Bug report: https://bugs.chromium.org/p/chromium/issues/detail?id=1343185

More details: https://chromedevtools.github.io/devtools-protocol/v8/Runtime/#type-StackTrace

Here's an example of what it looks like:

      {
        "_initiator": {
          "type": "script",
          "stack": {
            "callFrames": [],
            "parent": {
              "description": "Image",
              "callFrames": [
                {
                  "functionName": "r",
                  "scriptId": "55",
                  "url": "https://www.googleadservices.com/pagead/conversion/123456/?random=1676431475187",
                  "lineNumber": 0,
                  "columnNumber": 759
                },
                {
                  "functionName": "",
                  "scriptId": "55",
                  "url": "https://www.googleadservices.com/pagead/conversion/123456/?random=1676431475187",
                  "lineNumber": 0,
                  "columnNumber": 991
                },
                {
                  "functionName": "",
                  "scriptId": "55",
                  "url": "https://www.googleadservices.com/pagead/conversion/123456/?random=1676431475187",
                  "lineNumber": 0,
                  "columnNumber": 2136
                }
              ]
            }
          }
        },

Technically, those parents can also be nested, like this: initiator.stack.parent.parent.parent, etc.
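To make the nesting concrete, here is a minimal sketch that walks the chain of parent stacks and returns one entry per level. The helper name flattenStack and the sample initiator object are illustrative assumptions (the URL is a placeholder); only the stack, parent, description and callFrames fields come from the example above.

```javascript
// Hypothetical helper (not part of chrome-har-capturer): follow
// stack -> stack.parent -> stack.parent.parent ... until the chain ends,
// collecting one entry per level.
function flattenStack(stack) {
    const levels = [];
    for (let current = stack; current; current = current.parent) {
        levels.push({
            description: current.description ?? null,
            callFrames: current.callFrames ?? []
        });
    }
    return levels;
}

// Illustrative initiator, shaped like the HAR excerpt above.
const initiator = {
    type: 'script',
    stack: {
        callFrames: [],
        parent: {
            description: 'Image',
            callFrames: [{
                functionName: 'r',
                scriptId: '55',
                url: 'https://example.com/script.js', // placeholder URL
                lineNumber: 0,
                columnNumber: 759
            }]
        }
    }
};

const levels = flattenStack(initiator.stack);
console.log(levels.length);         // 2
console.log(levels[1].description); // Image
```

A loop is used rather than recursion so that an arbitrarily deep parent chain is handled the same way as a single level.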

Is it somehow possible to add those stack parents to HAR output generated by chrome-har-capturer?

cyrus-and commented 1 year ago

They are already there. The initiator field of the Network.requestWillBeSent event is passed through unchanged as the _initiator property of each entry.

Try with:

chrome-har-capturer https://ansa.it | jq '.log.entries[]._initiator | select(.type == "script")'

cyrus-and commented 1 year ago

If something's missing then it's likely a Chrome bug.

taavikalvi commented 1 year ago

What do you see as a result if you run this command? Do you see any parent properties?

Here's what I see:

https://s3.amazonaws.com/appforest_uf/f1676596330814x730015570829774600/v1.json

There are 0 parent properties. But if I copy the HAR from the network tab directly, I get this outcome:

https://s3.amazonaws.com/appforest_uf/f1676596337805x233479851433486820/v2.json

There are actually 67 parent properties available. If you search for "parent", you'll see them.
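Instead of searching the text, the comparison can be scripted. The following is a rough sketch that counts every parent link under each entry's _initiator.stack, nested ones included. The function name countParents and the tiny sample HAR are assumptions for illustration; a real check would parse the two saved HAR files instead.

```javascript
// Hypothetical helper: count how many "parent" links appear under
// _initiator.stack across all entries of a parsed HAR object.
function countParents(har) {
    let count = 0;
    for (const entry of har.log.entries) {
        let stack = entry._initiator?.stack;
        while (stack?.parent) {
            count++;
            stack = stack.parent;
        }
    }
    return count;
}

// Minimal made-up HAR with just the fields the helper reads.
const sampleHar = {
    log: {
        entries: [
            // two nested parents
            {_initiator: {type: 'script', stack: {callFrames: [], parent: {callFrames: [], parent: {callFrames: []}}}}},
            // script initiator without a parent
            {_initiator: {type: 'script', stack: {callFrames: []}}},
            // no stack at all
            {_initiator: {type: 'parser'}}
        ]
    }
};

console.log(countParents(sampleHar)); // 2
```

For a file on disk, the same function could be fed the result of JSON.parse on the file contents.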

I updated my Chrome browser today as well, and now it's version 110.0.5481.100. But no difference.

taavikalvi commented 1 year ago

Alright, if I turn on the devtools in Puppeteer and don't run it headless, it works and the parent properties are available. Kind of weird. But if I turn the devtools off or keep it headless, the issue is still there.

This will give me the expected output:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ devtools: true });
  const page = await browser.newPage();
  await page.setRequestInterception(true);

  page.on('request', async request => {
    const initiator = await request.initiator();
    console.dir(initiator);
    request.continue();
  });

  await page.goto('https://ansa.it', {waitUntil: "load"});

  await new Promise(resolve => setTimeout(resolve, 7000));

  await browser.close();
})();

cyrus-and commented 1 year ago

Oh, I apologize, I completely overlooked the parent part; I thought you were referring to the lack of callFrames. Looking into it...

cyrus-and commented 1 year ago

OK, so you can get all of that in headless mode too. It works under the conditions above because opening DevTools implicitly sends some debugger commands, including the ones marked FIX in the script below, which are needed to enable this feature. The script reproduces the issue and, with those commands, fixes it.

const CDP = require('chrome-remote-interface');

(async () => {
    const client = await CDP();
    const {Debugger, Network, Page} = client;

    Network.requestWillBeSent(({initiator}) => {
        const {type, stack} = initiator;
        if (type === 'script' && stack && stack.parent) {
            console.log(initiator);
        }
    });

    await Network.enable();
    await Page.enable();

    // FIX /////////////////////////////////////////////////////
    await Debugger.enable();
    await Debugger.setAsyncCallStackDepth({
        maxDepth: 32 // value from DevTools
    });
    ////////////////////////////////////////////////////////////

    await Page.navigate({url: 'https://ansa.it'});
    await Page.loadEventFired();
    await client.close();
})();

I'm not sure whether I should add a flag to the command-line utility, but you can certainly do this yourself by using this module as a library with hooks:

const CHC = require('chrome-har-capturer');

CHC.run([
    'https://ansa.it'
], {
    preHook: async (url, client) => {
        const {Debugger} = client;
        await Debugger.enable();
        await Debugger.setAsyncCallStackDepth({
            maxDepth: 32 // value from DevTools
        });
    }
}).on('har', (har) => {
    const json = JSON.stringify(har, null, 4);
    process.stdout.write(`${json}\n`);
});

taavikalvi commented 1 year ago

Thanks a lot. It worked.