cyrus-and / chrome-har-capturer

Capture HAR files from a Chrome instance
MIT License
530 stars 90 forks source link

is it possible to collect metrics using PerformanceObserver? #93

Closed itsrun closed 2 years ago

itsrun commented 2 years ago

hi, I'm new to devtool protocol and I was wondering if it is possible to log some important metrics such as largest-contentful-paint with PerformanceObserver APIs using chrome-har-capturer or chrome-remote-interface? Appreciate any help or suggestion, thanks.

cyrus-and commented 2 years ago

You should be able to add custom metrics to the HAR using hooks (specifically postHook). You can find the documentation in the README and an example usage in the bundled tool:

https://github.com/cyrus-and/chrome-har-capturer/blob/f7b983a01e0884eda3a527d8f507103538937447/bin/cli.js#L93-L116

In a nutshell:

  1. optionally define a preHook to set up things;

  2. define a postHook to fetch the actual metrics and return some value;

  3. that value will be placed in the _user key of the the page object.

As for the performance metrics themselves, you can either fetch them from the PerformanceObserver API from JavaScript using Runtime.evaluate, or using the Performance domain of the protocol. I'm not sure if there is a certain overlap between the two.

See for example:

``` >>> Performance.getMetrics() { metrics: [ { name: 'Timestamp', value: 222831.772791 }, { name: 'AudioHandlers', value: 0 }, { name: 'Documents', value: 2 }, { name: 'Frames', value: 2 }, { name: 'JSEventListeners', value: 0 }, { name: 'LayoutObjects', value: 12 }, { name: 'MediaKeySessions', value: 0 }, { name: 'MediaKeys', value: 0 }, { name: 'Nodes', value: 39 }, { name: 'Resources', value: 1 }, { name: 'ContextLifecycleStateObservers', value: 2 }, { name: 'V8PerContextDatas', value: 1 }, { name: 'WorkerGlobalScopes', value: 0 }, { name: 'UACSSResources', value: 0 }, { name: 'RTCPeerConnections', value: 0 }, { name: 'ResourceFetchers', value: 2 }, { name: 'AdSubframes', value: 0 }, { name: 'DetachedScriptStates', value: 0 }, { name: 'ArrayBufferContents', value: 0 }, { name: 'LayoutCount', value: 7 }, { name: 'RecalcStyleCount', value: 7 }, { name: 'LayoutDuration', value: 0.032676 }, { name: 'RecalcStyleDuration', value: 0.004129 }, { name: 'DevToolsCommandDuration', value: 0.000042 }, { name: 'ScriptDuration', value: 0 }, { name: 'V8CompileDuration', value: 0 }, { name: 'TaskDuration', value: 0.054151 }, { name: 'TaskOtherDuration', value: 0.017304 }, { name: 'ThreadTime', value: 0.051059 }, { name: 'ProcessTime', value: 0.174117 }, { name: 'JSHeapUsedSize', value: 1017424 }, { name: 'JSHeapTotalSize', value: 2080768 }, { name: 'FirstMeaningfulPaint', value: 0 }, { name: 'DomContentLoaded', value: 222803.850626 }, { name: 'NavigationStart', value: 222803.361902 } ] } ```

Hope it helps!

itsrun commented 2 years ago

Thanks for your suggestion!

As for the performance metrics themselves, you can either fetch them from the PerformanceObserver API from JavaScript using Runtime.evaluate, or using the Performance domain of the protocol. I'm not sure if there is a certain overlap between the two.

However I still can't get it to work. Here's what I've tried:

  1. Fetch LCP using PerformanceTimeline domain. I tried client.PerformanceTimeline but it is undefined. is it because chrome-remote-interface doesn't support this domain yet?
  2. Using Runtime.evaluate. I don't know if I've done it right:
    
    const getObserver = (type, callback) => {
    const perfObserver = new PerformanceObserver((entryList) => {
        callback(entryList.getEntries())
    });
    perfObserver.observe({type, buffered: true});
    }

const getPaintTime = () => { getObserver('paint', entries => { entries.forEach(entry => { testdata[entry.name] = entry.startTime; }) }) return testdata; }

function generatePostHook(userMetric) { return async (url, client) => { await client.Runtime.evaluate({expression: "let testdata = {}"}); await client.Runtime.evaluate({ expression: const getObserver = ${getObserver.toString()}, }); let {result, exceptionDetails} = await client.Runtime.evaluate({ expression: (${getPaintTime.toString()})(), returnByValue: true, awaitPromise: true }); console.log(result.value); .....


The result (testdata) is empty.
cyrus-and commented 2 years ago

I didn't mean to modify the bundled tool, I mean, you sure can, but this module is also a library, I suggest to implement your own solution. Or, in case of Runtime.evaluate, there is also the --userMetric option that can be used to inject some simple JavaScript code. Anyway, whatever works for you! :)

1. Fetch LCP using PerformanceTimeline domain. I tried client.PerformanceTimeline but it is undefined. is it because chrome-remote-interface doesn't support this domain yet?

Yes, so basically chrome-har-capturer never fetches the protocol from Chrome and uses the one bundled in chrome-remote-interface, which was quite old actually. I've just updated it, so please fetch the latest version of chrome-har-capturer (v0.13.10).

Here's what works for me:

timeline.js

const CHC = require('chrome-har-capturer');

async function postHook(url, client) {
    try {
        await client.PerformanceTimeline.enable({
            eventTypes: ['largest-contentful-paint']
        });
        const {event} = await client.PerformanceTimeline.timelineEventAdded();
        return event;
    } catch (err) {
        console.error(err);
        return undefined;
    }
}

CHC.run(['https://example.com'], {
    postHook
}).on('fail', (url, err) => {
    console.error(url, err);
}).on('har', (har) => {
    console.log(JSON.stringify(har, null, 4));
});

Then run it as:

$ node timeline.js -t 127.0.0.1 | jq '.log.pages[]._user'
{
  "frameId": "5F799F6930049AE0F22C5B6726A667D4",
  "type": "largest-contentful-paint",
  "name": "",
  "time": 1644063233.0232,
  "lcpDetails": {
    "renderTime": 1644063233.0232,
    "loadTime": 0,
    "size": 20160,
    "nodeId": 3
  }
}

2. Using Runtime.evaluate. I don't know if I've done it right:

The metrics are populated asynchronously but you immediately return the testdata value which is still empty, a timeout might do, but the proper way would be to wait for the callback to be over. For example:

observer.js:

function observe(type) {
    return new Promise((fulfill) => {
        const observer = new PerformanceObserver((list) => {
            const metrics = {};
            for (const {name, startTime} of list.getEntries()) {
                metrics[name] = startTime;
            }
            fulfill(metrics);
        });
        observer.observe({type, buffered: true});
    });
}

observe('paint');

Then run it as:

$ chrome-har-capturer --userMetric "$(<observer.js)" https://example.com | jq '.log.pages[]._user'
- https://example.com/ ✓
{
  "first-paint": 620.0999999940395,
  "first-contentful-paint": 620.0999999940395
}

But this is not the Largest Contentful Paint, you need to alter the code in observer.js in order to fetch that. Something like:

observer-lcp.js

function observe(type) {
    return new Promise((fulfill) => {
        const observer = new PerformanceObserver((list) => {
            const metrics = list.getEntries().map((entry) => {
                return JSON.parse(JSON.stringify(entry));
            });
            fulfill(metrics);
        });
        observer.observe({type, buffered: true});
    });
}

observe('largest-contentful-paint');

(Note the JSON nonsense, I have no idea why it is needed...)

Then run it as:

$ chrome-har-capturer --userMetric "$(<observer-lcp.js)" https://example.com | jq '.log.pages[]._user'
- https://example.com/ ✓
[
  {
    "name": "",
    "entryType": "largest-contentful-paint",
    "startTime": 502.699,
    "duration": 0,
    "size": 20160,
    "renderTime": 502.699,
    "loadTime": 0,
    "firstAnimatedFrameTime": 0,
    "id": "",
    "url": ""
  }
]

Woah, that was quite a ride... hope it helps!

itsrun commented 2 years ago

it helps A LOT! Thank you so much for such detailed answer (or tutorial actually lol)