sitespeedio / chrome-har

Create HAR files from Chrome Debugging Protocol data
MIT License
148 stars 50 forks source link

Capture redirect before page #89

Closed mikedijkstra closed 2 years ago

mikedijkstra commented 2 years ago

This PR updates the case where a page is first added to populate any redirects that occurred before the page is created.

This will allow for a more accurate measurement of TTFB: (source: Stackoverflow answer)

TTFB measures the duration from the user or client making an HTTP request to the first byte of the page being received by the client's browser.

And also make the HAR more closely aligned with that generated by Chrome.

Here's a comparison of navigating to "https://google.com" (which redirects to "https://www.google.com"):

chrome-har HAR before change

before

Chrome HAR

chrome

chrome-har HAR after change

after
mikedijkstra commented 2 years ago

The google DevTools logs actually fail on the main branch and on this one. So without getting side-tracked as to why, I changed to use Vercel which has a redirect from www.vercel.com to vercel.com.

Main branch

Screen Shot 2021-10-20 at 12 54 10 pm

PR branch

Screen Shot 2021-10-20 at 12 54 03 pm
soulgalore commented 2 years ago

Hi @mikedijkstra thank you for the PR! I'll have a look later today.

soulgalore commented 2 years ago

Hmm, can you share a trace log where it doesn't work for you with the current setup? Do I understand correctly that you miss the actual redirect in the HAR (but you have it in the trace)?

When I generate a trace with Browsertime it works fine for me:

Screenshot 2021-10-20 at 10 01 04 Screenshot 2021-10-20 at 10 03 55
soulgalore commented 2 years ago

Attaching what the log looks like for me: chromePerflog-1.json.gz

mikedijkstra commented 2 years ago

@soulgalore Interesting!

You can see the log I generated for this PR here.

My logs don't start with Page.frameScheduledNavigation (which seems to be deprecated) and that's why I had to look at populating the redirect data from events that were present before a page is set.

soulgalore commented 2 years ago

Ah sorry I missed the JSON. And good catch that the Page.frameScheduledNavigation is deprecated, I'm gonna change that to Page.frameRequestedNavigation that seems to be the "new" version. They are both sent at the moment.

Is there a way you can start your trace earlier so you get the those event too? Thinking maybe the trace start to late?

mikedijkstra commented 2 years ago

@soulgalore The DevTools logs I'm pulling come from Lighthouse. I've had a dig through how it works and the simplified flow is: Start the browser, load about:blank, start recording the trace, fire Page.navigate with the URL.

This results in Network.requestWillBeSent being recorded first and then Page.frameStartedLoading is only recorded once a response has come back (as you can see in the DevTools log I linked previously).

I can "trick" it into working by starting the recording before about:blank is loaded as this command gives the Page.frameStartedLoading and Page.frameNavigated events before the Network.requestWillBeSent, but this isn't a great solution.

Do you know the command that is called in the Browsertime implementation? I'd be interested to know why I don't see the Page.frameRequestedNavigation event …

soulgalore commented 2 years ago

@mikedijkstra When I did a quick check it looked like we turn on the tracing just before we start the navigation (but we navigate with JavaScript not the navigate CPD command) and not when we access our internal start page, but could be that did read that wrong, let me have a deeper look tomorrow.

soulgalore commented 2 years ago

Lets merge this, to me it doesn't really matter but just curious why we have different logs.