sitespeedio / chrome-har

Create HAR files from Chrome Debugging Protocol data
MIT License
149 stars 50 forks

Base page is replaced by the first resource loaded in HAR #15

Closed vroumvroum closed 6 years ago

vroumvroum commented 6 years ago

The example below is based on code found here: https://michaljanaszek.com/blog/generate-har-with-puppeteer

The pages section of the HAR shows the first resource loaded instead of the base URL. Example with en.wikipedia.org

Consequently, I can't find the initial page or resource in the HAR.

chrome-har: 0.3.1, puppeteer: 1.2.0 (with Chromium 67.0.3372.0)

soulgalore commented 6 years ago

Hey @vroumvroum, hmm, I saw the same thing in Chrome 66 and fixed it here: https://github.com/sitespeedio/chrome-har/issues/12

I wonder if it has changed in 67 too? Can you provide the full trace log in a gist? Then I can have a look and see what's going on.

Best, Peter

vroumvroum commented 6 years ago

Hey @soulgalore, is this what you are expecting: https://gist.github.com/vroumvroum/cda71d0c4e9efb22ced5eb4d5917b0e2 ?

soulgalore commented 6 years ago

@vroumvroum yep almost, any way you could skip the puppeteer log and get the raw one? We've got a couple of fixes coming in the next few days, but the more logs that break, the better, so we are sure we fix it.

vroumvroum commented 6 years ago

Gladly @soulgalore, but where can I find it? (I have only been using puppeteer since yesterday ...)

On Fri, Apr 6, 2018, 12:46, Peter Hedenskog notifications@github.com wrote:

@vroumvroum yep almost, any way you could skip the puppeteer log and get the raw one? We've got a couple of fixes coming in the next few days, but the more logs that break, the better, so we are sure we fix it.

soulgalore commented 6 years ago

Ah, I see. The example at https://michaljanaszek.com/blog/generate-har-with-puppeteer does this:

  // convert events to HAR file
  const har = harFromMessages(events);
  await promisify(fs.writeFile)('en.wikipedia.org.har', JSON.stringify(har));

Skip the conversion and store the raw events to disk, something like:

  await promisify(fs.writeFile)('en.wikipedia.org.json', JSON.stringify(events));
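
For readers who want the whole capture step in one place, here is a minimal sketch of collecting events in arrival order and writing them to disk unconverted. The helper names `recordEvents` and `saveRawEvents` are hypothetical and not part of chrome-har or Puppeteer. It assumes `client` is any object with an `.on(name, handler)` method; with Puppeteer that would presumably be a CDP session from `page.target().createCDPSession()`, but a plain `EventEmitter` stands in here so the snippet runs on its own.

```javascript
const { EventEmitter } = require('events');
const fs = require('fs');
const { promisify } = require('util');

// Hypothetical helper: push every listed event into one array, in the order
// the handlers fire, using the { method, params } shape from the example.
function recordEvents(client, methods) {
  const events = [];
  for (const method of methods) {
    client.on(method, params => events.push({ method, params }));
  }
  return events;
}

// Hypothetical helper: store the raw events as JSON, skipping HAR conversion.
async function saveRawEvents(events, file) {
  await promisify(fs.writeFile)(file, JSON.stringify(events, null, 2));
}

// Stand-in demo: an EventEmitter plays the role of the CDP session.
const fakeClient = new EventEmitter();
const events = recordEvents(fakeClient, [
  'Page.frameStartedLoading',
  'Network.requestWillBeSent'
]);
fakeClient.emit('Page.frameStartedLoading', { frameId: 'A' });
fakeClient.emit('Network.requestWillBeSent', { requestId: '1' });
console.log(events.length); // 2
```

In a real run, `await saveRawEvents(events, 'en.wikipedia.org.json')` would be called after the page has finished loading.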

tobli commented 6 years ago

It would also be helpful if you could enable debug logging in chrome-har. Just set the environment variable DEBUG=chrome-har before running.

suever commented 6 years ago

It's worth noting that the blog post linked by the OP doesn't include the Page.frameScheduledNavigation event, which is handled as of #12.

vroumvroum commented 6 years ago

@soulgalore: here is the raw output of the events: https://gist.github.com/vroumvroum/3106072293a061040d498839e212dc42

@tobli: I tried to generate the HAR with DEBUG=chrome-har (the output was empty).

@suever: I did try adding Page.frameScheduledNavigation to the list of events beforehand, and it changed nothing ...

vroumvroum commented 6 years ago

@soulgalore: if it is of any help, here are the raw events with Page.frameScheduledNavigation added to the observer list: https://gist.github.com/vroumvroum/fcb9200953e499e988819cbec3ee226d

tobli commented 6 years ago

Hi @vroumvroum! I think your method of extracting Chrome DevTools Protocol events via Puppeteer isn't 100% reliable at the moment. Events seem to be written out of order, which breaks chrome-har. Comparing with the perflog dump that Browsertime generates (see https://gist.github.com/tobli/3d76c24911a6ab073fe5e6916d173c7b), you can see that in your example Network events (such as Network.requestWillBeSent) are printed before Page events (such as Page.frameStartedLoading).
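
One rough way to check a raw event dump for the ordering described above is sketched here. This is only a heuristic based on the observation in this comment, not the check chrome-har itself performs, and the function name `networkEventsBeforeFirstPageLoad` is hypothetical. It assumes the dump is an array of `{ method, params }` objects; the sample data is made up.

```javascript
// Heuristic sketch (not part of chrome-har): list the Network.* events that
// were recorded before the first Page.frameStartedLoading event.
function networkEventsBeforeFirstPageLoad(events) {
  const firstPage = events.findIndex(e => e.method === 'Page.frameStartedLoading');
  // If no Page.frameStartedLoading exists, every Network event counts as early.
  const end = firstPage === -1 ? events.length : firstPage;
  return events.slice(0, end).filter(e => e.method.startsWith('Network.'));
}

// Made-up sample in the { method, params } shape.
const sample = [
  { method: 'Network.requestWillBeSent', params: { requestId: '1' } },
  { method: 'Page.frameStartedLoading', params: { frameId: 'A' } },
  { method: 'Network.responseReceived', params: { requestId: '1' } }
];

const early = networkEventsBeforeFirstPageLoad(sample);
console.log(early.map(e => e.method)); // [ 'Network.requestWillBeSent' ]
```

A non-empty result suggests the dump has the same ordering as the one in this issue; an empty result does not by itself prove the dump is usable.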

It would likely be possible for the HAR building algorithm to work independently of event order, but it would require some non-trivial restructuring. If you'd like to dig into that we're definitely open to PRs. I'm closing this issue now, but you could open a new issue for discussing approaches to making the algorithm more "Puppeteer friendly". =)

vroumvroum commented 6 years ago

Hi @tobli, @soulgalore: thanks for your help! I will inform the author of this code. For your information, the puppeteer-har module (https://www.npmjs.com/package/puppeteer-har) is also based on the same code.