cyrus-and / chrome-har-capturer

Capture HAR files from a Chrome instance
MIT License

Allow for using already initialized CDP instance #78

Closed themaxdavitt closed 4 years ago

themaxdavitt commented 4 years ago

I think it might be handy to have an option in run (or maybe a separate function) allowing consumers to provide an already initialized instance of CDP (from chrome-remote-interface) to create the HAR from.

Imagine being able to write something like this:

const CDP = require('chrome-remote-interface');
const CHC = require('chrome-har-capturer');

const tab = await CDP({ target: ... });

await tab.Network.enable();
await tab.Page.enable();

tab.on('Network.requestWillBeSent', console.log);
CHC.run({ tab, content: true }).on('har', console.log);

await tab.Page.navigate({ url: 'https://github.com' });

I could just record events from the CDP instance and pass them to fromLog, but it's not that easy (you included a note about this in the README), you're already doing it, and depending on how we implement this, a majority of the work could just be adding an if statement around this.
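
For context, the manual route mentioned above could look roughly like the sketch below. It assumes a chrome-remote-interface style client that emits a catch-all 'event' carrying { method, params }; recordEvents is a hypothetical helper, not part of either package, and the README's caveats about what the log must contain still apply.

```javascript
// Sketch: collect raw CDP events into an array of { method, params } entries,
// the shape that fromLog() is documented to consume.
// `client` is assumed to emit a catch-all 'event', as chrome-remote-interface does.
function recordEvents(client) {
    const log = [];
    const onEvent = ({ method, params }) => {
        // Keep only the fields the log needs; drop anything else (e.g. sessionId).
        log.push({ method, params });
    };
    client.on('event', onEvent);
    return {
        log,
        // Stop recording; entries collected so far remain in `log`.
        stop: () => client.removeListener('event', onEvent),
    };
}

// Hypothetical usage (needs a running Chrome, so it is left commented out):
//
// const CDP = require('chrome-remote-interface');
// const CHC = require('chrome-har-capturer');
// const client = await CDP();
// const recorder = recordEvents(client);
// await client.Network.enable();
// await client.Page.enable();
// /* ...drive the page... */
// recorder.stop();
// const har = await CHC.fromLog('https://github.com', recorder.log);
```

Even with a helper like this, the caller still has to know which events to enable and which URL the log belongs to, which is the part that is "not that easy".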

I'd be happy to write up the PR if you're interested.

cyrus-and commented 4 years ago

Why not use hooks?

const CHC = require('chrome-har-capturer');

CHC.run(['https://github.com'], {
    content: true,
    preHook: (url, tab) => {
        tab.Network.enable();
        tab.on('Network.requestWillBeSent', console.log);
    },
    postHook: (url, tab) => {
        /*..*/
    }
}).on('har', console.log);

themaxdavitt commented 4 years ago

In hindsight I didn't write a very good example - hooks would work pretty well for that. :)

Essentially, I wish it would let me specify the target to capture the HAR from. I was trying to demonstrate a use case where we had already created a CDP instance separately and later pass it to run, instead of run creating it for us.

Instead, imagine if I had filled out a form on a page and wanted to use this package to capture the response I received after submitting it. What should I do?

cyrus-and commented 4 years ago

Sorry for the incredibly long delay here...

> Instead, imagine if I had filled out a form on a page and wanted to use this package to capture the response I received after submitting it. What should I do?

Anyway, passing an initialised instance seems easy, but the problem is that chrome-har-capturer will run Page.navigate on it to load the URL and then wait for some events (e.g., Page.loadEventFired) to understand when loading has finished. How should I handle this in this scenario?
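
To make the sequence concrete, here is a rough sketch of "navigate, then wait for the load event" written directly against a chrome-remote-interface style client. It is not the library's actual implementation; navigateAndWait is a hypothetical name, and only Page.loadEventFired is waited on here.

```javascript
// Sketch of the load sequence described above.
// `client` is assumed to expose the chrome-remote-interface Page domain.
async function navigateAndWait(client, url) {
    const { Page } = client;
    await Page.enable();
    // Register for the event before navigating so it cannot be missed.
    // Called without a callback, the event method returns a Promise
    // that resolves with the next occurrence of the event.
    const loaded = Page.loadEventFired();
    await Page.navigate({ url });
    return loaded;
}
```

With a page load triggered by user interaction there is no Page.navigate call to anchor this sequence on, which is the difficulty raised here.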

Moreover, the concept of a page kind of loses its meaning here: you end up capturing the HAR for a different page than the one initially passed to the run() method...

I'm afraid this would require a major refactoring to reuse the HAR generation part with page loads that come from user interaction, like submitting a form or clicking a button. The main hindrance, I think, is that the Stats object strictly maps to the loading of one main URL. In fact, even in the fromLog case you have to specify the URL manually.

I think one solution would be to pass POST data along with the page load so that you can specify the form parameters, but AFAIK that is not possible using the protocol alone.

(Note that if the form executes a GET request you have no problems and you can already do it.)
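
A minimal sketch of the GET case, assuming the form's action URL and field names are known up front. buildGetFormUrl is a hypothetical helper, and the example.com URL and field names are placeholders.

```javascript
// Sketch: encode form fields into the query string, as a GET form
// submission would, so the resulting URL can be loaded directly.
function buildGetFormUrl(action, fields) {
    const url = new URL(action);
    for (const [name, value] of Object.entries(fields)) {
        url.searchParams.append(name, value);
    }
    return url.toString();
}

// Hypothetical usage (needs a running Chrome, so it is left commented out):
//
// const CHC = require('chrome-har-capturer');
// const url = buildGetFormUrl('https://example.com/search', { q: 'har capture' });
// CHC.run([url], { content: true }).on('har', console.log);
```

The resulting URL is an ordinary page load, so it fits the existing one-URL-per-capture model without any changes.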

themaxdavitt commented 4 years ago

Replying late is always better than not replying, don't beat yourself up over it. :)

I haven't worked on the project that would've used this in a little while, and you make some really good points - in hindsight, this would definitely be outside the scope of what you aimed to create. I'll close the issue - I appreciate you taking time to consider it!