webmetrics / browsermob-proxy

NOTICE: this project has been forked and is being maintained at https://github.com/lightbody/browsermob-proxy
https://github.com/lightbody/browsermob-proxy
Apache License 2.0
234 stars 773 forks source link

Capture cookies to HAR #5

Closed gomezd closed 13 years ago

gomezd commented 13 years ago

Capture cookies to HAR log per request Catch UserAgent parse errors and warn about them

lightbody commented 13 years ago

Not 100% sure about this changeset: you are setting up the CookieStore, but check out BrowserMobHttpClient.prepareForBrowser(), which also sets up a cookie. It is invoked by ProxyServer:

https://github.com/gomezd/browsermob-proxy/blob/master/src/main/java/org/browsermob/proxy/ProxyServer.java#L49

Isn't that code in conflict with this patch? Is it possible to change prepareForBrowser() to use something other than a BlankCookieStore (like a BasicCookieStore) but do it here:

https://github.com/gomezd/browsermob-proxy/blob/master/src/main/java/org/browsermob/proxy/http/BrowserMobHttpClient.java#L869

Let me know what you think. Thanks!

gomezd commented 13 years ago

I think if you do it in prepareForBrowser you get one cookie store per proxy server so you get the cookies for the complete session, but the way i did it, attaching it to the http context allows us to get cookies per request, kindof like firebug does. So in that sense i don't think it conflicts with prepareForBrowser code, which is actually not doing anything (?), it could be removed maybe? What do you think?

lightbody commented 13 years ago

Haha, yeah it's quite possible that code isn't doing anything. Some of the code you'll find is there because it was useful inside of BrowserMob (the commercial product) but doesn't necessarily make sense in an open source proxy. For example, the BrowserMobHttpClient that powers the proxy is the same HTTP client we use for doing basic HTTP load testing.

The goal of prepareForBrowser() was to basically tell HttpClient not to bother with cookies. If you give it a normal cookie store, I was finding that it would sometimes reject cookies it thought were malformed but that a browser might accept. So I'd just make sure that the code you have works with potentially malformed cookies. It's OK if we can't capture every cookie in HAR, but we should definitely can't drop data as we proxy the headers back to the browser.

I think we're OK, but can you just dig in to that area a bit more? If you think the prepareForBrowser() stuff (at least the cookie stuff) can be deleted, feel free to do that too in your pull request.

Last thought: is it possible we don't use HttpClient's cookie store at all and instead just simply parse the HTTP response headers?

gomezd commented 13 years ago

Ok, will dig into that, thanks!

gomezd commented 13 years ago

Hello, i just made a change as you suggested, removing the cookie store and adding a very simple cookie header parser. What i'm not sure about is where should this go? I put it under org.browsermob.proxy.http, but maybe could be under util? Would you mind taking a look and let me know what you think?

SHA: 75b1d2f65cd7cfde3db6bdc0c7b247c71e8e91d8

Thanks

lightbody commented 13 years ago

Looks good. I'm wondering if there might be some code we can borrow from HTTPClient that can do the Cookie parsing for you and handle some of the weird formatting some web servers send out. But this is great for now and we can look in to that if we get complaints about certain cookies not being captured.