webmetrics / browsermob-proxy

NOTICE: this project has been forked and is being maintained at https://github.com/lightbody/browsermob-proxy
Apache License 2.0

Query string is URL decoded twice #81

Closed Hellspam closed 11 years ago

Hellspam commented 11 years ago

Hi, I've been using BrowserMob Proxy when running Selenium tests. One thing I've been testing is that our client-side JS sends HTTP GET requests with JSON encoded inside the query string. The problem is that BrowserMob Proxy URL-decodes the query string twice, which causes errors in testing. For example, when running a GET with a query string like this:

?data="{"b", "http://site/d?a&b"}"

Then getting the HarNameValuePair with name = "data" will return this string:

"{"b", "http://site/d?a

I'm pretty sure the issue is that the string is URL-decoded twice, which causes the & inside the JSON to be treated as a separator that starts another query param.

It is easily fixable (or at least was easily fixable for me, not sure if it could cause any more problems) by changing line 503 in BrowserMobHttpClient.java from:

String query = method.getURI().getQuery();

To:

String query = method.getURI().getRawQuery();
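
To illustrate the difference between the two calls, here is a minimal sketch. It assumes that `method.getURI()` returns a `java.net.URI` (the issue does not say so explicitly); the class name `QueryDecodeDemo` and the sample URL are made up for this example.

```java
import java.net.URI;

public class QueryDecodeDemo {
    // getQuery() returns the query with percent-escapes already decoded,
    // so an escaped "&" becomes indistinguishable from a real separator.
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();
    }

    // getRawQuery() returns the query exactly as written in the URL,
    // leaving decoding to whoever splits it into parameters.
    static String rawQuery(String url) {
        return URI.create(url).getRawQuery();
    }

    public static void main(String[] args) {
        String url = "http://site/d?a=1%262"; // hypothetical URL
        System.out.println(decodedQuery(url)); // a=1&2
        System.out.println(rawQuery(url));     // a=1%262
    }
}
```

If the code that consumes `query` splits on `&` and then decodes each part, passing it the already decoded string is what produces the second decode and the bogus extra parameter.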

lightbody commented 11 years ago

Thanks for the bug report. Your change may very well be the right change, but I'm not 100% certain the test case/example you provided is the one I should be basing this around. Isn't the GET string invalid?

?data="{"b", "http://site/d?a&b"}"

For it to be a valid URL, does it have to be something like:

?data=%22{%22b%22,+%22http://site/d%3Fa%26b%22}%22

That is, the space, ampersand, and quotes are not really valid characters in a query parameter value in the first place?
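
As a rough sketch of how such a value could be encoded before being sent, the snippet below uses the standard `java.net.URLEncoder`. The class name `QueryValueEncoder` is hypothetical, and `URLEncoder` applies form encoding, so it escapes more characters (braces, colons, slashes) than the hand-written example above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryValueEncoder {
    // Form-encodes a single parameter value (space becomes "+", "&" becomes "%26").
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Reverses encode(); used here only to show the round trip.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String json = "\"{\"b\", \"http://site/d?a&b\"}\"";
        String query = "data=" + encode(json);
        System.out.println(query);                      // no literal & or quotes remain
        System.out.println(decode(encode(json)).equals(json)); // true
    }
}
```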

Hellspam commented 11 years ago

Hi, yes of course it should be URL encoded. I just copied something from one of our test files before it is passed to selenium.

Hellspam commented 11 years ago

An even simpler test case would be

?a=1%262

Which is actually

?a=1&2

Which should appear in the HAR log as

"a" = "1&2"

And not

"a" = "1", "2" = ""

lightbody commented 11 years ago

Would you at all be interested in building a test case? It's relatively easy to copy the existing tests here:

https://github.com/webmetrics/browsermob-proxy/blob/master/src/test/java/org/browsermob/proxy/MailingListIssuesTest.java
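
For reference, the behaviour the `?a=1%262` case expects can be sketched independently of the proxy. This is not the project's test harness or its parsing code; `RawQueryParser` is a hypothetical helper that splits the raw query on `&` first and decodes each name and value afterwards. It keeps only the last value for a repeated name, which is a simplification.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class RawQueryParser {
    // Splits on "&" BEFORE decoding, so an escaped "%26" stays inside its value.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("a=1%262")); // {a=1&2}  (raw query: one parameter)
        System.out.println(parse("a=1&2"));   // {a=1, 2=}  (pre-decoded query: two parameters)
    }
}
```

The second call shows what happens when the parser is handed an already decoded query string, which matches the wrong result reported in this issue.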

On Jan 23, 2013, at 8:01 AM, Hellspam wrote:

An even simpler test case would be ?a=1%262, which is actually ?a=1&2, which should decode to "a" = 1&2 and not "a" = 1, 2 = "".

Hellspam commented 11 years ago

Sorry, I don't get why the diff is so big.

OliaChe commented 9 years ago

Hi, I've been using BrowserMob Proxy to run Selenium test scripts and am facing the same issue. For example (actual vs. expected): "user profile|my account|payments" != "user profile|my account|payments & credits" (please see the attached screenshot). The har.getLog().getEntries().get(i) method returns only the part of the response before the ampersand and cuts off the rest.

Could you please suggest a way to work around this limitation?

Environment: Chrome version 39.0, BrowserMob Proxy version 2.0 beta 6, Selenium WebDriver version 2.43

This is a screenshot of the DigitalPulse Debugger showing what is actually in the page: http://c2n.me/38tjbOo