Semi-colon separators are tossed aside within query strings

gabrielfalcao / HTTPretty

Intercept HTTP requests at the Python socket level. Fakes the whole socket module

MIT License


Under the hood, httpretty is using urlparse.parse_qs to parse query strings.

This poses a problem when a semi-colon is used as a delimiter inside a parameter value, because everything after the semi-colon is dropped:

In [1]: import urlparse

In [2]: urlparse.parse_qs("tagged=python;ruby&site=stackoverflow")
Out[2]: {'site': ['stackoverflow'], 'tagged': ['python']}

This happens because urlparse hardcodes ';' as a pair separator alongside '&'. That follows a W3C recommendation, but it is not how most web servers treat query strings.

If the semi-colon is percent-encoded as %3B, however, parse_qs keeps the value intact.

In [11]: urlparse.parse_qs("tagged=python%3Bruby&site=stackoverflow")
Out[11]: {'site': ['stackoverflow'], 'tagged': ['python;ruby']}

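In practice, though, this does not help inside httpretty: the query string goes through unquote_utf8 before it is parsed, so %3B is decoded back to ';' and the value is split anyway.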
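The order of operations is what matters here. The sketch below is a simplified stand-in for the old parse_qs behaviour, not httpretty's or the standard library's actual code. The name legacy_parse_qs is hypothetical; the only behaviour it assumes is what the sessions above show: pairs are split on both '&' and ';', and a fragment without '=' is dropped.

```python
import re

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def legacy_parse_qs(query):
    """Hypothetical stand-in: split pairs on '&' and ';', then decode each part."""
    result = {}
    for pair in re.split(r"[&;]", query):
        name, sep, value = pair.partition("=")
        if not sep or not value:
            # Fragments such as "ruby" (no '=') are silently discarded.
            continue
        name = unquote(name.replace("+", " "))
        value = unquote(value.replace("+", " "))
        result.setdefault(name, []).append(value)
    return result


raw = "tagged=python%3Bruby&site=stackoverflow"

# Split first, decode afterwards: the encoded semi-colon survives.
parsed_raw = legacy_parse_qs(raw)

# Decode first, split afterwards: "ruby" becomes a separate fragment and is lost.
parsed_decoded = legacy_parse_qs(unquote(raw))

print(parsed_raw)
print(parsed_decoded)
```

The second call mirrors the decode-then-parse order described above, which is why percent-encoding the semi-colon on the client side is not enough.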

This all stems from trying to mock and test a module that is using the StackExchange API, which uses semi-colons as separators.

Test case using httpretty:

import requests
import httpretty

httpretty.enable()

httpretty.register_uri(httpretty.GET, "https://api.stackexchange.com/2.1/search", body='{"items":[]}')

resp = requests.get("https://api.stackexchange.com/2.1/search",
                    params={"tagged":"python;ruby",
                            "site": "stackoverflow"})

httpretty_request = httpretty.last_request()
print(httpretty_request.querystring)

httpretty.disable()
httpretty.reset()

Relevant issue on Python's own issue tracker.

#
# To get around how parse_qs works (urlparse, under the hood of
# httpretty), we'll leave the semi colon quoted.
#
# See https://github.com/gabrielfalcao/HTTPretty/issues/134
orig_unquote = httpretty.core.unquote_utf8
httpretty.core.unquote_utf8 = (lambda x: x)

# It should handle tags as a list
httpretty.register_uri(httpretty.GET,
                       "https://api.stackexchange.com/2.1/search",
                       body=param_check_callback({'tagged': 'python;dog'}))
search_questions(since=since, tags=["python", "dog"], site="pets")

...

# Back to normal for the rest
httpretty.core.unquote_utf8 = orig_unquote

# Test the test by making sure this is back to normal
assert httpretty.core.unquote_utf8("%3B") == ";"
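The save, patch and restore steps in that workaround could be wrapped in a context manager so the original function comes back even when an assertion fails partway through. This is only a sketch: temporarily_replace is a hypothetical helper, and fake_core is a stand-in object used here so the example runs without httpretty installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporarily_replace(obj, attr_name, replacement):
    """Hypothetical helper: swap obj.<attr_name> for the duration of a with-block."""
    original = getattr(obj, attr_name)
    setattr(obj, attr_name, replacement)
    try:
        yield original
    finally:
        # Always restore, even if the body of the with-block raises.
        setattr(obj, attr_name, original)


# Stand-in for httpretty.core; its unquote_utf8 only decodes %3B for this demo.
fake_core = SimpleNamespace(unquote_utf8=lambda s: s.replace("%3B", ";"))

with temporarily_replace(fake_core, "unquote_utf8", lambda x: x):
    inside = fake_core.unquote_utf8("%3B")  # identity function is active here

after = fake_core.unquote_utf8("%3B")  # original function is back

print(inside, after)
```

In a real test the first argument would presumably be httpretty.core rather than fake_core. The standard library's unittest.mock.patch.object offers the same save-and-restore behaviour if an extra helper is not wanted.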
