aio-libs / yarl

Yet another URL library
https://yarl.aio-libs.org
Apache License 2.0
1.34k stars 168 forks source link

Comma not being encoded in GET request params #210

Open gusutabopb opened 6 years ago

gusutabopb commented 6 years ago

I am trying use aiohttp to send a GET request to the following URL: https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD%2CUSD_JPY. However, I keep getting errors saying my request has a malformed query string:

url = "https://api-fxtrade.oanda.com/v1/prices"
params = {'instruments': 'EUR_USD,USD_JPY'}

async with ClientSession() as s:
    async with s.get(url, params=params, headers=headers) as r:
        print(r.status, r.url)
        # 400 https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD,USD_JPY

Replacing params with {'instruments': 'EUR_USD%2CUSD_JPY'} doesn't work either. (It seems the URL-encoded string is decoded and never encoded again).

However, the equivalent code in requests works just fine:

r = requests.get(url, headers=headers, params={'instruments': 'EUR_USD,USD_JPY'})
print(r.status_code, r.url)
# 200 https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD%2CUSD_JPY

The workaround I found was to not use params at all (as mentioned in aiohttp docs):

from urllib.parse import urlencode
from yarl import URL

url = URL('?'.join([url, urlencode(params)]), encoded=True)

async with ClientSession() as s:
    async with s.get(url, headers=headers) as r:
        print(r.status, r.url)
        # 200 https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD%2CUSD_JPY

Should this be thought as a bug or is this expected behavior?

Related: https://github.com/requests/requests/issues/794#issuecomment-7852135

webknjaz commented 6 years ago

If you navigate to https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD,USD_JPY in Google Chrome it doesn't encode comma either. So I'd assume that you should tell fxtrade guys to fix their server :)

webknjaz commented 6 years ago

Still, it looks like yarl should have a way to preserve original encoding.

behnam commented 6 years ago

Parsing query values from URL string is a different task from setting a query value via an API call. At the moment, yarl only exposes the parsing functionality, via with_query(), which is what I assume aiohttp is using for setting params.

Here's a demo of the problem:

In [120]: url1 = yarl.URL('https://google.com/search').with_query({'q': 'سلام'})

In [121]: url1_str = str(url1)

In [122]: url1_str
Out[122]: 'https://google.com/search?q=%D8%B3%D9%84%D8%A7%D9%85'

In [123]: url2 = yarl.URL('https://example.com/something').with_query({'p': url1_str})

In [124]: url2.query['p']
Out[124]: 'https://google.com/search?q=سلام'

In [125]: url2.query['p'] == url1_str
Out[125]: False

Looks like a similar problem exist also for getting a query value, vs dumping it into URL string.

Right now I'm working on adding tests for these and fixing them with methods to set and get query values without over-encoding/over-decoding.

Since changing the behavior of the current methods would be breaking the API, need to do this via couple of new methods.

bsolomon1124 commented 5 years ago

Related: https://github.com/aio-libs/yarl/issues/245, https://github.com/aio-libs/yarl/issues/301

asvetlov commented 4 years ago

https://api-fxtrade.oanda.com/v1/prices?instruments=EUR_USD,USD_JPY is a valid URL. RFC 3986:

    sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                / "*" / "+" / "," / ";" / "="
    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
    query       = *( pchar / "/" / "?" )
    fragment    = *( pchar / "/" / "?" )

The comma is a valid sub-delim which don't have to be encoded in query part. The statement that the URL is malformed is invalid.

You may want PCT-encode even allowed character though.