aio-libs / yarl

Yet another URL library
https://yarl.aio-libs.org
Apache License 2.0

_PATH_QUOTER unquotes %3B to ; in the URL path. Is this correct behavior? #223

Closed amarynets closed 4 years ago

amarynets commented 6 years ago

I have this URL, which contains %3B (an encoded ;): https://www.madewell.com/9%22-high-rise-skinny-jeans-in-isko-stay-blacktrade%3B-G1202.html

yarl unquotes %3B to ;, so https://www.madewell.com/9%22-high-rise-skinny-jeans-in-isko-stay-blacktrade%3B-G1202.html becomes https://www.madewell.com/9%22-high-rise-skinny-jeans-in-isko-stay-blacktrade;-G1202.html

When I open the URL produced by yarl, the site redirects me to the homepage. But if I make a small change and add ; to the protected parameter, _PATH_QUOTER = _Quoter(safe='@:', protected='/+;'), it works correctly.

My question: is this an issue with yarl or with the site? Thanks.
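To make the effect of the protected parameter concrete, here is a minimal standard-library sketch of the behavior described above. It is a simplified model, not yarl's actual code: the name requote_path and the exact character sets are assumptions made for illustration, and the sketch handles only single-byte %XX escapes.

```python
import re
import string

# Assumption: characters this sketch considers safe to leave unescaped,
# loosely modelled on the RFC 3986 "unreserved" and "sub-delims" sets.
UNRESERVED = string.ascii_letters + string.digits + "-._~"
SUB_DELIMS = "!$&'()*+,;="

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def requote_path(path, safe="@:", protected="/+"):
    """Hypothetical helper: decode %XX escapes unless the character is protected."""
    allowed = set(UNRESERVED + SUB_DELIMS + safe) - set(protected)

    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in allowed:
            return char  # safe and not protected: decode the escape
        return "%" + match.group(1).upper()  # otherwise keep it escaped

    return _PCT.sub(repl, path)


path = "/9%22-high-rise-skinny-jeans-in-isko-stay-blacktrade%3B-G1202.html"
print(requote_path(path))                    # %3B is decoded to ;
print(requote_path(path, protected="/+;"))   # %3B is left as it was
```

In both calls %22 stays escaped because a double quote is not in the allowed set; only the treatment of %3B changes when ; is added to protected.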

amarynets commented 6 years ago

I tried the same URL with the requests library, and it doesn't unquote %3B.

LinnTroll commented 6 years ago

Same problem with the "&" symbol. Example URL: https://www.pier1.com/christopher-metallic-jacquard-indigo-%26-gold-pillow/3498618.html URL._PATH_QUOTER converts "%26" to "&" and produces a bad URL: https://www.pier1.com/christopher-metallic-jacquard-indigo-&-gold-pillow/3498618.html

amarynets commented 6 years ago

The same issue occurs with a comma (%2C): https://eu.stuartweitzman.com/en/shoes/sandals/the-partilow-sandal--black---limited-availability%2C-style-will-not-be-restocked-PARTILOWCALBLA.html

amarynets commented 6 years ago

The issue was solved by this code: URL(url, encoded=True). In aiohttp you can pass either a URL object or a string, and my string was already encoded.

beyondwxin commented 6 years ago

File "/home/blh/.local/lib/python3.5/site-packages/homeassistant/components/frontend/__init__.py", line 8, in <module>
    from aiohttp import web
  File "/home/blh/.local/lib/python3.5/site-packages/aiohttp/web.py", line 14, in <module>
    from . import (hdrs, web_exceptions, web_fileresponse, web_middlewares,
  File "/home/blh/.local/lib/python3.5/site-packages/aiohttp/web_middlewares.py", line 5, in <module>
    from aiohttp.web_urldispatcher import SystemRoute
  File "/home/blh/.local/lib/python3.5/site-packages/aiohttp/web_urldispatcher.py", line 17, in <module>
    from yarl import URL, unquote
ImportError: cannot import name 'unquote'

Why does this happen, please?

amarynets commented 6 years ago

Because yarl doesn't have an unquote function. It has _Quoter and _Unquoter classes, but you shouldn't use them. @beyondwxin

webknjaz commented 6 years ago

@beyondwxin from your trace, aiohttp imports this function, so you probably have incompatible versions of yarl and aiohttp. Try upgrading aiohttp.

beyondwxin commented 6 years ago

> @beyondwxin from your trace, aiohttp imports this function, so you probably have incompatible versions of yarl and aiohttp. Try upgrading aiohttp.

I am using the newest versions of aiohttp and yarl.

asvetlov commented 6 years ago

Please check versions explicitly:

import aiohttp
print(aiohttp.__version__)
import yarl
print(yarl.__version__)
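
If importing one of the packages fails outright, the installed versions can also be read from package metadata without importing the packages themselves. This is only a sketch: it assumes Python 3.8 or newer (the traceback above shows Python 3.5, where importlib.metadata is not available), and installed_version is a hypothetical helper name.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Report both packages side by side so a mismatch is easy to spot.
for package in ("aiohttp", "yarl"):
    print(package, installed_version(package))
```

Run it with the same interpreter that produced the traceback, since a user-level site-packages directory such as ~/.local can hold different versions than the system one.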

asvetlov commented 4 years ago

Stale