Open hawkowl opened 5 years ago
Hey Hawkie! This was pretty concerning at first, since I thought we had a bunch of IPv6 coverage, but now I see that the problem is actually in the to_uri() part and the newly integrated idna stuff:
>>> url = URL.from_text(u'https://[2001:0db8:85a3:0000:0000:8a2e:0370:7334]:80/')
>>> url.to_uri()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/hyperlink/_url.py", line 1338, in to_uri
    new_host = self.host if not self.host else idna_encode(self.host, uts46=True).decode("ascii")
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 358, in encode
    s = alabel(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 270, in alabel
    ulabel(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 304, in ulabel
    check_label(label)
  File "/home/mahmoud/virtualenvs/tmp-d364b3d6b21cd4e4/local/lib/python2.7/site-packages/idna/core.py", line 261, in check_label
    raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+003A at position 5 of u'2001:0db8:85a3:0000:0000:8a2e:0370:7334' not allowed
So I'm guessing we just need to skip idna-encoding of IP-literal hosts, since they're pretty much guaranteed to be ASCII (some examples: http://www.gestioip.net/docu/ipv6_address_examples.html). How's that sound?
That's the approach that Twisted's internals use -- check if it's an IP address, idna encode only if it's not.
On Thu., 6 Dec. 2018, 05:46 Mahmoud Hashemi <notifications@github.com> wrote:
> So I'm guessing we just need to skip idna-encoding of IP-literal stuff, since it's pretty much guaranteed to be ASCII (some examples http://www.gestioip.net/docu/ipv6_address_examples.html). How's that sound?
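
For anyone following along, here is a rough sketch of the "check if it's an IP address, idna encode only if it's not" idea discussed above. It is not hyperlink's or Twisted's actual code. The helper names is_ip_literal and encode_host are made up for illustration, it assumes Python 3 with the stdlib ipaddress module, and it uses the built-in "idna" codec as a stand-in for the third-party idna package's encode(..., uts46=True) call seen in the traceback. It also assumes the host is stored without the surrounding square brackets, as in the error message above.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as a bare IPv4 or IPv6 address.

    Hypothetical helper: relies on the stdlib parser rather than on
    whatever check hyperlink or Twisted actually perform.
    """
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def encode_host(host):
    """IDNA-encode host unless it is empty or an IP literal.

    Assumption: the stdlib "idna" codec stands in for the idna
    package here, so edge-case behaviour may differ from to_uri().
    """
    if not host or is_ip_literal(host):
        # IP literals are already ASCII; the colons in an IPv6
        # address are what the IDNA label check rejects.
        return host
    return host.encode("idna").decode("ascii")


if __name__ == "__main__":
    for h in (u"2001:0db8:85a3:0000:0000:8a2e:0370:7334",
              u"127.0.0.1",
              u"b\u00fccher.example"):
        print(h, "->", encode_host(h))
```

One caveat with this sketch: ipaddress.ip_address only recognises IPv4 and IPv6 forms, so an IPvFuture literal would still fall through to the IDNA branch and would need its own check.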