Deepwalker / trafaret

Ultimate transformation library that supports validation, contexts and aiohttp.
http://trafaret.readthedocs.org/en/latest/
BSD 2-Clause "Simplified" License

Valid URL does not fit trafaret.URL #52

Open FedirAlifirenko opened 5 years ago

FedirAlifirenko commented 5 years ago

Hi @Deepwalker. I get the following validation error for a working URL, 'https://www.dior.com/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020'. Is this expected behavior? What do you think?

(3_7_2) MacBook-Pro-2:test fedir$ python -c "import trafaret as t; t.URL.check('https://www.dior.com/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020')"
Traceback (most recent call last):
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 166, in transform
    return self.trafaret(value, context=context)
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 156, in __call__
    return self.check(val, context=context)
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 118, in check
    return self.transform(value, context=context)
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 286, in transform
    raise DataError(dict(enumerate(errors)), trafaret=self)
trafaret.dataerror.DataError: {0: DataError(does not match pattern ^(?:http|ftp)s?://(?:\S+(?::\S*)?@)?(?:(?:[A-Z0-9](?:[A-Z0-9-_]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::\d+)?(?:/?|[/?]\S+)$), 1: DataError(does not match pattern ^(?:http|ftp)s?://(?:\S+(?::\S*)?@)?(?:(?:[A-Z0-9](?:[A-Z0-9-_]{0,61}[A-Z0-9])?\.)+(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)|localhost|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::\d+)?(?:/?|[/?]\S+)$)}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 118, in check
    return self.transform(value, context=context)
  File "/Users/fedir/env/3_7_2/lib/python3.7/site-packages/trafaret/base.py", line 168, in transform
    raise DataError(self.message, value=value)
trafaret.dataerror.DataError: value is not URL
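
A likely reason for the mismatch is that '\xa0' is a non-breaking space, which Python 3 treats as whitespace in str patterns, so the trailing \S+ in the pattern cannot consume it. The sketch below uses only the standard re module with the pattern copied from the error message; compiling it with re.IGNORECASE is an assumption (the character classes list only A-Z), and the percent-encoded variant is included purely for comparison.

```python
import re

# Pattern copied from the DataError message above. Compiling it with
# re.IGNORECASE is an assumption: the character classes list only A-Z.
URL_PATTERN = re.compile(
    r"^(?:http|ftp)s?://"
    r"(?:\S+(?::\S*)?@)?"
    r"(?:(?:[A-Z0-9](?:[A-Z0-9-_]{0,61}[A-Z0-9])?\.)+"
    r"(?:[A-Z]{2,6}\.?|[A-Z0-9-]{2,}\.?)"
    r"|localhost"
    r"|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
    r"(?::\d+)?"
    r"(?:/?|[/?]\S+)$",
    re.IGNORECASE,
)

raw_url = "https://www.dior.com/fr_fr/maquillage/adoptez-le-look-du-defile-croisiere\xa02020"
# Same URL with the non-breaking space written as its UTF-8 percent-escape.
encoded_url = raw_url.replace("\xa0", "%C2%A0")

nbsp_is_space = "\xa0".isspace()  # U+00A0 counts as whitespace
raw_matches = URL_PATTERN.match(raw_url) is not None
encoded_matches = URL_PATTERN.match(encoded_url) is not None

print(nbsp_is_space, raw_matches, encoded_matches)  # True False True
```

Under these assumptions, only the percent-encoded form passes the pattern, which is consistent with the error reported above.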

Deepwalker commented 5 years ago

Hi! Can you check if this link works with this pr? https://github.com/Deepwalker/trafaret/pull/36

Deepwalker commented 5 years ago

Actually, I'm not sure what the right behavior is. It may be that the link is actually incorrect. I will need to reread the RFC.

asvetlov commented 5 years ago

Proving URL correctness is hard. I suspect that a regex-based solution is bound to produce false positives by design :( Even the much more complicated yarl is not free from such problems. Well, yarl.URL() works pretty well, but yarl.URL.build() cannot parse valid args now :(

FedirAlifirenko commented 5 years ago

@asvetlov

but yarl.URL.build() cannot parse valid args now

What do you mean? Everything seems to work:

fedor@ubuntu:~$ python -c "import yarl; print(yarl.URL.build(host='example.com', scheme='https', path='/path\xa0abc'))"
https://example.com/path%C2%A0abc
fedor@ubuntu:~$ python -V
Python 3.7.3
fedor@ubuntu:~$ pip list | grep yarl
yarl       1.3.0
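
The percent-encoding that yarl applies in the output above can be approximated with the standard library, for example to normalize a URL before it is validated. This is only a sketch: percent_encode_path is a hypothetical helper name, it touches the path component only, and whether trafaret.URL should perform such a step itself is the open question of this issue.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_path(url: str) -> str:
    """Return url with non-ASCII and unsafe path characters percent-encoded."""
    parts = urlsplit(url)
    # safe="/%" keeps path separators and any existing %XX escapes untouched.
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=encoded_path))


print(percent_encode_path("https://example.com/path\xa0abc"))
# https://example.com/path%C2%A0abc
```

quote() encodes the text as UTF-8 by default, which is why the single character '\xa0' becomes the two escapes %C2%A0.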