spdx / tools-python

A Python library to parse, validate and create SPDX documents.
http://spdx.org
Apache License 2.0

Why use uritools instead of the standard library urllib? #803

Open pombredanne opened 7 months ago

pombredanne commented 7 months ago

Why use uritools instead of the standard library's urllib? Are there specific cases that really demand an extra dependency? It appears to be barely used (see https://github.com/search?q=repo%3Aspdx%2Ftools-python+uritools+language%3APython&type=code&l=Python), and it would be best to avoid an extra dependency for such a small benefit.

jspeed-meyers commented 5 months ago

I also support removing a dependency that provides little benefit. However, when I tried replacing the uritools functions with urllib equivalents, I hit some issues.

While debugging I noticed this:

https://github.com/spdx/tools-python/blob/552940abb52cbc9cf358e429d1388e9c1b25d7a6/tests/spdx/validation/test_creation_info_validator.py#L31-L36

I then noticed that the Python standard library's urllib.parse module says:

RFC 3986 is considered the current standard and any future changes to
urlparse module should conform with it.  The urlparse module is
currently not entirely compliant with this RFC due to defacto
scenarios for parsing, and for backward compatibility purposes, some
parsing quirks from older RFCs are retained. The testcases in
test_urlparse.py provides a good indicator of parsing behavior.

I wonder if urllib not being compliant with RFC 3986 is why uritools is used instead of urllib? Maybe the need for compliance with RFC 3986 is strong enough to merit adding a dependency on uritools? I personally don't know. But when I tried some simple replacements, it didn't go well :)
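To make the concern concrete, here is a minimal sketch of what a urllib-only stand-in for an "absolute URI" check might look like. The helper name `looks_like_absolute_uri` is hypothetical and is not part of tools-python or uritools; it assumes "absolute" means "has a scheme and no fragment", and it only splits the string without validating it against the RFC 3986 grammar.

```python
from urllib.parse import urlsplit


def looks_like_absolute_uri(candidate: str) -> bool:
    """Rough, hypothetical check: a scheme is present and there is no fragment.

    This is NOT RFC 3986 validation. urlsplit() only splits the string;
    it does not reject characters that are illegal in a URI.
    """
    parts = urlsplit(candidate)
    # An absolute URI has a scheme and no "#fragment" part at all.
    return bool(parts.scheme) and "#" not in candidate


# A typical namespace-style URI passes.
print(looks_like_absolute_uri("https://spdx.org/spdxdocs/example"))
# A bare word has no scheme, so it is rejected.
print(looks_like_absolute_uri("some_namespace"))
# A fragment is rejected by the explicit "#" test.
print(looks_like_absolute_uri("https://example.com/doc#fragment"))
# A space is not a legal URI character, yet urlsplit() still splits
# the string without complaint, so this lenient check accepts it.
print(looks_like_absolute_uri("https://exa mple.com/doc"))
```

The last case is the kind of leniency that might make a drop-in replacement fail tests that expect strict RFC 3986 behaviour, although I have not confirmed that this is the exact cause of the failures I saw.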