Closed JohnJamesUtley closed 1 year ago
DOTALL
is a sledgehammer. We should use (?s:.*)
for the fragment matching if we actually want to allow newlines there. I'm not convinced we should.
DOTALL
is a sledgehammer. We should use(?s:.*)
for the fragment matching if we actually want to allow newlines there. I'm not convinced we should.
You are absolutely correct that DOTALL is too much and the inline flag is a much cleaner solution. I've switched the pull request to amend that just now.
The change is necessary because in other similar sections of the URL, such as the query and the path, line termination characters are accepted and are converted into percent-encoded characters when the path and query are returned. It is inconsistent that the fragment section treats these same characters differently.
I would expect line termination characters in the fragment to be percent-encoded like they are in the query and the path.
Additionally, the other Python URL parsers handle line termination characters in the fragment section by converting them to percent-encoded characters in their output. Ending the parsing at line termination characters in the fragment section instead of percent-encoding them is strange behavior in my opinion. Especially since line termination characters elsewhere in the URL do not end the parsing. (I tested Furl, Hyperlink, Urllib, and Yarl).
Can you add a test case or two?
@sigmavirus24 Made the changes you requested 👍
fixes #99