Open mlhdeveloper opened 2 weeks ago
I think the solution lies in changing this part so that it only checks for :// near the beginning of the URL, and not anywhere in the entire URL:
https://github.com/splunk/utbox/blob/f8db838d28117f15fc406b6fc980d2963776ab37/utbox/bin/ut_parse_lib.py#L15
https://github.com/splunk/utbox/blob/f8db838d28117f15fc406b6fc980d2963776ab37/utbox/bin/ut_parse_lib.py#L257-L258
I think this fixes it so that it only checks for :// at the very beginning of the URL or directly after a scheme, i.e. only after any number of alphabetical or + characters (based on the schemes handled by urllib.parse):
preg_rfc1808 = re.compile("^[a-z+]*://")
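As a rough sketch of how the anchored pattern behaves, here is a small standalone check. The helper name `has_leading_scheme_separator` and the sample URLs are made up for illustration and are not taken from utbox; only the regex itself comes from the line above.

```python
import re

# Pattern proposed above: "://" only counts at the very start of the string
# or directly after a run of lowercase letters and "+" characters.
preg_rfc1808 = re.compile(r"^[a-z+]*://")


def has_leading_scheme_separator(url):
    """Hypothetical helper: True if "://" appears only where a scheme would end."""
    return preg_rfc1808.search(url) is not None


if __name__ == "__main__":
    # Illustrative inputs only, not the failing URL from this issue.
    samples = [
        "http://example.com/",
        "git+ssh://example.com/repo",
        "://example.com/",
        "example.com/redirect?to=http://other.example/",
    ]
    for url in samples:
        print(url, has_leading_scheme_separator(url))
```

With this pattern, a :// that only shows up later in the URL (for example inside a query string) no longer counts as a scheme separator. Note that the character class is lowercase only, so an uppercase scheme would need `re.IGNORECASE` or a wider class; whether that matters for utbox is an open question here.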
Here's a simple example URL that fails to parse:
If you add a scheme to the front, it then parses properly: