Open rennerocha opened 6 years ago
Hey @rennerocha!
Do you have any other use cases for overriding safe characters?
It looks like in your case a workaround could be to unescape parameters before passing them to add_or_replace_parameter, as this function works with raw, unescaped parameter values - these are values server gets after decoding.
Allowing to override safe characters can be seen as a performance optimization, a hack to avoid unescaping and re-escaping, unless you see other use cases for it. While it may solve a problem, I find the resulting API unintuitive. E.g. does safe
parameter affect only parameters, or does it affect encoding of the original URL as well? What should be its value when you want to pass already-escaped parameters, or just use something as-is?
add_or_replace_parameter uses urlencode function to convert a string to percent-encoded ASCII string: https://github.com/scrapy/w3lib/blob/master/w3lib/url.py#L229
If the value of the parameter is already percent-encoded (in my scenario, this value came from a JSON response of an API), it is not possible to specify safe characters that should not be encoded. This argument is available in urlencode signature: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlencode
My suggestion is to include a safe argument in add_or_replace_parameter to improve the control over the encoding of your values: Actual: w3lib.url.add_or_replace_parameter (url, name, new_value) New: w3lib.url.add_or_replace_parameter(url, name, new_value, safe='')