ring-clojure / ring-codec

Utility library for encoding and decoding data
MIT License

Documentation could explain the difference between url-encode and form-encode #36

Open timrobinson33 opened 1 year ago

timrobinson33 commented 1 year ago

url-encode and form-encode have very different implementations but seem to do almost the same thing. The only practical difference I have seen is that url-encode does not encode "+", which means it's not suitable for encoding URLs (so I'm not sure what it is useful for!). Note that JavaScript's encodeURIComponent does encode "+".

I'm sure this has caused confusion and possibly coding errors in the past. It would be useful to expand the documentation to explain when you would use one as opposed to the other.

weavejester commented 1 year ago

That might not be a bad idea. The docstrings are accurate, but might not be intuitive, largely because most people have a misunderstanding of which characters are allowed in a URL.

As per the docstring, url-encode is for encoding characters so that they may be placed in a URL. This means that + is not encoded, because + is an allowed character and does not need to be encoded. For example:

http://example.com/these+are+pluses

The form-encode function is for encoding characters under the application/x-www-form-urlencoded format as used by HTML forms. This does encode + characters, as they have special meaning for this particular format: an unencoded + represents a space.

http://example.com/?q=these%2Bare%2Bpluses
http://example.com/?q=these+are+spaces
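
To make the contrast concrete, here is a rough Java sketch of the two styles. It is not ring-codec's code: `EncodeSketch`, `encodeForUrl` and `encodeForForm` are hypothetical names, and the set of characters left unencoded in `encodeForUrl` is an assumption chosen for illustration. The form-style half relies on the JDK's `URLEncoder`, which implements the HTML form encoding.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of ring-codec.
final class EncodeSketch {

    // URL-style: percent-encode every UTF-8 byte except ASCII letters,
    // digits and the characters _ ~ . + - (an assumed safe set).
    static String encodeForUrl(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean safe = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9') || "_~.+-".indexOf(c) >= 0;
            if (safe) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Form-style: URLEncoder writes a space as "+" and a literal plus as "%2B".
    // The (String, Charset) overload needs Java 10 or later.
    static String encodeForForm(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encodeForUrl("these+are+pluses"));   // these+are+pluses
        System.out.println(encodeForUrl("these are spaces"));   // these%20are%20spaces
        System.out.println(encodeForForm("these+are+pluses"));  // these%2Bare%2Bpluses
        System.out.println(encodeForForm("these are spaces"));  // these+are+spaces
    }
}
```

The two form-style outputs match the query strings in the examples above, while the URL-style encoder leaves the pluses alone and writes spaces as %20.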

timrobinson33 commented 1 year ago

Thanks for the response.

So we should encode + as %2B only inside query string parameter values (or www-form-urlencoded forms), i.e. use ring's form-encode, and leave it unencoded everywhere else in the URL, i.e. use ring's url-encode.
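
A small Java sketch of why the query string case is special, assuming the receiving side decodes parameter values with form-style rules (here the JDK's `URLDecoder`). The class and method names are hypothetical, and example.com is only a placeholder host.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of ring-codec.
final class QueryPlusSketch {

    // Form-style decoding, as typically applied to query parameter values.
    // The (String, Charset) overload needs Java 10 or later.
    static String decodeFormValue(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    // Builds a search URL whose q parameter is form-encoded.
    static String searchUrl(String term) {
        return "http://example.com/?q=" + URLEncoder.encode(term, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A raw "+" in a query value comes back as a space...
        System.out.println(decodeFormValue("1+1"));     // 1 1
        // ...so a literal plus has to travel as %2B.
        System.out.println(decodeFormValue("1%2B1"));   // 1+1
        System.out.println(searchUrl("1+1 = 2"));       // http://example.com/?q=1%2B1+%3D+2
    }
}
```

Outside the query string no such decoding step applies, so a + in a path segment stays a plain plus sign and can be left as it is.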

In all my 20+ years of programming with HTTP, I don't think I'd ever really understood that before.

It's interesting to note that Java's URLEncoder does encode +, and JavaScript has encodeURIComponent, which does encode + even though its documentation says it's suitable for paths, and encodeURI, which doesn't encode + but also doesn't encode /?&. It's not surprising there is so much confusion in this area.