lambdaisland / uri

A pure Clojure/ClojureScript URI library
Mozilla Public License 2.0

`lambdaisland.uri/uri-str` does not URL-encode username and password #53

Open devurandom opened 5 months ago

devurandom commented 5 months ago

lambdaisland.uri/uri-str does not URL-encode username and password:

(->URI "http" "fan:cy" "s@cr3d!" "example.com" nil nil nil nil)
;=> #lambdaisland/uri"http://fan:cy:s@cr3d!@example.com"

I would expect lambdaisland.uri/URI to be able to round-trip its input data, but it can't:

(-> (map->URI {:scheme   "http"
               :user     "fan:cy"
               :password "s@cr3d!"
               :host     "example.com"})
    uri-str
    uri
    (select-keys [:scheme :user :password :host]))

;=>
;{:scheme "http",
; :user "fan",
; :password "cy:s@cr3d!",
; :host "example.com"}

plexus commented 5 months ago

Good catch! That should indeed be fixed. A PR would be appreciated!

plexus commented 5 months ago

Having a better look at this, all fields of the URI record are expected to be valid, i.e. not contain characters not allowed in that part of the URI, which means they should be percent encoded before creating the record. We do have normalize for encoding them after the fact, so that's really the only place where it makes sense to add/change behavior. uri+uri-str are only expected to round trip if the URI is valid, i.e. normalized.

@devurandom are you interested in creating a patch for this?
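
To illustrate the point about percent-encoding fields before creating the record, here is a minimal sketch that uses only JVM classes. The helper name `encode-userinfo-part` is hypothetical and not part of lambdaisland.uri; it relies on `java.net.URLEncoder`, which performs form encoding, so the `+` it emits for spaces is rewritten to `%20`. It may also encode more characters than the userinfo component strictly requires.

```clojure
(ns example.encode-userinfo
  (:import (java.net URLEncoder)))

;; Hypothetical helper, not part of lambdaisland.uri.
;; Percent-encodes a username or password so that ":" and "@" can no
;; longer be mistaken for URI delimiters.
(defn encode-userinfo-part
  [s]
  (-> (URLEncoder/encode ^String s "UTF-8")
      ;; URLEncoder writes spaces as "+"; a literal "+" in the input is
      ;; already emitted as "%2B", so every remaining "+" is a space.
      (.replace "+" "%20")))

(encode-userinfo-part "fan:cy")  ;; => "fan%3Acy"
(encode-userinfo-part "s@cr3d!") ;; => "s%40cr3d%21"

;; Assumed usage with the functions from the issue (not evaluated here):
(comment
  (-> (map->URI {:scheme   "http"
                 :user     (encode-userinfo-part "fan:cy")
                 :password (encode-userinfo-part "s@cr3d!")
                 :host     "example.com"})
      uri-str
      uri
      (select-keys [:scheme :user :password :host])))
```

With the fields encoded up front, the string produced by uri-str contains only one unescaped ":" and one unescaped "@" in the authority part, so parsing it again should no longer split the user name at the wrong place.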

devurandom commented 5 months ago

> are you interested in creating a patch for this?

Yes, but I cannot commit to a timeline at the moment.

> all fields of the URI record are expected to be valid, i.e. not contain characters not allowed in that part of the URI, which means they should be percent encoded before creating the record

I am using lambdaisland.uri/parse to parse a URI string and extract information. Do I understand you correctly that I should not expect the fields of URI records to contain the "raw" values after parsing? I.e. do I have to manually decode the URI fields to get the actual username and password that I need to send to the server for authentication? (I assume the same applies to paths, query strings and fragments?)

plexus commented 5 months ago

(:path (uri/parse "http://example.com/sch%C3%B6n"))
;; => "/sch%C3%B6n"

I would expect the same behavior for username/password: if you need the non-percent-encoded version, you decode it yourself.

Perhaps a debatable design choice, but it's how it works now for all other fields, so for consistency I would keep it that way.
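
For readers who need the raw values, decoding a parsed field yourself could look like the sketch below. The helper name `percent-decode-field` is hypothetical and not taken from lambdaisland.uri; it uses `java.net.URLDecoder`, which would turn `+` into a space, so a literal `+` is protected before decoding.

```clojure
(ns example.decode-field
  (:import (java.net URLDecoder)))

;; Hypothetical helper, not part of lambdaisland.uri.
;; Turns a percent-encoded URI field back into its raw string.
(defn percent-decode-field
  [s]
  ;; URLDecoder implements form decoding, where "+" means space.
  ;; In a URI path or userinfo "+" is a literal, so escape it first.
  (URLDecoder/decode (.replace ^String s "+" "%2B") "UTF-8"))

(percent-decode-field "/sch%C3%B6n")  ;; => "/schön"
(percent-decode-field "s%40cr3d%21")  ;; => "s@cr3d!"
```

The same helper would apply to the :user and :password fields of a parsed URI before sending them to a server for authentication.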