lambdaisland / uri

A pure Clojure/ClojureScript URI library
Mozilla Public License 2.0

Pitch: Merging URIs #12

Closed DerGuteMoritz closed 4 years ago

DerGuteMoritz commented 4 years ago

Problem

I often want to construct URIs by building on a given base URI. For example I would have https://my.service/some-context/ as a base and would then derive specific URIs like https://my.service/some-context/some-resource from that. This is currently not easily possible for arbitrary combinations of partial URIs.

Background

We already have uri/join which at first glance seems to do the trick. Using the example above:

(str (uri/join "https://my.service/some-context/" "some-resource"))
=> "https://my.service/some-context/some-resource"

However, it doesn't work with all URI components for this use-case. For example, if the URI that's being joined consists only of a path, its (non-existent) query wins over that of the base URI:

(str (uri/join "https://my.service/some-context/?api_key=123" "some-resource"))
=> "https://my.service/some-context/some-resource"

This is because uri/join implements the reference resolution process specified in RFC3986, section 5. This process is used e.g. when following links in HTML pages, where this behavior totally makes sense, of course. It doesn't quite fit the described use-case, though.

Here's a helper function which one could define with the current API to achieve the desired outcome:

(defn uri-merge [& uris]
  (let [uri-maps (->> uris (map uri/uri) (map #(into {} (filter (comp some? val)) %)))
        merged-path (some->> uris (keep :path) seq (apply uri/join) :path)]
    (-> (apply merge uri-maps)
        (assoc :path merged-path)
        uri/map->URI)))

(str (uri-merge (uri/uri "https://my.service/some-context/?api_key=123") (uri/uri "some-resource")))
=> "https://my.service/some-context/some-resource?api_key=123"

So while it's certainly possible, it's quite cumbersome (thus the "not easily" qualifier in the problem statement). The pitch is based on the assumption that this function is common enough to include it in the API proper.

Solution

I suggest introducing a new function (uri/merge?) which merges (partial) URIs with semantics similar to clojure.core/merge while also joining their paths. Some way to control whether the reference resolution process is used (i.e. /foo joined with bar becomes /bar) or whether relative paths are always appended (i.e. /foo joined with bar becomes /foo/bar) would be handy, too.
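To make the two path modes concrete, here is a minimal sketch on plain path strings. It is not part of lambdaisland/uri; the names append-path, resolve-path and combine-path are hypothetical, and the resolve mode is a simplified stand-in for RFC 3986 merging that ignores dot segments.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: always treat rel as a child of base,
;; regardless of trailing or leading slashes.
(defn append-path [base rel]
  (str (str/replace base #"/+$" "")
       "/"
       (str/replace rel #"^/+" "")))

;; Hypothetical helper: simplified RFC 3986 style merge. A relative
;; path replaces everything after the last slash of base; an absolute
;; path replaces base entirely. Dot segments (. and ..) are NOT handled.
(defn resolve-path [base rel]
  (if (str/starts-with? rel "/")
    rel
    (if-let [i (str/last-index-of base "/")]
      (str (subs base 0 (inc i)) rel)
      rel)))

;; Hypothetical switch between the two behaviours.
(defn combine-path [mode base rel]
  (case mode
    :append  (append-path base rel)
    :resolve (resolve-path base rel)))

(combine-path :append  "/foo"  "bar") ;; => "/foo/bar"
(combine-path :resolve "/foo"  "bar") ;; => "/bar"
(combine-path :resolve "/foo/" "bar") ;; => "/foo/bar"
```

The only difference between the modes shows up when the base path lacks a trailing slash, which is exactly the case where the choice would need to be explicit.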

Rabbit Holes

Supporting resetting of components to nil via the merge function seems like a rabbit hole that should probably be avoided. Since URI is a record type, the absence of a value is encoded as nil. To retain a component from a URI on the left when a URI further to the right has no value for it, the merge has to skip nil values, which leaves no way to say "reset this component".
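As an illustration of the nil-skipping behaviour, here is a small sketch on plain maps rather than URI records. The name merge-non-nil is hypothetical and not part of the library.

```clojure
;; Hypothetical helper: like clojure.core/merge, except that a nil on
;; the right never overwrites a value on the left.
(defn merge-non-nil [& ms]
  (apply merge-with
         (fn [left right] (if (some? right) right left))
         ms))

(merge-non-nil {:host "my.service" :query "api_key=123"}
               {:host nil :query nil :path "some-resource"})
;; => {:host "my.service", :query "api_key=123", :path "some-resource"}
```

With this rule a nil on the right always means "no opinion", so it cannot also mean "clear this component", which is why resetting is left out of scope.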

No-gos

Merging / joining of components other than the path.

Resources / Links

https://tools.ietf.org/html/rfc3986#section-5

plexus commented 4 years ago

Problem

You're not describing a problem, so I can't comment on your proposed solution.

DerGuteMoritz commented 4 years ago

How about now?

DerGuteMoritz commented 4 years ago

Elaborated on the "not easily" a bit in the background section. Not familiar with this format, yet, so bear with me :smile:

plexus commented 4 years ago

I guess I still don't understand what the use case is. That's why I'm hammering on the problem specification. I feel like there is almost certainly a simpler way to solve the specific problem you are having.

From what I gather, what you want to do is join path segments onto a base URI while also preserving the query parameters. Then why not do just that?

(-> (uri/uri "https://my.service/some-context/?api_key=123")
    (update :path str "some-resource"))

Or if it's an API key you need on every request (can't this be a header?) then have a helper that does just that?

(def endpoint "https://my.service/some-context/")
(def api-key "123")

;; Join the given segments onto the endpoint, then attach the API key.
(defn api-uri [& segments]
  (-> (apply uri/join endpoint segments)
      (assoc :query (str "api-key=" api-key))))

(str (api-uri "some-resource"))
;; => "https://my.service/some-context/some-resource?api-key=123"

The operation you are proposing seems like a generalization of that, but I don't see the point of generalizing this. What does it mean to merge the port, the fragment, or the authentication from one URI with another?

(str (uri-merge "http://localhost:3303/hello-world"
                "https://en.wikipedia.org/wiki/VT100#Variants"))
;; => "https://en.wikipedia.org:3303#Variants"

It's also not clear to me to what extent you do or don't want the semantics of uri/join (i.e. dealing with absolute/relative paths), or if you really just need a (clojure.string/join "/" ...).

Something to keep in mind is that query parameters are typically specific to a certain URL: the parameters for https://google.com are not the same as the ones for https://google.com/mail. Cases like the above, where you do want to carry them over, are I believe the exception, and so it's fine to make that fairly explicit in your code.