Open weavejester opened 4 years ago
> Why not use core.async protocols for async responses?

It adds a dependency on core.async.

> `ring.request` keys and `ring.push` keys have identical specs. Why not reuse `ring.request` in push maps?

So that we can distinguish between a push map and a request map. A push map also has different mandatory keys to a request map.

> Let's use `clojure.core` predicates in "types"? I mean, use `string?` where `String` was used.

That doesn't work for protocols, and I'd rather be precise and use a type over a predicate. There are some predicates that don't map 1-to-1 with types.

> Did you mean `:ring.request/body` -> `:ring.response/body` etc. in the table?

Yes, good catch. Let me fix that.
> Why not use core.async protocols for async responses?
>
> It adds a dependency on core.async.

So can we consider a new protocol for the async API? Then we could easily bridge any async lib into the Ring protocol.
> `ring.request` keys and `ring.push` keys have identical specs. Why not reuse `ring.request` in push maps?
>
> So that we can distinguish between a push map and a request map. A push map also has different mandatory keys to a request map.

Both clojure.spec and plumatic/schema have capabilities to say which keys are required in different contexts. A clojure.spec example:

```clojure
(s/def ::request (s/keys :req [:ring.request/method]
                         :opt [:ring.request/path]))

(s/def ::push (s/keys :req [:ring.request/path]
                      :opt [:ring.request/method]))
```
> Let's use `clojure.core` predicates in "types"? I mean, use `string?` where `String` was used.
>
> That doesn't work for protocols, and I'd rather be precise and use a type over a predicate. There are some predicates that don't map 1-to-1 with types.

Ring can define `ring.response/streamable-response-body?` next to the protocol. Things like `Integer` don't make sense on some targets like ClojureScript/JS; I think that `:ring.response/status`, for example, could just be a `number?`.
> So can we consider a new protocol for the async API? Then we could easily bridge any async lib into the Ring protocol.

Can you explain what you mean by "for async API"? Do you mean adding a protocol for non-blocking I/O on the request and response bodies?

> Both clojure.spec and plumatic/schema have capabilities to say which keys are required in different contexts.

Sure, but that doesn't help if we want to differentiate between a push promise and an HTTP request dynamically. A push promise also has different semantic meaning to an HTTP request, even though they share fields of the same type.
> Things like `Integer` don't make sense on some targets like ClojureScript/JS; I think that `:ring.response/status`, for example, could just be a `number?`.

The `StreamableResponseBody` protocol doesn't make sense for ClojureScript either. The specification is designed to target Clojure, rather than ClojureCLR or ClojureScript.

I have thought about what we'd do to adapt the specification to ClojureScript, and it goes a little deeper than just using predicates rather than type names.
How about an async streaming API, if my handler wants to send chunks?

We use a context map a lot (with a bunch of deps like a db connection etc.) as the first parameter to handlers. Might that be useful for Ring?
The spec obviously spends time describing the differences between the synchronous and asynchronous protocols. One thing it doesn’t seem to mention is the serialization of requests. For instance, can a server have multiple synchronous requests in flight at the same time, or does synchronous imply that requests are serialized, one after another? In particular, if a synchronous handler blocks, is it blocking all requests, or just the request that it’s servicing? Perhaps the omission is on purpose to allow for different server adapter implementations, but the silence is noticeable. If it’s purposeful, perhaps a statement to that effect would be good to include.
> How about an async streaming API, if my handler wants to send chunks?

That was added back in Ring 1.6, with the caveat that I/O is blocking. In other words, while your handler is actively writing data to the client it's blocking the thread.

The problem with fully supporting non-blocking I/O in Ring 1 was that the request body was an `InputStream`, which is a blocking stream. In Ring 2, that's been changed, and in Ring 2.1 the plan is to have NIO protocols that the request body and response body can optionally satisfy.

For Ring 2.0 I thought that sort of performance optimisation was a little out of scope, as it's a fairly niche requirement.
> We use a context map a lot (with a bunch of deps like a db connection etc.) as the first parameter to handlers. Might that be useful for Ring?

In general I think it's better to use closures for that:

```clojure
(fn [context]
  (fn [request]
    ...))
```

But I'd be interested in hearing about use-cases where closures are not as elegant. That said, I don't anticipate changing the request/response maps into context maps instead.
> For instance, can a server have multiple synchronous requests in flight at the same time, or does synchronous imply that requests are serialized, one after another? In particular, if a synchronous handler blocks, is it blocking all requests, or just the request that it's servicing?

No, it's just blocking the thread for the current request. It's expected that an adapter that is running a synchronous handler will have a threadpool for handling multiple requests concurrently.
> > For instance, can a server have multiple synchronous requests in flight at the same time, or does synchronous imply that requests are serialized, one after another? In particular, if a synchronous handler blocks, is it blocking all requests, or just the request that it's servicing?
>
> No, it's just blocking the thread for the current request. It's expected that an adapter that is running a synchronous handler will have a threadpool for handling multiple requests concurrently.
OK. My suggestion would be to make that explicit so that nobody has to guess. I assume that the size of the thread pool is really in the domain of the server adapter and shouldn't be part of the Ring spec, but there's a big difference between one (serial) and more than one (non-serial). I'm sure that Ring 1.x behavior already works this way, but given the chance to document it explicitly, I would take the opportunity.
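To make the thread-per-request model above concrete, here is a minimal sketch of how an adapter might dispatch synchronous handlers onto a pool. All names here (`run-on-pool`, `slow-handler`) are hypothetical; real adapters manage their thread pools internally.

```clojure
(import '(java.util.concurrent Executors ExecutorService))

;; Hypothetical sketch (not any adapter's real API): each request is
;; dispatched to a worker pool, so a blocking handler ties up only its
;; own thread while other requests proceed concurrently.
(defn run-on-pool [^ExecutorService pool handler request respond]
  (.execute pool (fn [] (respond (handler request)))))

(def pool (Executors/newFixedThreadPool 8))

(defn slow-handler [request]
  (Thread/sleep 20) ; blocks only this worker thread
  {:status 200 :body (str "handled " (:uri request))})

;; Two in-flight requests proceed concurrently; neither blocks the other.
(def p1 (promise))
(def p2 (promise))
(run-on-pool pool slow-handler {:uri "/a"} #(deliver p1 %))
(run-on-pool pool slow-handler {:uri "/b"} #(deliver p2 %))
```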
> This does make accessing request headers with single values slightly more laborious: `(first (get-in request [:headers "content-type"]))`

If you are already using `get-in`, it's still simple:

```clojure
(get-in request [:headers "content-type" 0])
```
The spec does a great job of considering HTTP/2 issues. Has it been checked against HTTP/3 proposals? I realize that's a moving target and still in flux, but it seems like 2020 is the year that HTTP/3 will start to happen. It's already included in stable versions of Chrome and Firefox, though not the default, per https://en.wikipedia.org/wiki/HTTP/3.
Great spec, all changes very solid and timely!
BTW, I should also add, great job on Ring 2 and in documenting the rationale behind the decisions. It was a pleasure to read through these documents and I can't wait to use it in anger. Ring is one of the most important APIs in the Clojure ecosystem and you've done an amazing job both with Ring 1 and the evolution to Ring 2, James. Thanks for all your effort over the years.
> By contrast, Ring 2 mandates that response header names be lowercase, and the values be vectors:

Is it required for the spec implementor (a web server) to validate this?
> A key benefit to being able to identify requests and responses is that we can retain backward compatibility across middleware and utility functions. We can efficiently distinguish between requests that adhere to the 1.x specification, and those that adhere to the 2.x specification, and invoke different behavior depending on the type of input.
I can't help looking at such backward compatibility from the performance perspective. Would this mean that a spec implementation must expose both sets of keys to the middleware/handlers? From a brief glimpse, it feels like this can bring a noticeable overhead from growing the request map, allocating extra objects for the header values (the smallest vector is 240 bytes, btw), and so on.
I would argue that websockets should not be supported until NIO has landed. WS is one of the strongest use cases for NIO, as a lot of applications have lots of mostly idle connections. Without NIO this is pretty inefficient due to the overheads in managing those connections.
> My suggestion would be to make that explicit so that nobody has to guess. I assume that the size of the thread pool is really in the domain of the server adapter and shouldn't be part of the Ring spec, but there's a big difference between one (serial) and more than one (non-serial).

I'll consider adding a note explaining the terminology.
> Has it been checked against HTTP/3 proposals?

My understanding is that the main difference between HTTP/2 and HTTP/3 is the introduction of QUIC as a transport protocol, which doesn't affect Ring.
> > By contrast, Ring 2 mandates that response header names be lowercase, and the values be vectors:
>
> Is it required for the spec implementor (a web server) to validate this?
No, because by the time it gets to the adapter it's essentially too late. The idea is to make it simpler to write middleware that reads response headers.
The adapter will probably throw an exception if it sees a string where it expects a vector, but it's not the adapter's responsibility to check the header strings are lowercase. Unless it really wants to.
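For illustration, a response map following this convention might look like the sketch below. The key names are taken from examples elsewhere in the thread; the draft spec itself is authoritative.

```clojure
;; Sketch of a Ring 2 style response: lowercase header names, header
;; values as vectors (key names assumed from the thread).
(def response
  {:ring.response/status  200
   :ring.response/headers {"content-type" ["text/html"]}
   :ring.response/body    "<h1>hello</h1>"})

;; Middleware can then read a single-valued header positionally:
(get-in response [:ring.response/headers "content-type" 0])
;; => "text/html"
```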
> I can't help looking at such backward compatibility from the performance perspective. Would this mean that a spec implementation must expose both sets of keys to the middleware/handlers?
No; an adapter would be started in Ring 1 or Ring 2 mode. Middleware would be able to handle both types, in the same way that middleware can currently handle asynchronous or synchronous handlers.
CPU performance impact is likely to be small; around 6ns added to each request per middleware, so we're talking under 100ns. This may be further reduced by CPU branch prediction, as the same conditional branch will be hit each time.
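As a sketch of what that per-middleware branch might look like, the check below distinguishes map flavours by the presence of a namespaced key, mirroring the ad-hoc version check discussed later in the thread. The helper names are made up for illustration.

```clojure
;; Hypothetical: distinguish a Ring 2 request from a Ring 1 request by
;; the presence of a namespaced key, then read the method accordingly.
(defn ring-2-request? [request]
  (contains? request :ring.request/method))

(defn request-method* [request]
  (if (ring-2-request? request)
    (:ring.request/method request)
    (:request-method request)))

(request-method* {:request-method :get})        ;; => :get  (Ring 1 map)
(request-method* {:ring.request/method :post})  ;; => :post (Ring 2 map)
```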
> From a brief glimpse, it feels like this can bring a noticeable overhead from growing the request map, allocating extra objects for the header values (the smallest vector is 240 bytes, btw), and so on.
In terms of larger request maps, yes this is true. However, it's far quicker to perform a lookup on a vector than to parse a string, so we'll save some CPU cycles in exchange.
> I would argue that websockets should not be supported until NIO has landed. WS is one of the strongest use cases for NIO, as a lot of applications have lots of mostly idle connections. Without NIO this is pretty inefficient due to the overheads in managing those connections.
Websockets won't be affected by this. I'm delaying NIO specifically for reading request bodies and writing response bodies. A websocket will not take up a thread while waiting for data.
> BTW, I should also add, great job on Ring 2 and in documenting the rationale behind the decisions. It was a pleasure to read through these documents and I can't wait to use it in anger.
Thanks! I'm going to try my best to get a complete alpha out in the next two months.
> If you are already using `get-in`, it's still simple:
>
> `(get-in request [:headers "content-type" 0])`
Yes, that's true.
I have a question. According to the spec, Ring says nothing about how to create the thread that handles the request, and almost every web server library (except aleph) doesn't depend on an async library either.

So is it possible to use threads in the core.async thread pool to handle the request?
@weavejester you're right, I was not paying enough attention.

I am however a little concerned about the send-message API. Is it sync or async, what happens if you have a slow consumer, and what's the backpressure story?
> So is it possible to use threads in the core.async thread pool to handle the request?
It's possible, but that's up to the adapter.
> > How about an async streaming API, if my handler wants to send chunks?
>
> That was added back in Ring 1.6, with the caveat that I/O is blocking. In other words, while your handler is actively writing data to the client it's blocking the thread. The problem with fully supporting non-blocking I/O in Ring 1 was that the request body was an `InputStream`, which is a blocking stream. In Ring 2, that's been changed, and in Ring 2.1 the plan is to have NIO protocols that the request body and response body can optionally satisfy. For Ring 2.0 I thought that sort of performance optimisation was a little out of scope, as it's a fairly niche requirement.
>
> > We use a context map a lot (with a bunch of deps like a db connection etc.) as the first parameter to handlers. Might that be useful for Ring?
>
> In general I think it's better to use closures for that: `(fn [context] (fn [request] ...))`. But I'd be interested in hearing about use-cases where closures are not as elegant. That said, I don't anticipate changing the request/response maps into context maps instead.
Closures cons:

- unaesthetic stack-trace;
- implicit state, not so easy to test or change on the fly;
- providing the request as part of the context can simplify response middleware;
- isn't a first-order function always better than a higher-order one? :)
- most handlers and mw would have to be wrapped this way.

The idea of an external chain executor instead of wrapping functions is also worth discussing. What do you think in general about Pedestal's interfaces?
> I am however a little concerned about the send-message API. Is it sync or async, what happens if you have a slow consumer, and what's the backpressure story?

Those are excellent points. Ring is designed to be adaptable to the widest range of libraries, which often means considering the lowest common denominator. In this case the design is a synchronous send, with backpressure and buffering concerns handled by the adapter.
However, your comment has made me reconsider the design. Though not every existing Java and Clojure websocket implementation supports asynchronous sending, the majority of them do, and the key advantage of an asynchronous send is that developers can implement bespoke backpressure handling.
I'm considering adding an optional protocol like:

```clojure
(defprotocol AsyncSocket
  (async-send [socket message callback]))
```

Or possibly just extending the current protocol. This will require a little hammock time, so feel free to suggest designs or raise further concerns in the meantime.
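To illustrate how such a protocol could be satisfied, and where a caller would hook backpressure logic into the callback, here is a toy in-memory implementation. The callback convention (`nil` meaning success) is an assumption for the sketch, not part of any proposed design.

```clojure
(defprotocol AsyncSocket
  (async-send [socket message callback]))

;; Toy socket: "sends" complete immediately and invoke the callback,
;; which is where bespoke backpressure handling would live.
(defrecord FakeSocket [sent]
  AsyncSocket
  (async-send [_ message callback]
    (swap! sent conj message)
    (callback nil))) ; nil = no error (assumed convention)

(def socket (->FakeSocket (atom [])))
(async-send socket "hello" (fn [err] (when err (println "send failed:" err))))
@(:sent socket)
;; => ["hello"]
```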
> Closures cons:
>
> - unaesthetic stack-trace;

Can you give an example?

> - implicit state, not so easy to test or change on the fly;

I'm not sure what you mean by this. Why does adding another lexical scope make things more implicit or harder to test?

> - providing the request as part of the context can simplify response middleware;

How so?

> - isn't a first-order function always better than a higher-order one? :)

Why?

> - most handlers and mw would have to be wrapped this way.

I can see handlers often requiring context, such as a database connection, but I've rarely needed that for middleware outside of authorization middleware. Could you give some examples of other middleware that would require a context?
> It's possible, but that's up to the adapter.

If a web server is written to support, for example, core.async, should it be designed to let the caller provide a handler like `(fn [request respond raise])`? I think it would be more natural to support using an async channel as the response.

The well-known Clojure async libraries (core.async, manifold) each have their own construct for representing an async channel/stream. How about making the response extensible instead of using this 3-arity function approach?

> context

Just merge your context map into the request. Once the request uses only qualified keys in the ring namespaces, there is no chance of collisions.
> The well-known Clojure async libraries (core.async, manifold) each have their own construct for representing an async channel/stream. How about making the response extensible instead of using this 3-arity function approach?

The 3-arity functions were introduced back in Ring 1.6, so that part isn't a new change. One advantage of the 3-arity design is that other async representations can be adapted onto it with very little code.

For example, suppose you have a handler that returns a core.async promise channel. You could easily convert that into a Ring handler:
```clojure
(defn core-async->ring [chan-handler]
  (fn [request respond raise]
    (core.async/take! (chan-handler request) respond)))
```

Or suppose you have a handler that returns a Manifold deferred:

```clojure
(defn manifold->ring [deferred-handler]
  (fn [request respond raise]
    (manifold.deferred/on-realized (deferred-handler request) respond raise)))
```
Ring doesn't support core.async directly because it's the bottom layer of the Clojure web stack. The goal of Ring is not to provide something that's necessarily easy for end developers to use, but to provide a simple and common foundation that tooling authors can build upon.
Or to put it another way: you can build "easy" on top of "simple", but it's hard to do the reverse.
> Just merge your context map into the request. Once the request uses only qualified keys in the ring namespaces, there is no chance of collisions.

This is also a possibility.
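A minimal sketch of that merge approach follows. `wrap-context` and the `:myapp.deps/db` key are made up for illustration; the point is that qualified context keys cannot collide with spec keys in the `ring.*` namespaces.

```clojure
;; Context lives under its own qualified keys, so it cannot collide
;; with spec keys in the ring.* namespaces.
(defn wrap-context [handler context]
  (fn [request] (handler (merge request context))))

(def handler
  (wrap-context
   (fn [request] {:status 200 :body (str "db: " (:myapp.deps/db request))})
   {:myapp.deps/db "conn"}))

(handler {:request-method :get})
;; => {:status 200, :body "db: conn"}
```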
The following thoughts may be wrong. I think there are two cases.

If the adapter doesn't use an async library directly, because it wants to talk to the Ring spec, the user can integrate them with something like `core-async->ring`, or put that part in a middleware. In this case, assuming the adapter has its own way to support async handlers, it will have a thread pool, so when using a handler or middleware as a bridge the result is two thread pools. I think this is an obvious resource cost.

If the adapter uses an async library underneath, it could technically follow this spec. But it's weird, because users will still prefer the way the async library models things.

I don't think Ring should rely on an async library. But maybe there's a way that, if the adapter follows the spec, the underlying sync/async part is replaceable and pluggable.
> In this case, assuming the adapter has its own way to support async handlers, it will have a thread pool, so when using a handler or middleware as a bridge the result is two thread pools. I think this is an obvious resource cost.
In practice this is unlikely to be an issue. You're not going to run into problems running a few extra threads unless you're running on extremely minimal hardware, and in such a case you probably wouldn't be using Clojure.
> But maybe there's a way that, if the adapter follows the spec, the underlying sync/async part is replaceable and pluggable.
The simplest solution would be to allow the executor used by the adapter to be specified in the option map, then you could share the executor between core.async and the adapter.
However, if you're resource starved to the point where you can't run more than a dozen threads, you probably have bigger problems.
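As a sketch of that idea: the `:executor` option and `run-adapter` below are hypothetical, not an actual adapter API; they only illustrate what sharing one pool between the adapter and other async machinery could look like.

```clojure
(import '(java.util.concurrent Executors ExecutorService))

;; One pool shared by the hypothetical adapter and any other async
;; machinery, instead of each component creating its own.
(def ^ExecutorService shared-pool (Executors/newFixedThreadPool 4))

(defn run-adapter [handler {:keys [executor]}]
  ;; stand-in for a real adapter: dispatch one fake request
  (.execute ^ExecutorService executor (fn [] (handler {:uri "/"}))))

(def result (promise))
(run-adapter (fn [req] (deliver result (:uri req))) {:executor shared-pool})
@result
;; => "/"
```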
About the async API: keep 1 arg and assoc onto the request map:

```clojure
:ring.request/async?     boolean?
:ring.request/on-respond fn?
:ring.request/on-raise   fn?
```
In this example you use an ad-hoc key (e.g. `:ring.request/method`) to check which spec version the map adheres to. Please add an explicit key with an explicit versioning scheme (e.g. `:ring/api-version`, or possibly different versions for req/resp/push maps, e.g. `:ring.request/api-version`?) so the spec is future-proof with regard to possible future updates.

> you can build "easy" on top of "simple", but it's hard to do the reverse.

Well said.
> About the async API: keep 1 arg and assoc onto the request map: `:ring.request/async? boolean?`, `:ring.request/on-respond fn?`, `:ring.request/on-raise fn?`

The 3-arity form for async responses was finalized 3 years ago in Ring 1.6, so I'd rather not change it unless there are significant advantages to doing so. Additionally, there are a few disadvantages to adding functions to the request map rather than passing them as arguments.

> Please add an explicit key with an explicit versioning scheme so the spec is future-proof with regard to possible future updates.

We don't need to add an additional versioning key just because there might be a third version of Ring several years (or decades) down the line. We can add a version key if and when we need it, and the absence of said key can denote an older version.

Secondly, I'm not sure we will need it. In Clojure we tend to type on individual keys, rather than maps as a whole. Future Ring specs would ideally just extend the existing maps with new keys. Additionally, we have namespaces now, allowing us to do things like `ring3.request/whatever` if we wanted to replace an existing key with one of the same name.
Thanks for all the effort you have put into the spec. A few comments:

> Middleware is implemented as higher-order functions that take one or more handlers and configuration options as arguments and return a new handler with the desired additional behaviour.

Would it be possible to separate creating and applying middleware? The current way of doing things is nice for simple apps with one handler chain, composed with `->`. If you have multiple chains, each chain will have its own instances of the middleware. Many middleware (like the default session middleware) have instance state (e.g. an in-memory session store), which won't be shared between chains unless explicitly overridden via options.
An example with 2 separate handlers, each with its own custom mw chain:

```clojure
(def routes1
  (-> handler1
      (wrap-something {:option 1})
      (session/wrap-session)
      (json/wrap-json)))

(def routes2
  (-> handler2
      (wrap-something {:option 2})
      (session/wrap-session)
      (json/wrap-json)))

(defn handler [{:ring.request/keys [path] :as req}]
  (condp = path
    "/route1" (routes1 req)
    "/route2" (routes2 req)))
```

... both now have their own versions of each mw, and at least the session middleware has its own in-memory store for both routes, making it effectively broken.
Suggestion: separate the applying and creating of middleware into separate steps, and change the naming of the middleware from `wrap-xyz` to `xyz-middleware` so it's easy to distinguish from the old model. `(session/session-middleware)` or `(session/session-middleware {:store ...})` should return a middleware function of `handler -> request -> response` and/or `handler -> request respond raise`.
The example could be rewritten as:

```clojure
;; just a single instance
(def session-middleware (session/session-middleware))

(def routes1
  (-> handler1
      ((something-middleware {:option 1})) ; double wrapping to make it work like before
      session-middleware                   ; the instance!
      ((json/json-middleware))))           ; the lib can decide to return a shared instance if needed
```
For simple chain creation, there could be a helper:

```clojure
(defn chain [handler & middleware]
  ((apply comp identity middleware) handler))
```
allowing:

```clojure
(def routes1
  (chain
   handler1
   (something-middleware {:option 1})
   session-middleware
   (json/json-middleware)))
```

In reitit, the interceptors use this pattern and it's much easier to grasp what happens and when.
Currently, there is no official way to know whether a middleware function has the 1- or 3-arity implemented. To know if a chain works in a given context (sync/async), you have to run the chain with both arities. This is really bad IMO.

Suggestions:

1) Get support for resolving function arities at runtime into clojure.core (or into a Ring utility lib?), and separate the creation and composing of middleware so that we can inspect the created middleware instances before composing them together. Related: https://stackoverflow.com/questions/1696693/clojure-how-to-find-out-the-arity-of-function-at-runtime
2) Change the middleware into interfaces/protocols, e.g. `SyncMiddleware` and `AsyncMiddleware` protocols.
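On suggestion 1), one way this can already be approximated on the JVM today is by reflecting over the compiled function class's `invoke` methods, as discussed in the linked Stack Overflow thread. This is a fragile hack (variadic arities need extra handling), shown here only as a sketch:

```clojure
;; A Clojure fn compiles to a class with one `invoke` method per fixed
;; arity; counting their parameters reveals which arities exist.
(defn fn-arities [f]
  (->> (.getDeclaredMethods (class f))
       (filter #(= "invoke" (.getName %)))
       (map #(alength (.getParameterTypes %)))
       set))

;; A middleware-produced handler supporting both sync and async calls:
(fn-arities (fn ([request] request)
              ([request respond raise] request)))
;; => #{1 3}
```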
I have some other comments too, but will write another comment later.
Thank you for writing this spec. I believe the spec tackles all the server-side challenges for JVM Clojure. I would like to propose two extension points:

1) The spec seems to be focussed on the JVM (e.g. the mention of `java.lang.Throwable`). Since Clojure is more and more cross-platform (clj, cljs, but also things like borkdude/sci, or Cljr, Clojerl, etc.), I would like to see the spec aim to target cross-platform Clojure (everything in cljc definitions). This would increase the reach of libraries built on top of this spec.

2) The spec, as the title announces, is focussed on the part after the server has received the request. I would like to propose extending the spec to also take into account the generation of the request (from the client). This would help in several ways.
> Many middleware (like the default session middleware) have instance state (e.g. an in-memory session store), which won't be shared between chains unless explicitly overridden via options.

As far as I'm aware, it's only the session middleware that has state. Could you give some more examples?

I'm not sure the use-case of default instance state is common for middleware, or necessitates changing the syntax. In addition, it would need to be backward compatible, so while a zero-argument middleware might be possible, a one-argument middleware would not.
Further, if this is an issue, might I suggest that we just create a function to solve it? Something like:

```clojure
(defn middleware-instance [middleware & args]
  (let [mw-handler
        (apply
         middleware
         (fn
           ([request] ((::handler request) request))
           ([request respond raise] ((::handler request) request respond raise)))
         args)]
    (fn [handler]
      (fn
        ([request]
         (mw-handler (assoc request ::handler handler)))
        ([request respond raise]
         (mw-handler (assoc request ::handler handler) respond raise))))))
```

My current inclination is that this is not a common enough use-case to warrant a middleware redesign.
> To know if a chain works in a given context (sync/async), you have to run the chain with both arities. This is really bad IMO.

Sorry, I'm not following you here. When you pass a handler to an adapter, it will either be run exclusively synchronously or exclusively asynchronously. Why would you need to check both arities?
> I would like to see the spec aim to target cross-platform Clojure (everything in cljc definitions).

This is something that's in consideration for the future, but out of scope for the current spec. Also, differences in platforms mean we can't make the entire specification universal. For example, synchronous handlers are not particularly useful in single-threaded environments such as ClojureScript.
> The spec, as the title announces, is focussed on the part after the server has received the request. I would like to propose extending the spec to also take into account the generation of the request (from the client).

That's not a bad idea, but that's also out of scope for the current release. It's a possibility for a future update, however.
> > Many middleware (like the default session middleware) have instance state (e.g. an in-memory session store), which won't be shared between chains unless explicitly overridden via options.
>
> As far as I'm aware, it's only the session middleware that has state. Could you give some more examples?

With a quick search, I couldn't find any other public examples out there. Hmm. Could the current session middleware be changed so that the memory store would be a shared value, like metrics-clojure does it?
> > To know if a chain works in a given context (sync/async), you have to run the chain with both arities. This is really bad IMO.
>
> Sorry, I'm not following you here. When you pass a handler to an adapter, it will either be run exclusively synchronously or exclusively asynchronously. Why would you need to check both arities?

Given a working sync Ring application with n unique routes, each with a custom middleware chain, how do I know if that application will work in async mode? Currently, either by reading the source code of all middleware and handlers, or by invoking the application with requests (and responds and raises) that touch all paths and seeing that there were no arity exceptions. It would be nice to be able to fail fast here.
> Could the current session middleware be changed so that the memory store would be a shared value, like metrics-clojure does it?

We can't change existing behaviour.

> Given a working sync Ring application with n unique routes, each with a custom middleware chain, how do I know if that application will work in async mode? Currently, either by reading the source code of all middleware and handlers, or by invoking the application with requests (and responds and raises) that touch all paths and seeing that there were no arity exceptions. It would be nice to be able to fail fast here.

Thanks for the clarification. Let me give this some thought.
> We can't change existing behaviour.
Aren't you already changing existing behavior (namespace-qualified map keys, etc.)? Is it an explicit goal for Ring 2.0 to be compatible with Ring 1.x middleware? If so, aren't the other changes already going to break them?
> Aren't you already changing existing behavior (namespace-qualified map keys, etc.)? Is it an explicit goal for Ring 2.0 to be compatible with Ring 1.x middleware? If so, aren't the other changes already going to break them?
Ring 2.0 will be fully backward compatible with Ring 1.0. See the ADR section on backward compatibility for more details.
Hi @weavejester,

What are your thoughts on moving Ring 2.0 requests/responses to protocols, i.e. implying a narrow interface (ILookup) with specific keys? This is discussed in #372.

Pedestal 2 has also evolved this way (currently being developed and used within Adzerk, although some other companies have seen the initial details). Later, when @ikitommi and Metosin started releasing their web projects, I reached out to let him know we had reached the same conclusion: a protocol describing reading (requests) and writing (responses) lets you bridge yourself into Ring-like things, without data copies or additional allocations.

I would love to collaborate with you and @ikitommi to get to a standard set of protocols we all sit on at the very foundation (if both of you were interested).
This is an issue for gathering feedback on the Ring 2.0 design. The code and documentation will be held in the 2.0 branch of the repository.

The first draft specification for Ring 2.0 is now available to read, and accompanying it is an architecture design record that attempts to explain the decision-making behind the design.

New features for Ring 2.0 include:

- change of the request body from `InputStream` to a protocol

Note that Ring does not use semantic versioning. Ring 2.0 will be backward compatible with Ring 1.0, as described in the ADR.

As of writing, this design has not yet been implemented in code. The next step is to create an alpha release that implements the draft spec, allowing the community to try the design out. I anticipate Ring 2.0 will remain in alpha for a while, in a similar fashion to Clojure spec, to allow plenty of time for community feedback and testing.