ring-clojure / ring-json

Ring middleware for handling JSON
316 stars 47 forks source link

Add stream support for response body - piped-io-stream impl #59

Closed pete23 closed 5 years ago

pete23 commented 5 years ago

Following patch implements streaming support as per discussion in https://github.com/ring-clojure/ring-json/issues/58 - apologies for the delay.

Note that streaming is configured for the whole handler - it's either all or nothing. Testing for a lazy sequence to enable the stream doesn't make sense in retrospect - I might have a large on-heap fully realised structure I can't create an on heap string for.

The default is no streaming; the wrapper needs to specify { :stream true } as an option to enable this functionality so I think your request that this be unobtrusive and not change any defaults has been honoured.

Added a couple of functional tests - let me know if you would like more coverage.

In terms of non-functional tests,

(defn core-handler [req] {:status 200 :headers {"Content-Type" "text/plain"} :body (repeat 100000000 {:a "hello" :b "world"})})
(def handler (wrap-json-response core-handler {:stream true :pretty true}))

This test crashes after just over a minute with java.lang.OutOfMemoryError: Java heap space with stream = false and default Java 12 JVM settings.

With stream = true it completes generation of 3.5Gb of data (at a rate of around 21.7Mb/s) with heap size fluctuating under 100Mb as garbage is generated and collected.

pete23 commented 5 years ago

OK think have resolved everything. Also updated the versions tested so I can confirm that the version I use in anger is ok.

Apologies if there any stupid PR newbie issues, this is my first time doing this on GH.

pete23 commented 5 years ago

OK - changes made as requested and commit squashed (now that's a scary process). Hopefully we are now good! If not let me know and we'll give it another shot.

weavejester commented 5 years ago

Sorry for taking so long to respond to this. I've been working my way through a sizeable backlog of PRs.

This PR uses piped-input-stream, but past Ring 1.6 there's a better way to achieve the same result. I'm debating whether to use piped-input-stream for compatibility, or to use StreamableResponseBody, or to check for the presence of the protocol and use one method or the other. I'm leaning toward the latter, but I may merge the PR and make the changes in a later commit.

pete23 commented 5 years ago

ring-json seems to have a dep on ring 1.6.2 anyway. Do you want me to recode this patch for the StreamableResponseBody protocol instead?

Runtime checking and providing both implementations seems like gilding the lily to me but obv your call.

weavejester commented 5 years ago

Let's use the StreamableResponseBody protocol, and if anyone complains we can add extra implementations :)

pete23 commented 5 years ago

Actually having read https://www.booleanknot.com/blog/2016/07/15/asynchronous-ring.html three times now I'm still not sure what the StreamableResponseBody version would look like. Either:

1) Extend the protocol for the types processed by wrap-json-response to stream out JSON, and hence obviate the whole middleware? ring-json currently picks up anything that gives true to coll so that would be pretty fundamental. Maybe not.

2) Have our own helper class that extends the protocol, wraps and then JSON serialises the body content.

So 2 then. Update the request to have a JSONResponseBody that implements StreamableResponseBody protocol, holds the original body, and does the Right Thing when called wrt streaming, and that this behaviour should be behind an optional flag in the same way as the current patch.

I'm out of steam for this evening and might not get a chance to look at this for a couple of days. I'll start from scratch with a separate PR when I do - I'd like to leave this here for now as it applies cleanly and does the job, in case anyone needs the job doing before we get it right:-)

weavejester commented 5 years ago

Right, I was thinking 2:

(defrecord JsonStreamingResponseBody [body options]
  ring.core.protocols/StreamableResponseBody
  (write-body-to-stream [_ _ output-stream]
    (json/generate-stream body (io/writer output-stream) options)))

And then replace generate-input-stream with ->JsonStreamingResponseBody.

This avoids having to pipe through an intermediate stream - we can just write directly to the output stream that's exposed by the web server.

pete23 commented 5 years ago

OK, this is where my head was at by the end of my previous post too. I’ll find time tomorrow to apply, test, and ship another PR.

pete23 commented 5 years ago

Ok the better version is in #60 now, I'll close this off.

weavejester commented 5 years ago

In general I prefer that changes are made to the existing PR rather than a new one created. There's no advantage to creating a new PR; it just makes it harder to read the thread of comments and code reviews.

pete23 commented 5 years ago

As the other version was a different implementation entirely I thought it warranted a separate PR.