haxetink / tink_http

Cross platform HTTP abstraction
https://haxetink.github.io/tink_http

Middleware #30

Closed benmerckx closed 7 years ago

benmerckx commented 8 years ago

Hi Juraj, I've watched your wwx talk. My thought afterwards was about the sharing of code between stuff that runs on tink_http. For example I've written initial gzip compression middleware for monsoon here. There's still a lot to be done there (and compression is probably better handled through nginx or the like), but aside from that I wonder if we could define some format using only tink_http requests/responses which would make it easy to create this as a separate lib which can then for example also be used in tink_web, or ufront if/once it supports tink_http. If a common format/interface can be defined I can create a sort of wrapper/abstract in monsoon to use this kind of middleware. It would also need to allow defining an action which would run after the response is 'done', and maybe some other actions; I haven't thought that part through yet. What do you think?

back2dos commented 8 years ago

Well, technically I think this is a middleware:

typedef Middleware = Handler->Handler;

It's almost insultingly simple, but I think that's what makes it so beautiful.

This would work for decompression:

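function decompress(h:Handler, ?handleFailure:Error->OutgoingResponse):Handler {
  // fall back to the default error response when no failure handler is given
  if (handleFailure == null)
    handleFailure = OutgoingResponse.reportError;
  return function (req:IncomingRequest)
    return
      if (req.isCompressed()) //too lazy to look it up, but let's imagine a static extension tells us this
        Future.async(function (cb) {
          // buffer the whole body, then hand a rebuilt request to the wrapped handler
          // (the actual decompression of `data` is left out of this sketch)
          req.body.all().handle(function (o) switch o {
            case Success(data): h(new IncomingRequest(req.clientIp, req.header, data)).handle(cb);
            case Failure(e): cb(handleFailure(e));
          });
        });
      else
        h(req); // uncompressed requests pass straight through
}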

For compression it would look similar. And for chunked responses, i.e. if the response has no content-length, construct a new response with chunked encoding - there are multiple ways to go about that.

Generally, tink_io could probably benefit from a counterpart to node's transform streams, so that all of this can be implemented in a streaming fashion.

Beyond that, I can also see how (rudimentary) caches could work pretty much exactly the same way. A cache might take a handler, possibly a function for calculating a key from an IncomingRequestHeader, and return a caching handler.
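A minimal sketch of that cache idea, under stated assumptions: `Request`, `Response` and `Handler` below are simplified synchronous stand-ins rather than the tink_http types (a real `Handler` returns a `Future<OutgoingResponse>`), and `cached` and `keyOf` are hypothetical names.

```haxe
// Simplified, synchronous stand-ins (assumption): NOT the tink_http types.
typedef Request = { method:String, path:String };
typedef Response = { status:Int, body:String };
typedef Handler = Request->Response;

class CacheSketch {
  // Hypothetical helper: wraps `inner` so each key is computed only once.
  public static function cached(inner:Handler, keyOf:Request->String):Handler {
    var store = new Map<String, Response>();
    return function (req:Request):Response {
      var key = keyOf(req);
      var hit = store.get(key);
      if (hit != null) return hit; // served from the cache, `inner` is not called
      var res = inner(req);
      store.set(key, res);
      return res;
    };
  }

  static function main() {
    var calls = 0;
    var slow:Handler = function (req) {
      calls++;
      return { status: 200, body: "page " + req.path };
    };
    var fast = cached(slow, function (req) return req.method + " " + req.path);
    fast({ method: "GET", path: "/a" });
    fast({ method: "GET", path: "/a" });
    trace(calls); // the wrapped handler ran only once
  }
}
```

The cache never evicts or expires anything, so it only illustrates the shape: a function from a handler (plus a key function) to a handler.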

I know this is not exactly in line with the way express/connect define middlewares, but I think it's more suitable for Haxe. Many things that express does in middlewares, I would prefer to execute later. I've tried to explain that in #18 already. So for example you might just define a static extension like so:

static function parseBody(req:IncomingRequest):Future<ParsedBody>;

//in your handler
@await req.parseBody();//not so much longer than `req.body`, but we do it just in time, which I think is better for security and performance

As for running something when the response is done (as in served?), consider this:

function onSourceEnd(s:IdealSource, cb:Void->Void):IdealSource { /* create some subclass of IdealSourceBase that forwards to `s` but calls `cb` when an EOF comes through */ }

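// takes the callback first and returns a Middleware, so the result really is Handler->Handler
function onEnd(cb:Void->Void):Middleware {
  return function (h:Handler):Handler
    return function (req)
      return h(req).map(function (res) return new OutgoingResponse(res.header, onSourceEnd(res.body, cb)));
}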

I'm only sketching this out and I'm not suggesting that it handles all the use cases express middlewares cover. But for those that fit, I would prefer to stick to this simple approach. As for the rest, I'm open to suggestions.

benmerckx commented 8 years ago

Thanks for the clarification, this makes sense. I'll see if I can get my current situation in line with this when there's a new tink_http release. As for being 'done', I wasn't really talking about the moment after a request is served. What I'm really struggling with is the order of executing the middleware. I'd like compression to run last, after other things that might change the response. This works for now, but the way I've implemented it feels suboptimal, or really just crap. But I think it's a limitation of the way expressjs handles things (which I've mostly copied). I know it's not exactly a perfect fit for Haxe, but even though I haven't done a lot of nodejs work in the past, what I really liked about express was that I could get started within a few minutes of scrolling through the documentation. I wanted to keep that going and also figured I wasn't going to do any better if I tried myself (at least for now). I hope taking this approach might point me in the right direction of getting a grip on the execution order through a simple api.

back2dos commented 8 years ago

Thinking out loud, I'd say there should be something like this (in tink_io?):

interface Compression {
  function compress(source:Source):Source;
  function decompress(source:Source):Source;
}

Then the different compressions could be provided as implementations of that interface. With that, decompressing an IncomingRequest becomes rather trivial.

back2dos commented 8 years ago

Well, "first" and "last" and generally "order" in that sense are a bit of an express-like notion. Their middleware appears to be modeled after synchronous imperative programming. In tink_http I'm leaning more towards a functional approach of composing values:

The types themselves define order. Consider the return type of Handler, which is Future<OutgoingResponse>. So we're saying: ok, give me the response when you're ready. In that response, all headers must already be set (technically you can modify the Array after creating it, but let's ignore that), but the body itself does not have to exist yet. It's just some stream that you have to get from somewhere, that generates data as the consuming Sink will require it in the end (thus sidestepping the whole issue of backpressure nicely). So you do get to start your response ASAP and produce the body step by step, but you simply can't output stuff to the body first and then try to set a header (as you can in most APIs, only to get exceptions). You also can't forget to output something (which in node hangs and in PHP/neko whitescreens). The actual response (the thing that is sent down to the client) is not a side-effect produced by your calls. It is the thing that you are required to return. Of course, you can return futures that never happen (e.g. by returning triggers you never set off or having a Future.async where you never call the callback) or sources that never end. Nothing is perfect ... but I think it's quite an improvement still.

Given the Compression interface above (and the implementations), it's easy to make a Middleware that takes a Handler and creates a new one that calls the wrapped handler with a (if necessary) decompressed body and maps the resulting future response through a function that compresses the body (if accepted ... also, the Compression should probably have a compressSafely for that). So rather than chaining middlewares in a linear fashion, you wrap them around handlers (where it's up to you if and when you want to modify the request or the response), layer by layer, a bit like growth rings, until you have yourself a big honking tree. Doing something first or last is a matter of being in the outermost layer.
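To make the layering concrete, here is a small sketch of how wrapping determines order. It rests on assumptions: the types are synchronous stand-ins and not the tink_http ones (a real `Handler` returns `Future<OutgoingResponse>`), and `tag` is a hypothetical helper that only records when a layer is entered and left.

```haxe
// Simplified, synchronous stand-ins (assumption): NOT the tink_http types.
typedef Request = { path:String };
typedef Response = { status:Int, body:String };
typedef Handler = Request->Response;
typedef Middleware = Handler->Handler;

class LayerSketch {
  // Hypothetical helper: a middleware that logs entering and leaving its layer.
  public static function tag(name:String, log:Array<String>):Middleware {
    return function (inner:Handler):Handler {
      return function (req:Request):Response {
        log.push(name + " in");  // sees the request before the inner layers
        var res = inner(req);
        log.push(name + " out"); // sees the response after the inner layers
        return res;
      };
    };
  }

  public static function run():Array<String> {
    var log:Array<String> = [];
    var app:Handler = function (req) {
      log.push("app");
      return { status: 200, body: "hello " + req.path };
    };
    // `compress` is the outermost ring, so it touches the response last.
    var handler = tag("compress", log)(tag("auth", log)(app));
    handler({ path: "/" });
    return log;
  }

  static function main() {
    // logs: compress in, auth in, app, auth out, compress out
    trace(run().join(", "));
  }
}
```

The outermost layer is both the first to see the request and the last to see the response, which is why a compression layer that should run "last" goes on the outside.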

You make a good point about making it easy to get started. I can see how tink_http might scare off newcomers, but in the end it is really about proposing a common ground for web frameworks to build onto. Which is not to say that I want it to be any harder to use than really necessary, so if there's something we can make less quirky, do let me know. I'd like to think that it's not really harder to deal with than vanilla node, but it's very hard to judge that from my position. That aside, I think it's good to make it very strict and low-level, so that frameworks get all the room to maneuver that they can have, take on the least possible overhead and are yet written against a robust and portable API that delegates enforcing the order of things to the type system, which usually does a very good job at ensuring things it understands.

Anyway, I hope this kind of shines a light on how to get control over execution order. If not, let me know. Or if you see a simpler alternative, I'm interested to learn more :)

back2dos commented 8 years ago

Ok guys, what do we want to do about this?

Maybe we need to bring this down from the abstract level and compile a list of "common" middlewares that every project might find useful and then see if we can find an abstraction (or a handful) to fit all those cases.

So please, everyone, brainstorm away! ;)

kevinresol commented 8 years ago

Some middleware I've used, though not all of it is "common", I guess

benmerckx commented 8 years ago

Well, Kevin already sums up a lot of useful stuff. I can add gzip, caching, basic auth... I'm also planning something for managing security headers similar to helmet.

I think I'm settling for this in monsoon (of which I keep a list). It allows me to add extra @:from definitions when needed. While Handler -> Handler works for expressing middleware, the 'inverted' way of applying those over each other doesn't feel intuitive at first. That can be solved in multiple ways, but is, I think, a problem for the framework over tink_http. In any case I'm rewriting everything now and am going for Handler -> Handler when defining middleware unless some other definition follows here.

One thing I wonder is if we can/should make a response more mutable. There's currently no way to set a header or the body after creating a response. Should middleware which, for example, adds some headers always construct a new response? And is it in that case the responsibility of the framework to hand the user some methods to do such transformations easily?
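For illustration, this is roughly what the "always construct a new response" option could look like. It is only a sketch under assumptions: the types are simplified synchronous stand-ins rather than tink_http's `OutgoingResponse`, and `withHeader` and `addHeader` are hypothetical helpers of the kind a framework might provide.

```haxe
// Simplified, synchronous stand-ins (assumption): NOT the tink_http types.
typedef HeaderField = { name:String, value:String };
typedef Response = { status:Int, headers:Array<HeaderField>, body:String };
typedef Handler = String->Response;

class HeaderSketch {
  // Hypothetical helper: returns a copy of `res` with one more header,
  // leaving the original response untouched.
  public static function withHeader(res:Response, name:String, value:String):Response {
    return {
      status: res.status,
      headers: res.headers.concat([{ name: name, value: value }]),
      body: res.body
    };
  }

  // Hypothetical middleware built on top of `withHeader`.
  public static function addHeader(name:String, value:String):Handler->Handler {
    return function (inner:Handler):Handler {
      return function (req:String):Response {
        return withHeader(inner(req), name, value);
      };
    };
  }

  static function main() {
    var app:Handler = function (req) return { status: 200, headers: [], body: req };
    var secured = addHeader("X-Frame-Options", "DENY")(app);
    trace(secured("/").headers.length); // one header was added
  }
}
```

With a helper like this the middleware stays a plain `Handler -> Handler`, and the copying is hidden behind one function call.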

kevinresol commented 7 years ago

I would add one more middleware here (just as a reminder): one that matches the incoming referrer and sets the allow-origin headers accordingly

kevinresol commented 7 years ago

Is there any chance that a middleware could work in a first in, first out way? The Handler->Handler form makes it go last in, first out, which is ok, unless someone needs it to be FIFO.
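One way to get a first-listed-runs-first reading while keeping `Handler->Handler` is to fold the list from the end, so the first entry ends up as the outermost layer. The sketch below is an assumption-laden illustration: `compose` and `mark` are hypothetical helpers, not part of tink_http, and `Handler` is reduced to `String->String` to keep the ordering visible.

```haxe
// Simplified stand-ins (assumption): NOT the tink_http types.
typedef Handler = String->String;
typedef Middleware = Handler->Handler;

class ComposeSketch {
  // Hypothetical helper: the first middleware in the list becomes the
  // outermost layer, so it is the first to see the request.
  public static function compose(middlewares:Array<Middleware>, inner:Handler):Handler {
    var handler = inner;
    var i = middlewares.length;
    while (i-- > 0) handler = middlewares[i](handler);
    return handler;
  }

  // Hypothetical middleware that appends its label to the request on the way in.
  public static function mark(label:String):Middleware {
    return function (next:Handler):Handler {
      return function (req:String):String {
        return next(req + ">" + label);
      };
    };
  }

  static function main() {
    var echo:Handler = function (req) return req;
    var handler = compose([mark("a"), mark("b")], echo);
    trace(handler("req")); // the request passed through a, then b: req>a>b
  }
}
```

On the response side the order is necessarily reversed: the first-listed middleware is the last one to see the response.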

kevinresol commented 7 years ago

Added PR: https://github.com/haxetink/tink_http/pull/58

kevinresol commented 7 years ago

Closing for now