Closed back2dos closed 8 years ago
Yeah, the container change sounds good to me.
But I just can't wrap my head around it when a streamed body meets chained handlers or pre/postprocessors, because if one handler exhausted the body the next handler won't be able to read it anymore. How does that work actually?
@kevinresol Well, such a pre/post-processor would construct a new body, which is basically what you would do for compression anyway. Or you can buffer the response and do stuff like this:
```haxe
function cache(h:Handler):Handler
  return function (req) return
    if (req.method != GET) h(req);
    else Future.async(function (cb) {
      h(req).handle(function (res) {
        res.body.all().handle(function (o) cb(switch o {
          case Success(bytes):
            var etag = haxe.crypto.Md5.encode(bytes);
            switch req.header.get('etag') {
              case [same] if (same == etag):
                new OutgoingResponse(
                  new OutgoingResponseHeader(304, "Not Modified", []),
                  Empty.instance
                );
              default:
                new OutgoingResponse(
                  new OutgoingResponseHeader(
                    res.header.statusCode,
                    res.header.reason,
                    res.header.fields.concat([new HeaderField('etag', etag)])
                  ),
                  bytes // use the buffered bytes - res.body is exhausted by now
                );
            }
          case Failure(e):
            // error handling elided in the original sketch; some error response goes here
            new OutgoingResponse(
              new OutgoingResponseHeader(500, "Internal Server Error", []),
              Std.string(e)
            );
        }));
      });
    });
```
Which would add caching to any handler you want.
But yes, you can't just modify the body in place, if that's what you're asking?
Maybe I am still thinking in the "expressjs" way, that there will be a "body-parser" which reads and parses the body into a readily-consumable form, which can be used by subsequent handlers.
Maybe the "tink" way is quite different. There seems to be one and only one Handler that would ultimately "consume" the body, so we don't need a separate body parser, because the Handler should be responsible for parsing the body itself. Isn't it?
You can do that in a vaguely "expressjs-like" way, if you construct a new request. A problem here is that requests are not that easily extensible, because of static typing (e.g. there's no place for a parsed body in `IncomingRequest`).
Some options are:

- Extending `IncomingRequest` for additional stuff.
- Redefining the handler type at the framework level: a `Handler` is `IncomingRequest->Future<OutgoingResponse>`; in some framework it could be `ParsedRequest->Future<OutgoingResponse>`, and then the framework's job is to transform `IncomingRequest` to `ParsedRequest`. You can look at monsoon's body parser here: https://github.com/benmerckx/monsoon#body, or at tink_web, where the routing generates the body parsing code.

But in any case, I think you will want a more faceted approach than express' one-size-fits-all solution. Unless someone here has a good idea how to make it work in a statically typed language? Otherwise I guess it will be up to frameworks to solve it on their own abstraction layer.
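For illustration, the second option might be sketched roughly like this. Note that `ParsedRequest`, `FrameworkHandler` and the error response are all made up for the example; only `Handler`, `IncomingRequest` and `body.all()` come from the discussion above:

```haxe
// Hypothetical framework-level request type carrying an already-buffered body
typedef ParsedRequest = {
  header: IncomingRequestHeader,
  body: haxe.io.Bytes, // or a more structured representation
}

typedef FrameworkHandler = ParsedRequest->Future<OutgoingResponse>;

// The framework's adapter: buffer the raw body once, then dispatch,
// so downstream handlers never race over an exhausted stream
function adapt(h:FrameworkHandler):Handler
  return function (req:IncomingRequest)
    return req.body.all().flatMap(function (o) return switch o {
      case Success(bytes): h({ header: req.header, body: bytes });
      case Failure(e):
        // error response shape assumed, as in the cache example
        Future.sync(new OutgoingResponse(
          new OutgoingResponseHeader(400, "Bad Request", []),
          Std.string(e)
        ));
    });
```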
Still, static file handling, compression and possibly even caching could be provided in framework-agnostic libs, as none of them need to change the structural representation of requests or responses.
I'm all for the api changes. One thing I wonder is if this would also be compatible with some future implementation of sending chunked responses. But I don't know if and how you'd like to handle that.
I also feel an IncomingRequest should be as immutable as possible. Express adds all kinds of stuff on the request, which feels wrong, especially in a typed language. It requires a bit more thought, but it should result in a more fail-safe API.
Regarding the body parser in monsoon: the middleware is not exactly finished, the idea being that middleware used in a previous method would always be passed along (so the parsed body string stays available, without having to parse the source again, which would fail). I'm now unsure if I had completely finished that or not.
Chunked responses can be handled by writing an abstract over `OutgoingResponse`, enumerated with a `Stream`; not a lot of code changes with a `@:from`.
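A hedged sketch of what that abstract might look like (the enum, the `Chunk` type and both names are assumptions, not existing tink_http API):

```haxe
// Illustrative only: a body that is either a plain source or a chunk stream
enum ResponseBody {
  Plain(src:Source);
  Chunked(chunks:Stream<Chunk>);
}

abstract OutgoingBody(ResponseBody) from ResponseBody {
  // a plain Source converts implicitly, so existing call sites keep working
  @:from static function ofSource(src:Source):OutgoingBody
    return Plain(src);

  // a stream of chunks would select chunked transfer encoding in the container
  @:from static function ofStream(chunks:Stream<Chunk>):OutgoingBody
    return Chunked(chunks);
}
```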
The only way to sensibly deal with the polymorphism of requests in Haxe is to use dependency injection to simulate polymorphic dispatch by closing over a constructor function, and then triggering a thunk.
Check veins-di, it's as simple as can possibly be, I think.
Even the highfalutin code in Haskell's pipes winds up with a call to `runEffect()` once all the composition is done, so there isn't much to be gained by complecting routing at this level.
my 2¢.
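The "close over a constructor function, then trigger a thunk" idea might be sketched like this (every name here is made up for illustration):

```haxe
// A thunk: deferred computation, nothing runs until it is called
typedef Thunk<T> = Void->T;

// Simulated polymorphic dispatch: the caller injects the "constructor"
// (a function building the result from the request), and evaluation is
// deferred, mirroring how pipes builds a pipeline before runEffect()
function dispatch<R>(req:IncomingRequest, make:IncomingRequest->R):Thunk<R>
  return function () return make(req);

// usage sketch: var run = dispatch(req, myHandler); /* later: */ run();
```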
@0b1kn00b agreed with the DI solution. @back2dos any updates for the new container api?
Ok, there's some progress here. I still have the `ModnekoContainer` to actually implement.
There's a somewhat bigger change I wanted to consult about with you guys first. It seems that in PHP (and also mod_neko) there's no real way to get raw multipart requests. The files are always stored on the disk and you get the names.
I'm therefore inclined to change the body of `IncomingRequest` to something like:

```haxe
enum IncomingRequestBody {
  Plain(source:Source);
  Multipart(parts:Stream<{ name:String, filename:String, content:Source }>); // or something like that ... will have to do some more reading
}
```
That would reflect the difference. For `TcpContainer` and `NodeContainer` you would always get plain sources (but the constructor might accept an argument to emulate the same behavior).
Thoughts anyone?
I am not sure anyone will want to move the saved file around without reading it again. Maybe `content` should be an enum: `Source(source:Source); File(path:String)`. Then if it is a `File`, the user can create the read stream from the path manually, or just copy it somewhere.
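A sketch of what that refinement might look like in a handler (the enum name, `pipeTo` target and `moveTo` helper are assumptions for illustration):

```haxe
// Illustrative only: the refined per-part content representation
enum PartContent {
  Source(source:Source);   // raw data, e.g. on node or plain TCP
  File(path:String);       // already saved to disk, e.g. on PHP/mod_neko
}

// A handler can then decide per part how to deal with the data:
function savePart(content:PartContent, dest:String) switch content {
  case Source(source):
    // stream the raw source into a file sink (exact piping API assumed)
    source.pipeTo(fileSink(dest));
  case File(path):
    // the container already wrote it to disk; just move it into place
    sys.FileSystem.rename(path, dest);
}
```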
Yep, you are right. I hate this already, but I don't see what else we could do ... :D
I don't really mind the IncomingRequestBody update, what is there to hate?
Well, there are just so many predetermined cases for the application to handle, instead of letting the application establish what cases it wants to discern. I'm actually using tink_http for proxying, in which case anything that interprets the request body (in particular by streaming it to temporary files) is a nuisance.
But even without that, I think how the body is treated should be the application's responsibility. Then the application can first route the request according to the URL, retrieve the user from session cookies, and then decide how much of what that user is allowed to stream to the server and how to handle the data. To have the user wait for a huge file to upload, only to tell them that their session timed out or their permissions changed in the meantime, is rather silly. The container should hand off the request to the application ASAP.
All this automatic behavior is surely well-intended, but it blurs the separation of concerns, potentially leading to quite negative effects. I had hoped to be able to opt out of it. The more I learn about HTTP, the more I am delighted by many aspects of its design, and the sadder I am to see how some of the most common interfaces for it sabotage that in a quest for convenience that should really be pursued at the framework/library level, rather than baked into the lowest one programmers get to deal with. I think nodejs got this right, while all the CGI derivatives made a crucial mistake. And what I "hate" is having to carry that unfortunate decision into our interface too.
stx_simplex might help solve the problem. It models a simplex: at every iteration you can either receive more input or push to output (and the dependent output handlers won't be called until you do); check the enum. I've got a feeling this might need tail recursion to operate properly, but if you look at Haskell's pipes mvc, this is the central piece: mapping an input stream to an output stream composably. I just got pipe working (I think).
Thanks for the explanation, I hadn't thought about the proxy use-case (I am using node-http-proxy in a project, should have a look at replacing that). Then again you wouldn't be able to proxy multipart requests through php/mod_neko anyway, or you'd have to reconstruct the whole body.
Am I right in thinking the body won't get parsed until you start using the parts stream (except for php/mod_neko, where you have no control over it)? And that the parsing could happen at a later stage (even in a separate lib maybe), but that would make things behave too differently on php/mod_neko vs the rest?
Indeed, the body will not be parsed at all, until the application decides to do it. For example if the user is not authenticated, then you might simply reject a POST-request without bothering to ever read the body. To me this seems far more flexible and more secure (assuming that the application author enforces sensible restrictions).
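The "reject before ever reading the body" idea could look roughly like this, using the handler shape from this thread (`isAuthenticated` is a made-up predicate on the headers):

```haxe
// Hypothetical guard: decide from the headers alone, never touch req.body
function requireAuth(h:Handler):Handler
  return function (req:IncomingRequest)
    return
      if (!isAuthenticated(req.header)) // e.g. inspect a session cookie
        Future.sync(new OutgoingResponse(
          new OutgoingResponseHeader(401, "Unauthorized", []),
          Empty.instance
        )) // the body is never read, so nothing gets buffered or parsed
      else h(req);
```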
I would like to (at least potentially) have a veritable zoo of containers, including Twisted and Django and some plain WSGI for Python, a Netty-based version for Java and what not (Justin's experiments with Lua in Nginx also interest me). In many of those cases the container will only give you the body in a preparsed way. So ultimately, I think we'll simply need a decent representation of parsed bodies (general enough to cover most use cases, but still actually usable). Reconsidering my impulse to make it a Stream, I think making it asynchronous is more trouble than it's worth, though, because the container layer will almost certainly provide the data all at once.
An additional abstraction could be created to always deliver parsed bodies, but that's probably best to be solved in an extra lib.
Ok, I think it's converging. There is still quite a bit of work to be done on the implementation side, but interface-wise I would propose the current state. Looking forward to your feedback ;)
Well, seeing the lack of opposition, I think I'll just polish this a bit and fill at least some of the implementation holes.
So IMHO this is now ok. If you have general problems with the changes, feel free to reopen this issue. Otherwise we can discuss more specific stuff in separate issues.
As @kevinresol pointed out in #17 there's no way to find out when the server closes. The truth is, I omitted this, because I had no good idea how to properly express it :D
Generally, the whole Container/Application API is a bit crude, to say the least. For example, nothing prevents you from running multiple applications on the same container, and what not. Also, in "cgi-like" environments, you don't get to react to the server closing (which is why I chose not to expose that at all).
My primary focus lies on a decent API of requests / responses and good implementations of the protocol itself, which is why I have neglected Container so far. But we should definitely deal with it. Maybe @benmerckx also has some input on that subject?
I think I would prefer to get rid of `Application` altogether, as it doesn't compose well (it's cumbersome to plug together two apps currently; further below I will illustrate how I feel it should be). So something like this might be good. It's a bit fatter than the status quo, but I think it models the different scenarios somewhat more closely.
Building on the above, this is what I mean by easier composition:
This is just an example of how nicely handlers could potentially be strung together to create complex apps. I'm not sure the code even compiles, and I don't mean to discuss its specifics, only to illustrate what could be won by having composable handlers rather than the bulky Application. Also, it seems a good idea to separate life cycle management from request handling. But I probably digress.
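One hedged sketch of such composition: every middleware below is a made-up `Handler->Handler`, and `router`/`container.run` are invented for the example, but the point is that complex apps fall out of plain function composition:

```haxe
// Each middleware wraps a Handler and returns a Handler, so they nest freely
var app:Handler = logger(gzip(cache(router([
  '/'       => home,    // hypothetical route handlers
  '/upload' => upload,
]))));

// life cycle management stays with the container, not the handlers
container.run(app);
```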
Anyway, my main point is, the current API needs revision anyway and I would like to replace it by something better, that does not just deal with the particular issue @kevinresol raised, but is something we can commit to.
So please pour all your thoughts here!