plack / psgi-specs

PSGI (Perl WSGI) specifications
http://plackperl.org/

logging broken requests #37

Open kazuho opened 9 years ago

kazuho commented 9 years ago

note: original discussion started at: https://github.com/kazuho/Starlet/pull/18#issuecomment-68482928

Ordinary HTTP daemons (e.g. Apache) are capable of logging broken requests. For example, when you send a\r\n\r\n to Apache, it emits 127.0.0.1 - - [02/Jan/2015:18:27:46 +0900] "a" 501 213 "-" "-" to the access log.

But logging such broken requests is impossible for PSGI-based standalone servers that rely on Plack middleware for access logging, since the PSGI specification does not allow servers to push broken (or incomplete) requests to handlers.

Should the spec be extended to allow Plack-based loggers to log broken requests?

miyagawa commented 9 years ago

Thanks for opening the issue!

As you can see from the thread on Starlet, there are two possible approaches:

Allow a psgix env var to indicate that the current request is incomplete/broken

The server will (optionally) mark a broken or invalid request with psgix.invalid_request or similar, and send it to the application as a regular PSGI env. The application will look at that PSGI environment variable and do whatever it wants, such as logging to a special error log or just throwing the request away.

Because the PSGI specification does not currently have a way for the application to communicate with the server, there needs to be a way for developers to turn this option on. Otherwise, PSGI applications that are not aware of it will be sent such requests, which can cause unexpected behavior (as stated in the original issue).
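To make the first suggestion concrete, here is a minimal sketch of what the application side might look like. It assumes the server sets psgix.invalid_request (the key name is only the one floated above, not part of the PSGI spec); guard_invalid_requests and the logger callback are hypothetical names used for illustration.

```perl
use strict;
use warnings;

# Hypothetical key taken from the proposal above; NOT part of the PSGI spec.
my $INVALID_KEY = 'psgix.invalid_request';

# Wrap a PSGI app so that requests flagged by the server are logged
# and answered here, without ever reaching the wrapped app.
sub guard_invalid_requests {
    my ($app, $log) = @_;
    return sub {
        my $env = shift;
        if ($env->{$INVALID_KEY}) {
            $log->(sprintf 'broken request from %s', $env->{REMOTE_ADDR} // '-');
            return [400, ['Content-Type' => 'text/plain'], ["Bad Request\n"]];
        }
        return $app->($env);
    };
}

my @log_lines;
my $app = guard_invalid_requests(
    sub { [200, ['Content-Type' => 'text/plain'], ["ok\n"]] },  # main app
    sub { push @log_lines, shift },                             # logger callback
);
```

An application that is not wrapped like this would receive the flagged env as if it were a normal request, which is why the option would have to be opt-in on the server side.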

Servers support an error handler app

In addition to the regular (main) PSGI application, servers will accept a separate PSGI application handler to handle such errors. This is close to Apache's error handler directive, and actually Plack has its ErrorDocument middleware that accepts such error handlers (although the implementation is a bit rough).

This probably does not need a revision to the PSGI spec itself, but there could be a standardized, common way to specify such error handlers across server implementations.

Because you specify two different PSGI applications, developers have complete control over what to do in the error case, which might be useful if you want to deploy a PSGI application written by a third party (such as one from GitHub or CPAN).

The downside is that you may have to duplicate the middleware stack for both the main application and the error handler application. Sharing a resource between them, such as an open UNIX socket or file handle, might also be difficult or impossible, depending on what you want to do.
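The second suggestion could be sketched roughly as follows. This is only an illustration under stated assumptions: dispatch, the $parse_ok flag, and the app/error_app option names are hypothetical and do not correspond to any existing server's interface. with_access_log is a tiny stand-in for an access-log middleware, applied to both apps to show the duplication mentioned above.

```perl
use strict;
use warnings;

my @access_log;

# Stand-in for an access-log middleware (hypothetical, not Plack's own).
sub with_access_log {
    my ($app) = @_;
    return sub {
        my $env = shift;
        my $res = $app->($env);
        push @access_log, sprintf '%s %d', $env->{REMOTE_ADDR} // '-', $res->[0];
        return $res;
    };
}

# Hypothetical server-side dispatch: $parse_ok stands in for the result
# of the server's own HTTP request parser.
sub dispatch {
    my ($env, $parse_ok, %apps) = @_;
    my $handler = $parse_ok ? $apps{app} : $apps{error_app};
    return $handler->($env);
}

# The same middleware has to be wrapped around both applications.
my %apps = (
    app       => with_access_log(sub { [200, ['Content-Type' => 'text/plain'], ["ok\n"]] }),
    error_app => with_access_log(sub { [400, ['Content-Type' => 'text/plain'], ["Bad Request\n"]] }),
);
```

Here the main application never sees a broken request, so a third-party app can be deployed unchanged; the cost is that anything the two stacks need to share (the @access_log array in this sketch) has to be arranged outside both of them.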