"Serve" command logs aggressively to console

SkyCrawl commented 8 years ago

Just ran into one more issue with coldsweat. When I tell the startup script to redirect all output to the NULL device and not to block execution, the result is:

SkyCrawl $ python sweat.py serve -r -p 3333 > /dev/null &
[1] 22896
SkyCrawl $ 10.0.0.45 - - [07/Sep/2016 21:54:40] "GET / HTTP/1.1" 200 46846
10.0.0.45 - - [07/Sep/2016 21:55:00] "GET /feeds/ HTTP/1.1" 200 32253
10.0.0.45 - - [07/Sep/2016 21:55:00] "GET /favicon.ico HTTP/1.1" 200 1150

Now the script certainly doesn't block further execution but it still prints output to the console and perhaps might interfere with further execution. Is there an option to suppress or redirect server logging?

EDIT: I think that ideally, there should be an option to redirect output either to coldsweat's log or the output defined by the calling script (i.e. my example above). Although personally I have no qualms about the console serving as default output, many applications prefer the log instead :). So, perhaps:

# should redirect to the log:
python sweat.py serve -log -r -p 3333 &
# should spawn an ambiguity error:
python sweat.py serve -log -r -p 3333 > /dev/null &
# if the '>' shell invocation is a problem for Python, perhaps:
python sweat.py serve -nolog -r -p 3333 &

A quick look has revealed the following traceback:

Seems like subclassing BaseHTTPRequestHandler and changing the following methods could be a solution:

log_request([code[, size]])
log_error(...)
log_message(format, ...)

See also:

passiomatic commented 8 years ago

Thank you for the analysis. I've never intended the "serve" command as a deploy solution but since you guys find it good enough let's find a way to make it less noisy.

First, I believe web server log verbosity should be controlled by the config log.level option. Second make_server() should always log to the Coldsweat logfile.

Hence, set the log level to INFO (the default) would log errors only (hopefully leaving the log almost empty). Set the log level to DEBUG would log server detailed activity (as today) and errors.

What do you think?

SkyCrawl commented 8 years ago

Personally, I found it good enough at the time I discovered and tested it :). Compared to other alternatives, Fever API and the simplicity of Coldsweat simply win. Up until now I haven't been able to compile/run coldsweat on my NAS because I categorically refused the hassle of cross-compilation but that issue seems to finally have been somewhat resolved. At this time, I want to be able to launch coldsweat automatically when the NAS starts though, so it's time for deployment :).

I believe web server log verbosity should be controlled by the config log.level option.

I tried setting the log level to INFO, WARNING and ERROR - none of them changed the outcome. The log file doesn't contain access log, I can only see some operational logs, e.g. changing location of a feed.

What do you think?

I think that applying a log level to web server logger partly defeats its purpose (if an error happens, you'll usually want to know which request spawned it and that's, by the looks of things, the INFO level). Personally, I would separate access and operational log levels. Also, there's the question of separating the logs "physically" (separate files).

EDIT: If it were up to me, I think I would have changed log.filename to something like log.op and add log.access, with the default of pointing both logs to the same file. Then, I would add a -deploy CLI option that:

Doesn't block.
Writes access log to the configured file.

Even if we can use & in Linux/Unix environments, Windows users might not be so lucky (hence the special option). If the -deploy option is not used, access log should probably still be directed to the console, regardless of the configured file.

This way, it seems quite platform-independent, flexible and simple to me :).

passiomatic commented 8 years ago

I tried setting the log level to INFO, WARNING and ERROR - none of them changed the outcome. The log file doesn't contain access log, I can only see some operational logs, e.g. changing location of a feed.

Yes, actually it doesn't change a thing - since log level does not influence the built-in web server logging. What I described in my previous comment was a possible development.

I think that applying a log level to web server logger partly defeats its purpose (if an error happens, you'll usually want to know which request spawned it). Personally, I would separate access and operational log levels :). Also, there's the question of separating the logs "physically" (separate files).

In a sense, yes, they are two different beasts and they should stay separated.

Perhaps the root cause of the issue is that log_error() and in turn log_message() outputs to sys.stderr instead of sys.stdout.

SkyCrawl commented 8 years ago

I updated my previous post.

Perhaps the root cause of the issue is that log_error() and in turn log_message() outputs to sys.stderr instead of sys.stdout.

Yes, that could be it :).

SkyCrawl commented 8 years ago

Wonderful, I can confirm that changing the log_message() method as described works like a charm :). Thanks! Full changelog for commands.py:

# replace:
from wsgiref.simple_server import make_server
# with:
from wsgiref.simple_server import *

# add:
class MyWSGIRequestHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        """Log an arbitrary message.
        This is used by all other logging functions. Override it
        if you have specific logging wishes.
        The first argument, FORMAT, is a format string for the
        message to be logged. If the format string contains any
        % escapes requiring parameters, they should be specified
        as subsequent arguments (it's just like printf!).

        The client ip address and current date/time are prefixed to every
        message.
        """
        sys.stdout.write("%s - - [%s] %s\n" %
                         (self.client_address[0],
                          self.log_date_time_string(),
                          format%args))

# replace:
httpd = make_server(address, options.port, cascade_app)
# with:
httpd = make_server(address, options.port, cascade_app, WSGIServer, MyWSGIRequestHandler)

A bit of a hack that comes with a little maintenance but it's probably a more correct solution as it respects the user's wishes. If you'd like me to make a PR, just say the word :).

passiomatic commented 8 years ago

No need for a PR, the change is trivial to do. It seems to work very well on OS X and it makes possibile to redirect output to /dev/null without the need of another command-line option.

This will land in 0.9.7, I guess.

SkyCrawl commented 8 years ago

Ok, many thanks and keep up the good work :).

passiomatic / coldsweat

"Serve" command logs aggressively to console #101