Open peachkeel opened 6 months ago
@pmeier @peachkeel It shall be done
This feature has been somewhat implemented in this branch on @peachkeel ragna fork. It is currently trying to mimic Common Logging Format.
Example Log:
('http', '127.0.0.1', 31476) - INFO [28/Mar/2024:16:39:54 +1300] - Uvicorn running on http://127.0.0.1:31476 (Press CTRL+C to quit)
The important lines are
log_config = uvicorn.config.LOGGING_CONFIG
log_config["formatters"]["access"]["fmt"] = "%(args)s - %(levelname)s [%(asctime)s] %(message)s"
log_config["formatters"]["default"]["fmt"] = "%(args)s - %(levelname)s [%(asctime)s] %(message)s"
log_config["formatters"]["access"]["datefmt"] = "%d/%b/%Y:%H:%M:%S %z"
log_config["formatters"]["default"]["datefmt"] = "%d/%b/%Y:%H:%M:%S %z"
uvicorn.run(
api_app(
config=config, ignore_unavailable_components=ignore_unavailable_components
),
host=config.api.hostname,
port=config.api.port,
log_config=log_config
)
A couple of questions:
%(args)s
in there? Isn't that information encoded in the startup message as well as the configuration file used? That seems like quite some noise to log with every message"access"
and "default"
formatters? "access"
makes sense to me. Is "default"
for all other messages?For now I will remove args and just add the timestamp like planned, but I'll see if I can find a way to add the host that doesn't look terrible.
@pmeier if you check the most recent commit, I've implemented the CLF standard more or less properly. I now understand the difference between access and default, one is for the api calls and one is for anything else.
The only difference is that we don't really have an ident parameter, so I've left it as "-", and I've added the logging level and replaced the bytes with the http status
@pmeier Are you satisfied with this style of logging? I'm happy to make whatever changes to it.
I know you and @peachkeel are closely connected, but the original feature request was for adding timestamps. And I'm totally aboard with that.
Regarding the CLF, I'm not sure. I'm certainly not very experienced with observability yet, but so far I haven't seen any application use it. That is not to say that it isn't used.
Meaning, I would be more comfortable if we only went with just the timestamps for now and maybe tackle a user defined timestamp and logging format as proposed in 4. of https://github.com/Quansight/ragna/issues/377#issuecomment-2024627533 later. Thoughts?
Hi, @pmeier. I'm a dinosaur, and I'm the one who suggested (offline) that maybe @Tengal-Teemo go with CLF since it has been recognized as a de facto standard:
https://www.rfc-editor.org/rfc/rfc6872.html
Such standardization brings benefits, especially in terms of reusing existing analytics tooling.
Of course, we don't have the bytes field as found in CLF, but that could be left empty with a "-" demarking it.
Anyway, I would support full customization with CLF as the default.
I'm a dinosaur
That could go at least two routes:
From the phrasing of the rest of the comment, I assume you mean something along the lines of 2., but I want to make sure here.
Anyway, I would support full customization with CLF as the default.
Happy to. Let's pivot in that direction.
One thing that I want to avoid, is to come up with a customization scheme that does not support common cases. One thing that I have seen a lot in the recent years is structured logging. I just did a quick search and found this blog that seems to suggest that it is not possible to do without actually injecting middleware into the application. @Tengal-Teemo could you make sure that this is the case?
If yes, we should be fine with just exposing the fmt
and datefmt
from https://github.com/Quansight/ragna/issues/377#issuecomment-2024627533 to the user. Otherwise, this might get more complicated.
@pmeier After looking into it, I believe that it depends what you mean by middleware. If you mean the BaseHTTPMiddleware they use, then no. Technically, you can just create a JsonFormatter
as the blog does and attach that to the logging root before you call uvicorn_run()
. The extra stuff they add can simply be obtained from the 'args' field of the record, and you could just check the record.name
to see if you're looking at a record generated by uvicorn.access
or not. However to do something like this yes you do need to override the default formatter used by uvicorn.
If we do want to expose the logs to the user, it might be better to expose a log_config.yaml
file that allows you to specify the fmt
and datefmt
, like below which I amended from this thread. However, this isn't the most workable framework for things like structured logging, as you're pretty much required to work within the formatters used by uvicorn if you use this method.
version: 1
disable_existing_loggers: False
formatters:
default:
"()": uvicorn.logging.DefaultFormatter
format: 'localhost:31477 - %(name)s [%(asctime)s] %(levelprefix)s %(message)s'
datefmt: '%d/%b/%Y:%H:%M:%S %z'
access:
"()": uvicorn.logging.AccessFormatter
format: '%(client_addr)s - %(name)s [%(asctime)s] %(levelprefix)s %(request_line)s %(status_code)s'
datefmt: '%d/%b/%Y:%H:%M:%S %z'
handlers:
default:
formatter: default
class: logging.StreamHandler
stream: ext://sys.stderr
access:
formatter: access
class: logging.StreamHandler
stream: ext://sys.stdout
loggers:
uvicorn.error:
level: INFO
handlers:
- default
propagate: no
uvicorn.access:
level: INFO
handlers:
- access
propagate: no
Ok, let's just roll with CLF for now and worry about this later. I prefer to make this configurable by the user, but having a more reasonable by default is higher prio now.
Feature description
Oftentimes, I'll start up Ragna like so to create a log:
ragna api 2>&1 | tee ragna.log
I can then
grep
the log to find errors or exceptions:grep -i -P -B1 "ERROR:" ragna.log
The results might look something like this:
It would be better, though, if they looked something like this:
Value and/or benefit
Frequently, I'm looking for
HTTP 429 Too Many Requests
or similar errors having to do with rate limiting. Knowing when they occurred can be beneficial in rescheduling API calls, especially if a daily limit has been breached.Yes, one could maybe look at the time the log file was last modified; however, that's a pretty fragile solution because any intentional or unintentional update to the log would result in the loss of that information.
Anything else?
cc @Tengal-Teemo