Open noklam opened 4 months ago
-1 on keeping the current logging.yml logic. Whoever wants fine-grained control of logs, file logging, rotation, etc. should be using journald, supervisor, Datadog, or whatever other solution. This is not Kedro's responsibility.
@astrojuanlu While I agree it's probably not what Kedro should do, it does help the development experience; alternatively, we would need some kind of progress bar, since that is roughly what Kedro's INFO logs are doing. Plus, I don't see a big problem with keeping `logging.yml`. Is there any major benefit in moving away from `logging.yml`? Changing `logging.yml` is easier, and we can do it in a non-breaking way in 0.19.x.
Adding some color to my earlier statements on OpenTelemetry, logging etc:
OpenTelemetry seems to be quite mature for traces (as pioneered by OpenTracing), metrics (Prometheus, the former OpenCensus) but not so much for logs. In fact, the client APIs for logging in Python are in development and seemingly unstable:
While signals are in development, breaking changes and performance issues MAY occur. Components SHOULD NOT be expected to be feature-complete. In some cases, the signal in Development MAY be discarded and removed entirely. Long-term dependencies SHOULD NOT be taken against signals in Development.
In fact, there seem to be some inconsistencies still.
Looks like good practice nowadays involves having a log collector (Promtail, Fluentd, Logstash, Grafana Agent Alloy) that then sends logs to a service (Loki, Elasticsearch).
The dream of having apps just log JSON to stdout is actually spelled out in the structlog docs:
Colorful and pretty printed log messages are nice during development when you locally run your code.
However, in production you should emit structured output (like JSON) which is a lot easier to parse by log aggregators.
A simple but powerful approach is to log to unbuffered standard out and let other tools take care of the rest.
That can be your terminal window while developing; it can be systemd redirecting your log entries to syslogd and rotating them using logrotate; or it can be your cluster manager forwarding them to an obscenely expensive log aggregator service.
So I still think that we shouldn't take too heavy-handed an approach to logging, but I now have more context on how this is actually achieved, and what to expect from the current ecosystem.
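To make the "log JSON to stdout" idea concrete, here is a minimal sketch using only the Python standard library rather than structlog. The `JsonFormatter` class, the `make_logger` helper, and the `demo.pipeline` logger name are hypothetical and chosen for illustration; none of this is Kedro's or structlog's API.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),  # applies %-style arguments
        }
        return json.dumps(payload)


def make_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` and nothing else."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the root logger from printing it again
    return logger


if __name__ == "__main__":
    # Rotation, shipping and aggregation are left to whatever runs the process.
    make_logger("demo.pipeline").info("node %s finished", "split_data")
```

With this shape, the application only decides what to log; where the lines end up is the job of the collector or process manager, which is the division of responsibility argued for above.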
Split out from #3591
Context
I did a demo a while ago showing how frustrating it is to try to change the logging level. Together with #3446, this ticket will make customising logging easier for our users.
#3591 is a separate ticket that changes logging directly for debugging purposes.
Problem
https://github.com/kedro-org/kedro/blob/da709d4316c141c5a7d6f676a87a5752807b33f4/kedro/templates/project/%7B%7B%20cookiecutter.repo_name%20%7D%7D/conf/logging.yml
There are many `level: INFO` settings in the template, and one may expect that changing them gives more verbose logging. In practice, you need to change multiple `INFO` values to `DEBUG` in order to see `DEBUG`-level messages. So we basically provide a knob that doesn't change anything (technically it does, but it's most likely not what our users need, and advanced users can figure out how to do advanced filtering).
Proposal
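For reference, the "knob that doesn't change anything" effect described under Problem can be reproduced with the standard library alone. This is a minimal sketch, not the Kedro template: the `demo` logger, the `console` handler, and the `emitted` helper are hypothetical names. It shows that a `DEBUG` record has to pass both the logger's level and the handler's level, so flipping only one of them has no visible effect.

```python
import io
import logging
import logging.config


def emitted(logger_level: str, handler_level: str) -> str:
    """Configure a hypothetical 'demo' logger and return what a DEBUG call emits."""
    stream = io.StringIO()
    logging.config.dictConfig(
        {
            "version": 1,
            "disable_existing_loggers": False,
            "handlers": {
                "console": {
                    "class": "logging.StreamHandler",
                    "level": handler_level,  # second filter: the handler
                    "stream": stream,
                }
            },
            "loggers": {
                "demo": {
                    "level": logger_level,  # first filter: the logger
                    "handlers": ["console"],
                    "propagate": False,
                }
            },
        }
    )
    logging.getLogger("demo").debug("debug message")
    return stream.getvalue()


if __name__ == "__main__":
    for logger_level, handler_level in [
        ("DEBUG", "INFO"),
        ("INFO", "DEBUG"),
        ("DEBUG", "DEBUG"),
    ]:
        shown = bool(emitted(logger_level, handler_level))
        print(f"logger={logger_level} handler={handler_level} shown={shown}")
```

Only the last combination lets the message through, which is why changing a single `level: INFO` in a config with several of them appears to do nothing.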
https://github.com/kedro-org/kedro/blob/da709d4316c141c5a7d6f676a87a5752807b33f4/kedro/templates/project/%7B%7B%20cookiecutter.repo_name%20%7D%7D/conf/logging.yml#L11-L16
logging.yml
https://github.com/kedro-org/kedro/issues/3446#issuecomment-1979711477
kedro run
If we do 1., this will basically mean adding an additional logger in the `loggers` section, but there is also the problem of how plugins can do this easily, or maybe it should be done at the package level. This can actually be solved by #3591; advanced settings will remain the same, which means adding a new `loggers` entry or setting this with package-level logging.
I don't have a better solution than the current one yet. Here are things that we know: