mlcommons / modelgauge

Make it easy to automatically and uniformly measure the behavior of many AI Systems.
https://mlcommons.org/ai-safety/
Apache License 2.0

Research logging library #89

Open brianwgoldman opened 5 months ago

brianwgoldman commented 5 months ago

We should have a standardized way to log in NewHELM. There was some discussion in #78, which I'm pulling out into an issue for future consideration. Some considerations:

dhosterman commented 5 months ago

I definitely agree that something closer to the standard logger would be preferable. Do we have any insight into what drove the creation of the custom logger to begin with? Seeing that set of requirements and reevaluating it seems like a good place to start.
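For comparison, a rough sketch of what leaning on the standard library could look like from a plugin author's perspective; the `newhelm.suts.example_sut` logger name and the `evaluate` function are placeholders, not anything in the current codebase:

```python
import logging

# Hypothetical plugin module; the logger name is illustrative only.
logger = logging.getLogger("newhelm.suts.example_sut")


def evaluate(prompt: str) -> str:
    logger.debug("Sending prompt to SUT: %r", prompt)
    response = "..."  # call the real SUT here
    logger.info("Received %d characters from SUT", len(response))
    return response
```

The appeal is that plugins only need `getLogger(__name__)`; how and where records get written stays a central configuration decision.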

wpietri commented 5 months ago

I agree that we could use better logging. My first question: who's the logging for, and what will they be doing? Some things that seem likely to me:

  1. Developer working on core code trying to debug something
  2. External developer adding a plugin and trying to get it right
    • SUT
    • Test
  3. External user trying to figure out why a third-party plugin isn't working and to write a good bug report
  4. MLC staff doing major benchmark runs:
    • making sure there are no errors
    • detecting non-critical problems, like a slow or flaky third-party API
    • solving some issue with a run

The last case definitely pulls me toward structured logging, as that way you can easily turn logs into alerts and metrics. I also think we should default to logging into one or more files with a fair bit of volume at the INFO level, and that it should be pretty easy to turn on DEBUG logging for the developer cases (1 and 2).
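To make that concrete, here's a minimal sketch (not the actual NewHELM setup) of file logging at INFO by default, an opt-in DEBUG flag, and JSON-lines output that downstream tooling could turn into metrics or alerts; the file name and function name are assumptions for illustration:

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Emit each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(log_file: str = "run.log", debug: bool = False) -> None:
    # INFO to a file by default; flip to DEBUG for developer cases 1 and 2.
    handler = logging.FileHandler(log_file)
    handler.setFormatter(JsonLineFormatter())
    logging.basicConfig(
        level=logging.DEBUG if debug else logging.INFO,
        handlers=[handler],
    )
```

Keeping the output one record per line also makes it easy to grep a big benchmark run or feed it into whatever alerting we end up with.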

For the user case (3), I think we mainly want to rely on console error messages, not logs, but it's inevitable that people will sometimes need to go deeper. So the logs should at least be easy to find and attach to a bug report.

Are there other use cases people have in mind?