Open tiran opened 1 month ago
cc @lintangsutawika for comment as well, but would be glad to accept these changes I think!
Thanks for bearing with us and for your interest--the Harness has grown from being a smaller research codebase to a now much more widely used library over time.
Thanks, I would love to have such a PR and learn about logging best practices!
For libraries, it is super easy. If you need a logger in a model, simply do:
logger = logging.getLogger(__name__)
and you are all set up. You should not import a logger from a central location, pass a logger as argument, configure logging, or do any other hacks with a logger.
The logging framework recognizes dotted Python names and creates a hierarchy from Python names. In your case, the logger from logging.getLogger("lm_eval.evaluator")
inherits settings like log level and handlers from its auto-generated parent logging.getLogger("lm_eval")
. The parent gets its settings from the root logger logging.getLogger()
.
When an application follows this practice, then an application can use predicable names to configure loggers:
import logging
logging.getLogger().setLevel(logging.ERROR)
logging.getLogger("lm_eval.evaluator").setLevel(logging.DEBUG)
import lm_eval.evaluator
This sets all logging to ERROR (including "lm_eval"
) and sets "lm_eval.evaluator"
to DEBUG although the module was not imported at the point in time.
Do these practices retain abilities of lm_eval when used as standalone application: default logging config and verbosity level control? If no, what will be proposed solution for backward compatibility of usage practice (some sort of setup under if __name__ = "main"
conditional maybe)?
@tiran I started a PR relevant to this. But I'm still unsure how to allow logging to also function like before.
I think I got it. Would still appreciate your thoughts on it.
lm-eval does not follow best practices for
logging
in libraries. This makes it harder to use lm-eval in applications that have their own opinions on logging configuration.If you are interested, I can create a PR to address the problems.
logging.basicConfig
The library calls
logging.basicConfig
at import time. Libraries should not configure logging themselves. It's the job of the application or main script to configure logging and set up log handlers according to its needs.basicConfig
is a one-shot API. The second and any following call are a no-op, unless it is called withforce=True
.https://github.com/EleutherAI/lm-evaluation-harness/blob/2b26690fe5d06551030031116d073656c8dc0f33/lm_eval/utils.py#L20-L24
Suggestion: Remove
basicConfig()
call and document that applications should configure logging.setLevel call
Two functions in
lm_eval.evaluator
calllogger.setLevel()
. It's problematic for two reasons. First, it makes it harder for applications to manage and control logging levels from a central location. It is common to have a command line option that sets all loggers to DEBUG (verbose mode) or to ERROR (quiet mode). Second, it's a performance problem.setLevel
acquires a global lock and evicts the logging framework's cache. The logging framework has to rebuild its entire hierarchy and inheritance tree.Suggestion: Remove the feature completely or at least make
setLevel
optional withverbosity: str | None = None
andif verbosity is not None: eval_logger.setLevel(getattr(logging, verbosity))
https://github.com/EleutherAI/lm-evaluation-harness/blob/2b26690fe5d06551030031116d073656c8dc0f33/lm_eval/evaluator.py#L135-L137