EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

Follow logging best practices for libraries #2102

Open tiran opened 1 month ago

tiran commented 1 month ago

lm-eval does not follow best practices for logging in libraries. This makes it harder to use lm-eval in applications that have their own opinions on logging configuration.

If you are interested, I can create a PR to address the problems.

logging.basicConfig

The library calls logging.basicConfig at import time. Libraries should not configure logging themselves; it is the job of the application or main script to configure logging and set up log handlers according to its needs. basicConfig is a one-shot API: the second and any following calls are no-ops unless it is called with force=True.

https://github.com/EleutherAI/lm-evaluation-harness/blob/2b26690fe5d06551030031116d073656c8dc0f33/lm_eval/utils.py#L20-L24

Suggestion: Remove basicConfig() call and document that applications should configure logging.
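
For illustration, a minimal standalone sketch (not lm_eval code) of how an import-time basicConfig call silently defeats the application's own configuration:

import logging

# What a library effectively does when it calls basicConfig at import time:
logging.basicConfig(level=logging.INFO)

# The application's own, later call is then a silent no-op, because the
# root logger already has a handler and force=True was not passed:
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

print(logging.getLogger().level)  # 20 (INFO), not 10 (DEBUG)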

setLevel call

Two functions in lm_eval.evaluator call logger.setLevel(). This is problematic for two reasons. First, it makes it harder for applications to manage and control logging levels from a central location; it is common to have a command line option that sets all loggers to DEBUG (verbose mode) or to ERROR (quiet mode). Second, it is a performance problem: setLevel acquires a global lock and evicts the logging framework's cache, forcing the framework to rebuild its entire hierarchy and inheritance tree.
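
For context, a sketch of the central-control pattern that a library-side setLevel() call interferes with (the flag names here are just illustrative):

import argparse
import logging

# The application decides, in one place, how verbose every logger should be.
parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true")
parser.add_argument("-q", "--quiet", action="store_true")
args = parser.parse_args()

if args.verbose:
    level = logging.DEBUG
elif args.quiet:
    level = logging.ERROR
else:
    level = logging.INFO
logging.basicConfig(level=level)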

Suggestion: Remove the feature completely, or at least make setLevel optional, e.g. verbosity: str | None = None combined with if verbosity is not None: eval_logger.setLevel(getattr(logging, verbosity)).

https://github.com/EleutherAI/lm-evaluation-harness/blob/2b26690fe5d06551030031116d073656c8dc0f33/lm_eval/evaluator.py#L135-L137
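
A sketch of that optional form (the surrounding function is a stand-in, not the actual signature of the functions in lm_eval.evaluator):

import logging

eval_logger = logging.getLogger("lm_eval")

def evaluate(verbosity: str | None = None) -> None:
    # Only touch the log level when the caller explicitly asks for it;
    # by default the application stays in control of logging.
    if verbosity is not None:
        eval_logger.setLevel(getattr(logging, verbosity))
    # ... the actual evaluation work would go here ...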

haileyschoelkopf commented 1 month ago

cc @lintangsutawika for comment as well, but would be glad to accept these changes I think!

Thanks for bearing with us and for your interest--over time the Harness has grown from a smaller research codebase into a much more widely used library.

lintangsutawika commented 1 month ago

Thanks, I would love to have such a PR and learn about logging best practices!

tiran commented 1 month ago

For libraries, it is super easy. If you need a logger in a module, simply do:

logger = logging.getLogger(__name__)

and you are all set up. You should not import a logger from a central location, pass a logger as an argument, configure logging, or do any other hacks with a logger.

The logging framework recognizes dotted module names and builds a logger hierarchy from them. In your case, the logger from logging.getLogger("lm_eval.evaluator") inherits settings like log level and handlers from its auto-generated parent logging.getLogger("lm_eval"). The parent gets its settings from the root logger logging.getLogger().

When a library follows this practice, an application can use predictable names to configure its loggers:

import logging

logging.getLogger().setLevel(logging.ERROR)                      # root logger: default everything to ERROR
logging.getLogger("lm_eval.evaluator").setLevel(logging.DEBUG)   # override a single submodule

import lm_eval.evaluator

This sets all logging to ERROR (including "lm_eval") and sets "lm_eval.evaluator" to DEBUG, even though the module had not been imported at that point in time.
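
Continuing the snippet above, a quick way to check the resulting effective levels (assuming the library no longer calls basicConfig at import time):

import logging

print(logging.getLogger("lm_eval.evaluator").getEffectiveLevel())  # 10 (DEBUG), set explicitly
print(logging.getLogger("lm_eval").getEffectiveLevel())            # 40 (ERROR), inherited from the root logger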

LSinev commented 1 month ago

Do these practices retain lm_eval's behavior when it is used as a standalone application, i.e. the default logging configuration and verbosity-level control? If not, what is the proposed solution for backward compatibility (some sort of setup under an if __name__ == "__main__" conditional, maybe)?
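
For example, roughly this shape in the CLI entry point (just a sketch, module and function names illustrative):

import logging

def main() -> None:
    # The script configures logging; library modules only ever call
    # logging.getLogger(__name__).
    logging.basicConfig(level=logging.INFO)
    # ... parse arguments and run the evaluation ...

if __name__ == "__main__":
    main()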

lintangsutawika commented 4 weeks ago

@tiran I started a PR related to this, but I'm still unsure how to keep logging working the way it did before.

lintangsutawika commented 4 weeks ago

I think I got it. Would still appreciate your thoughts on it.