Teradata / PyTd

A Python Module to make it easy to script powerful interactions with Teradata Database in a DevOps friendly way.
MIT License
108 stars 43 forks source link

configureLogging = True sets log level at root logger level #38

Closed ShaunCurrier closed 8 years ago

ShaunCurrier commented 8 years ago

The following linked code, at least, is blocking log records from other parts of my application from making it through the logging machinery. https://github.com/Teradata/PyTd/blob/master/teradata/udaexec.py#L253

I'm not well-versed in architecting logging but it seems like the teradata module should not be setting root logger properties since that will affect the logging of other parts of the application. Not sure though.

escheie commented 8 years ago

Setting the root log level was intentional. The idea was that if an application created its own loggers, then the log level configured through the external configuration would also set the log level for the other application loggers, not just the ones created by PyTd.

It should not be blocking other log messages though. The default root logger level is WARNING, so if PyTd module sets it to INFO by default, this should potentially cause more logging than the default. The root logger level is the default log level, if the level is set explicitly on another logger, that level overrides the root logger level for that logger and all its decedents.

I think the problem here is that the same level is also being set on the StreamHandler (https://github.com/Teradata/PyTd/blob/master/teradata/udaexec.py#L251) and FileHandler (https://github.com/Teradata/PyTd/blob/master/teradata/udaexec.py#L248) PyTd creates to log to the console and to a file. These levels should instead be NOTSET so that they inherit whatever the log levels are set on the loggers. Otherwise, if the application log level is INFO but a specific logger is set to DEBUG, the DEBUG messages would not appear in the console or to the file.

Logger.setLevel(lvl) Sets the threshold for this logger to lvl. Logging messages which are less severe than lvl will be ignored. When a logger is created, the level is set to NOTSET (which causes all messages to be processed when the logger is the root logger, or delegation to the parent when the logger is a non-root logger). Note that the root logger is created with level WARNING.

The term ‘delegation to the parent’ means that if a logger has a level of NOTSET, its chain of ancestor loggers is traversed until either an ancestor with a level other than NOTSET is found, or the root is reached.

If an ancestor is found with a level other than NOTSET, then that ancestor’s level is treated as the effective level of the logger where the ancestor search began, and is used to determine how a logging event is handled.

If the root is reached, and it has a level of NOTSET, then all messages will be processed. Otherwise, the root’s level will be used as the effective level.

ShaunCurrier commented 8 years ago

The idea was that if an application created its own loggers, then the log level configured through the external configuration would also set the log level for the other application loggers, not just the ones created by PyTd.

Could you clarify this idea for me? What does "external configuration" refer to? The logging parameters set by the rest of the application?

Maybe a better way to ask it is this: If I want to limit only the PyTd module's log records to WARNING and above but let everything else in the application be handled without interference, how would that be done? It's not clear to me if that is intended to be possible from your last comment.

ShaunCurrier commented 8 years ago

I've just tried commenting out the two lines suggested: StreamHandler (https://github.com/Teradata/PyTd/blob/master/teradata/udaexec.py#L251) and FileHandler (https://github.com/Teradata/PyTd/blob/master/teradata/udaexec.py#L248) ...and found that I did not get the desired behavior. logLevel was set to WARNING and configureLogging True, and even with the lines commented out, logging of other module's INFO messages to console were being blocked.

Here's my guess at what is desired. I simply replaced references to the root logger with references to PyTd's logger:

def _initLogging(self, logDir, logFile, logConsole, level, logRetention,
                 logMsgs):
    """Initialize UdaExec logging"""
    if not os.path.exists(logDir):
        os.makedirs(logDir)
    self._cleanupLogs(logDir, logRetention, logMsgs)
    self.logDir = os.path.realpath(logDir)
    self.logFile = os.path.join(self.logDir, logFile)
    formatter = logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s")
    fh = logging.FileHandler(self.logFile, mode="a", encoding="utf8")
    fh.setFormatter(formatter)
    fh.setLevel(level)
    sh = logging.StreamHandler(sys.stdout)
    sh.setFormatter(formatter)
    sh.setLevel(level)
    logger.setLevel(level)
    logger.addHandler(fh)
    if logConsole:
        logger.addHandler(sh)
    sys.excepthook = handleUncaughtException

This keeps PyTd from affecting logging from elsewhere in the app because it doesn't install the handlers on the root logger. Instead, it installs them on PyTd's logger. I believe this means that all PyTd messages will be "handled" (in the logging sense) identically to how they were before, except at the proper logging hierarchy level (PyTd's level) rather than the root logger level. I really think this is the preferred paradigm, but I admit I am no logging expert.

I ran this code and got the desired behavior in the one scenario I tested.

ShaunCurrier commented 8 years ago

Perhaps in my above code, the logger should be the teradata logger, not the teradata.udaexec logger, so that it would affect the teradata.tdodbc logger as well.

I'm also suspecting that logLevel should be set regardless of whether or not configureLogging is true. This would enable the fix for #37 to be utilized even without configureLogging = True.

escheie commented 8 years ago

If your having UdaExec configuring the logging subsystem (e.g. log to console, log to file), then why should it not configure the log level for the root logger? The root log level is WARNING by default. If UdaExec does not configure the root log level, then any other application specific loggers, unless independently configured would also only log at WARNING level.

The most common use for using PyTd is to write SQL scripts to perform some functionality. If a user defines a logger in their SQL Script, then they also want those log messages to appear in the logs. Setting the log level on the root logger enables this to happen.

If you or another import of your program is configuring logging, then just set configureLogging = False. That was the purpose of providing a way to disable UdaExec auto configuration of logging.

If your use case is to set the root log level to one value but the teradata loggers to another, then you can simply call logging.getLogger("teradata").setLevel(logging.XXXXXX) to set the log level of the teradata loggers to whatever you want.

I'm removing the calls to setLevel on the log handlers. I'm also going to add a check if loglevel == logging.NOTSET, and not call set logger on the root logger. This way, you can still have UdaExec create the console and file log handlers but not change the root log level.