aws-samples / amazon-textract-textractor

Analyze documents with Amazon Textract and generate output in multiple formats.
Apache License 2.0

Use module name for logger instead of Root Logger #367

Open michaelshum321 opened 1 month ago

michaelshum321 commented 1 month ago

Typically, it's best practice for Python logging to use logging.getLogger(__name__).

However, the ResponseParser simply does import logging and then calls logging.info(...) directly. The module-level logging functions always go through the root logger, as if the logger had been obtained with logging.getLogger() (no name argument).
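A minimal sketch of the difference, for illustration only: the NameCollector handler and the log messages are hypothetical, and the logger name "textractor.parsers.response_parser" is simply what __name__ would evaluate to inside that module. It records which logger each record came from.

```python
import logging


class NameCollector(logging.Handler):
    """Hypothetical helper: remembers the logger name of every record it sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def emit(self, record):
        self.names.append(record.name)


collector = NameCollector()
root = logging.getLogger()  # no argument: the root logger
root.addHandler(collector)
root.setLevel(logging.INFO)

# Module-level call, as response_parser.py does today: attributed to the root logger.
logging.info("parsed via module-level call")

# Named logger, as logging.getLogger(__name__) would give inside the module.
logging.getLogger("textractor.parsers.response_parser").info("parsed via named logger")

print(collector.names)
```

The first record carries the name "root", so there is nothing package-specific to configure against; the second carries the full module path and can be targeted by name.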

For example: https://github.com/aws-samples/amazon-textract-textractor/blob/9df5d268dead3f42104cde2f766cb16be3f93d95/textractor/parsers/response_parser.py#L148

The ResponseParser emits a large number of INFO logs, which spam our server logs. Because they come from the root logger, the only way we have found to suppress them is to attach a logging filter:

import logging


class TextractFilter(logging.Filter):
    """
    Since Textract uses the root logger, we cannot set the logger level
    without affecting other usage of the root logger.
    With this filter, we are able to filter INFO logs from Textract.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        # Drop INFO records emitted from response_parser.py; keep everything else.
        return not (record.module == "response_parser" and record.levelno == logging.INFO)


def configure_loggers() -> None:
    # This is for Textract to not spam the server with INFO logs.
    # logging.getLogger() with no argument returns the root logger on every
    # Python version; getLogger("root") only does so on Python 3.9+.
    logging.getLogger().addFilter(TextractFilter())


The TextractFilter works, but it is a workaround rather than best practice when all I want to do is something like:

logging.getLogger("textractor").setLevel(logging.WARNING)

Is there a better approach, or could we change the logger to use the module's __name__ so that it is easier to configure? Thank you :)
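A rough sketch of what the requested change would enable, under stated assumptions: parse_stub, ListHandler, the "myapp" logger, and the log messages are all hypothetical, and the logger name is spelled out only so the snippet runs standalone (inside response_parser.py it would be logging.getLogger(__name__)). This is not the library's current code.

```python
import logging

# Library side (assumed change): a module-level named logger instead of logging.info(...).
parser_logger = logging.getLogger("textractor.parsers.response_parser")


def parse_stub():
    """Hypothetical stand-in for the parser's logging calls."""
    parser_logger.info("verbose parsing detail")   # the kind of message that floods the logs
    parser_logger.warning("something worth seeing")


class ListHandler(logging.Handler):
    """Hypothetical helper: collects formatted messages so the effect is visible."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


captured = ListHandler()
root = logging.getLogger()
root.addHandler(captured)
root.setLevel(logging.INFO)

# Application side: one line quiets every logger under the "textractor" namespace.
logging.getLogger("textractor").setLevel(logging.WARNING)

parse_stub()
logging.getLogger("myapp").info("request handled")  # other INFO logs are unaffected

print(captured.messages)
```

Because child loggers inherit their effective level from the nearest configured ancestor, setting the level on "textractor" covers every submodule without touching the root logger or any other package.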

JorgeMSL commented 5 hours ago

UP I need this same fix! I'm having exactly the same problem as described by @michaelshum321.

I make the same suggestion:

Is there a better approach, or could we change the logger to use the module's __name__ so that it is easier to configure? Thank you :)

Is it possible to implement it? Thanks :)