madzak / python-json-logger

Json Formatter for the standard python logger
BSD 2-Clause "Simplified" License
1.74k stars 231 forks source link

Question: Handling http library logging variables like from Flask or Gunicorn #78

Closed LasseGravesen closed 1 year ago

LasseGravesen commented 5 years ago

I have been using this excellent library for many different services where it works great. However, I've tried to use it for some API services that I can't get it to properly log the 'extra' variables usually associated with HTTP clients, such as status code, remote host, remote port, path and so forth. So for instance, this is an example of a log entry using the json-logging library:

{"type": "request", "written_at": "2019-04-10T08:59:49.054Z", "written_ts": 1554886789054874000, "component_id": "-", "component_name": "api", "component_instance": 0, "correlation_id": "xyz", "remote_user": "-", "request": "/api/v1/ping", "referer": "-", "protocol": "HTTP/1.0", "method": "GET", "remote_ip": "123.123.123.123", "request_size_b": -1, "remote_host": "123.123.123.123", "remote_port": "40167", "request_received_at": "2019-04-10T08:59:49.049Z", "response_time_ms": 5, "response_status": 200, "response_size_b": 15, "response_content_type": "application/json", "response_sent_at": "2019-04-10T08:59:49.054Z"}

Where as with the python-json-logger library, the log entries look more like this:

{"asctime": "2019-04-10 09:01:28,220", "name": "gunicorn.error", "levelname": "DEBUG", "message": "GET /api/v1/ping"}

I've tried to add logging filters to attach the request and response data to the logs, but they don't really show up. So for instance, like the below code, doesn't really appear to do much.

    class FilterAddContext(logging.Filter):
        def filter(self, record):
            if request:
                record.endpoint = request.endpoint or ""
                return True
            else:
                return True

I much prefer the python-json-logger library over the json-logging library, but this is a part I'm struggling to get working correctly. It works for all other logging purposes, where I just need to specify extras but I haven't been able to get this to work properly.

Hope you can provide some insights on how this should be done.

paymog commented 5 years ago

What's the line of code you're using to log the extras?

LasseGravesen commented 5 years ago

@paymog Not totally sure, what you mean by what line of code, but for instance, I have this call:

current_app.logger.info(
        "read_org",
        extra={
            "request_id": g.request_id,
            "org_id": org_id,
            "year": year,
            "month": month,
            "sum": month_sum,
        },
    )

That shows up in the logs as this:

{
    "asctime": "2019-06-05 07:45:15,424",
    "name": "app",
    "levelname": "INFO",
    "message": "read_org",
    "request_id": "61ce7391-0ff1-47b4-8795-244355a313be",
    "org_id": 103193,
    "year": "2019",
    "month": "06",
    "sum": 0
}

Where I would like more information, like:

{
    "type": "request",
    "written_at": "2019-04-10T08:59:49.054Z",
    "written_ts": 1554886789054874000,
    "component_id": "-",
    "component_name": "api",
    "component_instance": 0,
    "correlation_id": "xyz",
    "remote_user": "-",
    "request": "/api/v1/orgs/1",
    "referer": "-",
    "protocol": "HTTP/1.0",
    "method": "GET",
    "remote_ip": "123.123.123.123",
    "request_size_b": -1,
    "remote_host": "123.123.123.123",
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0",
    "remote_port": "40167",
    "request_received_at": "2019-04-10T08:59:49.049Z",
    "response_time_ms": 5,
    "response_status": 200,
    "response_size_b": 15,
    "response_content_type": "application/json",
    "response_sent_at": "2019-04-10T08:59:49.054Z",
    "message": "read_org",
    "request_id": "61ce7391-0ff1-47b4-8795-244355a313be",
    "org_id": 103193,
    "year": "2019",
    "month": "06",
    "sum": 0
}

I've tried to use filters like this to add request / response info to the log record, but they don't show up.

    class FilterAddContext(logging.Filter):
        def filter(self, record):
            if request:
                record.endpoint = request.endpoint or ""
                return True
            else:
                return True

So I can add extra fields when I do the logging, but I would like a way to do it as a filter that I can set up in the factory of the app like this:

def setup_logging(app):
    """Configure logging."""
    logging.config.fileConfig("logging.conf")

    app.logger = logging.getLogger("app")

    class FilterBasic(logging.Filter):
        def filter(self, record):
            if request:
                endpoint = request.endpoint or ""
                record.endpoint = request.endpoint or ""
                return "basic" not in endpoint
            else:
                msg = record.getMessage() or ""
                return "ping" not in msg

    app.logger.addFilter(FilterBasic())
    logging.getLogger("gunicorn.access").addFilter(FilterBasic())
paymog commented 5 years ago

Have you tried creating a custom formatter which adds fields like the one in https://github.com/madzak/python-json-logger#customizing-fields? In that example,

        else:
            log_record['level'] = record.levelname

adds a new field to the json that gets logged

LasseGravesen commented 5 years ago

I define my logging in a logging file, like this:

[loggers]
keys = root,app,gunicorn.error,gunicorn.access

[logger_root]
level = INFO
handlers = custom
qualname = root
propagate = 0

[logger_app]
level = DEBUG
handlers = custom
qualname = app
propagate = 0

[logger_gunicorn.access]
level = DEBUG
handlers = custom_gunicorn
qualname = gunicorn.access
propagate = 0

[logger_gunicorn.error]
level = INFO
handlers = custom_gunicorn
qualname = gunicorn.error
propagate = 0

[handlers]
keys = custom_gunicorn

[handler_custom_gunicorn]
class = StreamHandler
level = DEBUG
formatter = json_gunicorn
args = (sys.stdout,)

[formatters]
keys = json_gunicorn

[formatter_json_gunicorn]
class = pythonjsonlogger.jsonlogger.JsonFormatter
format="(asctime) (name) (levelname) (message)"

So I would need to define a custom formatter that would live somewhere in the app directory, then reference that class in the logging.conf file. I'll try if I can make that work but it doesn't seem like a great approach, if possible I would prefer a filter that can be called by the app factory and added dynamically, like above with FilterBasic.

LasseGravesen commented 5 years ago

I've created a repo where I've tried to experiment with a few things. Can't really seem to get any of them to work properly.

https://github.com/FalconSocial/example-api-log

xmo-odoo commented 4 years ago

@LasseGravesen json-logging has built-in integration / instrumentation with various web frameworks, which is where the additional information come from.

If you want that in json-logger, you'll have to replicate the extraction of the request / response information onto the log record (or find an existing library which does this for you), as json-logger does not itself provide any special integration with external packages.