python / cpython

The Python programming language
https://www.python.org
Other
62.95k stars 30.15k forks source link

logging should cache LogRecord.message #122089

Open SmartHypercube opened 2 months ago

SmartHypercube commented 2 months ago

Bug report

Bug description:

logging currently does not cache LogRecord.message: https://github.com/python/cpython/blob/0dcbc8385322ff51f7fc3e586027d880275df4fa/Lib/logging/__init__.py#L391-L401 https://github.com/python/cpython/blob/0dcbc8385322ff51f7fc3e586027d880275df4fa/Lib/logging/__init__.py#L711

Since logging caches other string-formatting related results such as LogRecord.asctime and LogRecord.exc_text, I would like to suggest:

  1. LogRecord.getMessage should check self.message first. If it is not computed yet, it should compute the message and set this attribute before returning.
  2. Formatter shouldn't set record.message. It only needs to call LogRecord.getMessage to get the message.

How to reproduce

import logging

class A:
    def __repr__(self):
        print('repr called')
        return '<A>'

logger = logging.getLogger()
logger.addHandler(logging.StreamHandler())
logger.addHandler(logging.StreamHandler())
logger.error('%r', A())

Expected output

repr called
<A>
<A>

Actual output

repr called
<A>
repr called
<A>

CPython versions tested on:

3.12

Operating systems tested on:

Linux

ZeroIntensity commented 2 months ago

This looks like a feature, not a bug. Is there any reason to cache this? str and % are not expensive operations.

SmartHypercube commented 2 months ago

I don't have strong opinion about whether this should be a bug report or a feature request. Please change it if needed.

https://docs.python.org/3/howto/logging-cookbook.html#implementing-structured-logging shows 2 examples involving expensive __str__ methods. That's actually very similar to my use case. Currently, each Handler in my program calls json.dumps on the same structure repeatedly. I do think it is wasteful and impacting latency.

ZeroIntensity commented 2 months ago

I'm wondering if this would be a breaking change -- could people be relying on getMessage() to call __str__ every time, or something along those lines?

Zheaoli commented 2 months ago

I think this is a breaking change, we may need to discuss it in https://discuss.python.org/