open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0
1.82k stars 636 forks source link

Logs SDK: `*LogRecordProcessor` logs exceptions via python logging leading to recursion #4323

Open lmolkova opened 1 week ago

lmolkova commented 1 week ago

If log exporter raises an exception, it fails the process with RecursionError: maximum recursion depth exceeded

class MyExporter(LogExporter):
    def export(self, datas: Sequence[LogData]):
        raise Exception("Some exception")

    def shutdown(self):
        pass

    def force_flush(self, timeout_millis: int):
        pass

logger_provider = LoggerProvider(resource=resource)
logger_provider.add_log_record_processor(SimpleLogRecordProcessor(MyExporter()))
set_logger_provider(logger_provider)

handler = LoggingHandler(level=logging.NOTSET, logger_provider=logger_provider)
logging.getLogger().addHandler(handler)
logger1 = logging.getLogger("myapp.area1")
logger1.warning("Hello, World!")

You'd see something like

  ....
  File "path/to\Lib\logging\__init__.py", line 1684, in _log
    self.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "path/to\Lib\logging\__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1028, in handle
    self.emit(record)
  File "C:\Users\limolkova\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\opentelemetry\sdk\_logs\_internal\__init__.py", line 580, in emit
    logger.emit(self._translate(record))
  File "C:\Users\limolkova\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\opentelemetry\sdk\_logs\_internal\__init__.py", line 621, in emit
    self._multi_log_record_processor.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 322, in emit
    lp.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\export\__init__.py", line 131, in emit
    _logger.exception("Exception while exporting logs.")
  File "path/to\Lib\logging\__init__.py", line 1574, in exception
    self.error(msg, *args, exc_info=exc_info, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1684, in _log
    self.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "path/to\Lib\logging\__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1028, in handle
    self.emit(record)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 580, in emit
    logger.emit(self._translate(record))
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 621, in emit
    self._multi_log_record_processor.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 322, in emit
    lp.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\export\__init__.py", line 131, in emit
    _logger.exception("Exception while exporting logs.")
  File "path/to\Lib\logging\__init__.py", line 1574, in exception
    self.error(msg, *args, exc_info=exc_info, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1684, in _log
    self.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "path/to\Lib\logging\__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1028, in handle
    self.emit(record)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 580, in emit
    logger.emit(self._translate(record))
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 621, in emit
    self._multi_log_record_processor.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 322, in emit
    lp.emit(log_data)
  File "path/to\opentelemetry\sdk\_logs\_internal\export\__init__.py", line 131, in emit
    _logger.exception("Exception while exporting logs.")
  File "path/to\Lib\logging\__init__.py", line 1574, in exception
    self.error(msg, *args, exc_info=exc_info, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "path/to\Lib\logging\__init__.py", line 1684, in _log
    self.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "path/to\Lib\logging\__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "path/to\Lib\logging\__init__.py", line 1028, in handle
    self.emit(record)
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 580, in emit
    logger.emit(self._translate(record))
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "path/to\opentelemetry\sdk\_logs\_internal\__init__.py", line 504, in _translate
    span_context = get_current_span().get_span_context()
                   ^^^^^^^^^^^^^^^^^^
  File "path/to\opentelemetry\trace\propagation\__init__.py", line 48, in get_current_span
    span = get_value(_SPAN_KEY, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path/to\opentelemetry\context\__init__.py", line 94, in get_value
    return context.get(key) if context is not None else get_current().get(key)
                                                        ^^^^^^^^^^^^^
RecursionError: maximum recursion depth exceeded

This issue is originally discovered in a natural case of dependency version conflict and mismatch between exporter and the SDK. While the exporter and/or app is to blame, we should handle it gracefully and avoid killing the app process.

The recommendation:

Related to https://github.com/open-telemetry/opentelemetry-python-contrib/issues/2813

Part of https://github.com/open-telemetry/community/issues/1751

aabmass commented 4 days ago

Yes, this problem keeps coming up. We have a "meta" bug here for it #4261 and I agree we need to rethink the approach like the recommendations you mentioned:

The recommendation:

  • use a different channel to report SDK issues (not python logging) and keep it off by default. AND/OR
  • throttle self-reporting logs when publishing them to python.logging (e.g. log once and, if throttling, add a message on how to enable internal logging)