pytest-dev / pytest

The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
https://pytest.org
MIT License
12.18k stars 2.7k forks source link

Cannot use python logging without maintaining all records in caplog #8307

Open hilbichin opened 3 years ago

hilbichin commented 3 years ago

Description

I'd like to be able to use python's built-in logging, specifically to stream logging records to CLI and file, without having all log records stored in memory in the caplog fixture.

The reason I don't want the records to be stored in caplog is that this is essentially a forced memory leak. In the use-case where I discovered this, all DEBUG-level logging records were being stored persistently in caplog even though my CLI logging level was INFO, file handler level was DEBUG. The records were being streamed to file as expected, but I don't want them kept in memory. This was a very long-running test, the accumulating DEBUG records eventually resulted in a memory error.

Maybe I just don't know how to use caplog/logging in such a way that I can handle my logging streams as desired without keeping the log records in memory. If that's the case, would it be possible to get that clarification?

Environment


Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 23 2018, 23:31:17) [MSC v.1916 32 bit (Intel)]

system='Windows', release='10', version='10.0.16299', machine='AMD64', processor='Intel64 Family 6 Model 142 Stepping 10, GenuineIntel'

Package            Version
------------------ -------
atomicwrites       1.4.0
attrs              20.3.0
colorama           0.4.4
importlib-metadata 3.4.0
iniconfig          1.1.1
packaging          20.9
pip                20.3.3
pluggy             0.13.1
py                 1.10.0
pyparsing          2.4.7
pytest             6.2.2
setuptools         51.0.0
toml               0.10.2
typing-extensions  3.7.4.3
wheel              0.36.2
zipp               3.4.0

Example

import logging
logger = logging.getLogger(__name__)
test_logging_ex(caplog):
    assert not caplog.records
    logger.info("I'd like to see this message in the CLI, but I can't"
asottile commented 3 years ago

I believe the first line in the caplog documentation already addresses this: https://docs.pytest.org/en/stable/logging.html#caplog-fixture

Inside tests it is possible to change the log level for the captured log messages. This is supported by the caplog fixture:

hilbichin commented 3 years ago

I can control what gets captured by caplog. But, I cannot do this independently of other handlers to achieve the following:

This is easy outside of pytest, but with caplog I can't do it. Here is my setup:

In pytest.ini:

log_cli = true
log_cli_level = INFO
log_level = DEBUG

Result:

If I add a fixture that does:

log_names = [name for name in logging.root.manager.loggerDict]
for name in log_names:
    caplog.set_level(logging.INFO, logger=name)

Result:

hilbichin commented 3 years ago

Any update here? The ask is to be able to log debug logs to file, info logs to terminal, while not leaking memory by keeping every debug log in caplog.

SnufkinWasTaken commented 9 months ago

similar problem here - any update regarding this issue? ^^

lovetheguitar commented 8 months ago

To reduce pytest's memory consumption we currently settled for below approach. Maybe it helps someone. Feedback/improvements welcome.

def pytest_runtest_logreport(report: pytest.TestReport) -> None:
    """Drop captured logs of successful test phases to reduce pytest's excessive memory usage.

    .. warning :: No extended testing regarding possible side effects was done.

    This is a workaround for an intrinsic pytest memory leak, see relevant issues:
      - https://github.com/pytest-dev/pytest/issues/8307
      - https://github.com/pytest-dev/pytest/discussions/11704
      - https://github.com/pytest-dev/pytest/issues/3329
      - https://github.com/pytest-dev/pytest/issues/9215

    `prefix` taken from https://github.com/pytest-dev/pytest/blob/8a410d0ba60bca7073bf18a1503c59a3b5445780/src/_pytest/reports.py#L118-L125

    This hook gets called for every `pytest_runtest_protocol` phase, i.e. `setup`, `call`, `teardown`.

    Since we do not directly delete the private
    [`item._report_sections`](https://github.com/pytest-dev/pytest/blob/c967d508d62ad9e073d495ddfdca437188f2283e/src/_pytest/reports.py#L369-L370)
    to hopefully be "a bit safer" - the `report.sections` are appended for each consecutive hook
    call. This means that in later phases all the (log) sections of previous phases will be present
    again and we thus remove up to three sections without a warning.
    """  # noqa: E501  # allow long url
    # TODO: test if it would be sufficient to only drop the sections in the teardown phase
    if report.passed and report.caplog != "":
        prefix = "Captured log"

        sections_to_drop = []
        for section in report.sections:
            name = section[0]
            if name.startswith(prefix):
                sections_to_drop.append(section)

        if len(sections_to_drop) > 3:
            log.debug(
                f"Removing {sections_to_drop=} from captured logs of passed {report.nodeid=}."
            )
            warnings.warn(
                RuntimeWarning(
                    f"Dropping more than the maximum expected three sections "
                    f"({', '.join(section_name for section_name, _ in sections_to_drop)}) "
                    f"from the pytest report.sections object. This might be caused by "
                    f"other pytest plugins and might in turn cause problems for those."
                ),
                stacklevel=2,
            )

        for section_to_drop in sections_to_drop:
            report.sections.remove(section_to_drop)
ThermodynamicBeta commented 2 months ago

If I understand correctly, @lovetheguitar 's solution only works if you have multiple smaller tests, where setup, call and teardown will happen. In my use case, I have one long-running test that will eventually run out of memory due to the caplog. Is there any possible solution for that case?

RonnyPfannschmidt commented 2 months ago

At the sprint a plugin was started for mid test pruning

I didn't keep Tabs

lovetheguitar commented 2 months ago

@ThermodynamicBeta yes indeed, this hack above, as well as the plugin we started at the end of the sprint will only work for a test suite consisting of multiple tests. The plugin is local only still, but if people are interested that might motivate us to clean up things and get it out faster.

@RonnyPfannschmidt We used pytest_runtest_teardown, which in this scenario I‘d call "after test".

I don’t think there is much to expect from pytest or a plugin in the case of single long running test though. In that case no pytest hooks will be executed during the run, so whichever clean up solution would have to run concurrently with the test.

RonnyPfannschmidt commented 2 months ago

I recall discussing fixture to provide mid test culling at choose points

ThermodynamicBeta commented 3 days ago

I made a workaround with some help from some community members. Adding this to a test makes it so logging has a capped maximum effect on memory, with an obvious downside of only saving the last n=100 messages for the post-test summary.

from collections import deque
import logging

class DequeStringIO:
    def __init__(self, bound: int) -> None:
        self.buf = deque(maxlen=bound)

    def write(self, s: str) -> int:
        self.buf.append(s)
        return len(s)

    def getvalue(self) -> str:
        return "".join(self.buf)

for handler in logging.getLogger().handlers:
    if "LogCaptureHandler" in str(type(handler)):
        # there are two other handlers that don't appreciate having their stream overwritten
        handler.records = deque(maxlen=100)
        handler.stream = DequeStringIO(100)
The-Compiler commented 3 days ago

FWIW, instead of the hacky

    if "LogCaptureHandler" in str(type(handler)):

you might want to do a way more direct

    if isinstance(handler, _pytest.logging.LogCaptureHandler):

(yes, it's a private import - but you're already relying on private naming details to begin with!)

ThermodynamicBeta commented 3 days ago

I couldn't figure out how to import _pytest, how do you do that?

The-Compiler commented 3 days ago

Like any other module, the _ is just a naming convention: import _pytest.logging