numirias / pytest-json-report

🗒️ A pytest plugin to report test results as JSON
MIT License

Capturing logging output within the report #6

Closed rmcgregor1990 closed 6 years ago

rmcgregor1990 commented 6 years ago

Hey,

I would like the ability to include Python stdlib logging output in the report, under the test/phase in which it was produced. This would be useful regardless of the test's success or failure.

I could add this functionality myself if you added some more granular hooks for modifying the report around each test run/phase. Alternatively, I could technically implement it with the current API by managing my own logging output tree and merging it with the report at the end within the pytest_json_modifyreport hook. Is this how you intended it to be used?

My end goal here would be to essentially integrate structlog and its JSON log output format with your report structure for my own uses.

Cheers, Rob

numirias commented 6 years ago

Thanks for reporting!

The intended way to add metadata to a test item is via the json_metadata fixture.

E.g., you could use an autouse fixture which automatically surrounds every test call:

import pytest

@pytest.fixture(autouse=True)
def my_fixture(json_metadata, request):
    json_metadata['started'] = request.node.nodeid
    yield
    json_metadata['finished'] = request.node.nodeid

def test_foo():
    assert 1 + 1 == 2

This would give you a test item like:

{
    "nodeid": "test_foo.py::test_foo",
    "path": "test_foo.py",
    "outcome": "passed",
    ...
    "setup": {
        "duration": 0.001293182373046875,
        "outcome": "passed",
    },
    "call": {
        "duration": 0.00014352798461914062,
        "outcome": "passed"
    },
    "teardown": {
        "duration": 0.0002033710479736328,
        "outcome": "passed"
    },
    "metadata": {
        "started": "test_foo.py::test_foo",
        "finished": "test_foo.py::test_foo"
    }
}

In the same way, you could set up and tear down a logger. Would that work for you?
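
For example, roughly like this (an untested sketch; capture_logs and ListHandler are made-up names, and it only covers what runs inside the fixture's scope):

import logging
import pytest

class ListHandler(logging.Handler):
    """Collect formatted log lines in a list."""
    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))

@pytest.fixture(autouse=True)
def capture_logs(json_metadata):
    # Temporarily attach a handler to the root logger around each test call
    handler = ListHandler()
    root = logging.getLogger()
    root.addHandler(handler)
    try:
        yield
    finally:
        root.removeHandler(handler)
        # Shows up under "metadata" in the JSON report
        json_metadata["logs"] = handler.lines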

rmcgregor1990 commented 6 years ago

Thanks for the quick response,

I'm also going to want to record the logging output produced by setup/teardown of all fixtures used by a test. I don't think you can control fixture execution order, so I'm probably going to have to do the work at the plugin level.

My first approximation would be something like this:

import pytest
from contextlib import contextmanager
import logging
from collections import defaultdict

class LoggingHandler(logging.Handler):
    def __init__(self):
        super().__init__()
        self.captured_logs = []

    def emit(self, record):
        self.captured_logs.append(self.format(record))

class CapturePlugin(object):
    def __init__(self):
        self.captured_logs = defaultdict(dict)

    @pytest.hookimpl(hookwrapper=True)
    def pytest_runtest_setup(self, item):
        with self._capture_logs(item, 'setup'):
            yield

    @pytest.hookimpl(hookwrapper=True)
    def pytest_runtest_call(self, item):
        with self._capture_logs(item, 'call'):
            yield

    @pytest.hookimpl(hookwrapper=True)
    def pytest_runtest_teardown(self, item):
        with self._capture_logs(item, 'teardown'):
            yield

    @contextmanager
    def _capture_logs(self, item, phase):
        handler = LoggingHandler()
        logging.getLogger().addHandler(handler)
        try:
            yield
        finally:
            logging.getLogger().removeHandler(handler)
        self.captured_logs[item.nodeid][phase] = handler.captured_logs

    def pytest_json_modifyreport(self, json_report):
        # merge the logs into the report here...
        json_report["captured_logs"] = self.captured_logs

def pytest_configure(config):
    plugin = CapturePlugin()
    config._log_capture = plugin
    config.pluginmanager.register(plugin)

As you can see, it could be slightly nicer to have a hook for extending your report's test nodes as they are created.

Honestly, I always find myself needing all the log output saved when dealing with test systems (I mainly deal with tests that drive external hardware, so many remote things can go wrong). Is this something you would consider supporting natively?

numirias commented 6 years ago

Thanks for the elaboration.

As you can see, it could be slightly nicer to have a hook for extending your report's test nodes as they are created.

Would a simple pytest_json_modifytest / _modifystage hook be sufficient here?

Is this something you would consider supporting natively?

Generally, yes. Log captures are definitely a common use case. That said, I'm not entirely sure about the implementation. Pytest uses its own logging handlers which I'd want to take advantage of, if possible. I'd happily take PRs or suggestions.

rmcgregor1990 commented 6 years ago

Since this is really the only extra feature I'd want before using your package more widely, I'd be glad to help.

Pytest has recently integrated pytest-catchlog into its builtin plugins (https://docs.pytest.org/en/latest/logging.html), so pretty much all logging configuration is now handled nicely by the framework.

I'm not sure why we would need to mess with any of their logging handlers; any handler attached to the root logger will receive the same information. There is the question of assigning a formatter, though, and whether you want to store the logs as opaque strings or as more structured JSON objects.

In regards to the additional modify-test-item hook, maybe I'm being overly pedantic. There's something to be said for keeping the interface simple, and it's not really that challenging to merge within the pytest_json_modifyreport hook. I think you would probably need to think of a few more use cases before implementing such a thing.

numirias commented 6 years ago

I'm not sure why we would need to mess with any of their logging handlers; any handler attached to the root logger will receive the same information.

Yeah, you're right. I had expected that by using the root logger you'd bypass settings such as the chosen --log-level, but it turns out that's not the case.

There is the question of assigning a formatter, though, and whether you want to store the logs as opaque strings or as more structured JSON objects.

I'm strongly in favor of preserving the details in a dict (with timestamp, level, etc.). The disadvantage is that the user has to apply a formatter themselves to get a text stream from the JSON log, but that seems fine.
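
To illustrate (a rough sketch only; the exact keys stored per entry are illustrative here, not confirmed):

import logging

# Hypothetical log entry as it might appear in the report: a dict of
# LogRecord attributes (keys shown are only examples).
entry = {
    "name": "root",
    "msg": "heater reached target temperature",
    "levelname": "INFO",
    "levelno": 20,
    "created": 1528000000.0,
}

# To get a classic text line back, rebuild a record and run any stdlib
# formatter over it.
record = logging.makeLogRecord(entry)
formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
print(formatter.format(record))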

There's something to be said for keeping the interface simple, and it's not really that challenging to merge within the pytest_json_modifyreport hook.

Agreed. Well, I see the use cases, but adding lots of hooks for modifying different parts of the report seems a bit awkward.

I'll let you know when I've got logging implemented, and you're welcome to comment on that.

numirias commented 6 years ago

I implemented log capturing based on your approach.

There's now a log list on each test stage that captured logs. You can also call logging.makeLogRecord() on an entry to get back a proper LogRecord object.
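
For example, something along these lines (untested sketch; the report filename and the top-level "tests" list are placeholders for however the generated report is laid out):

import json
import logging

with open(".report.json") as f:
    report = json.load(f)

for test in report.get("tests", []):
    for stage in ("setup", "call", "teardown"):
        for entry in test.get(stage, {}).get("log", []):
            # Each entry is a dict of LogRecord attributes, so the stdlib
            # can turn it back into a real LogRecord.
            record = logging.makeLogRecord(entry)
            print(test["nodeid"], stage, record.getMessage())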

Let me know what you think.

rmcgregor1990 commented 6 years ago

I pretty much copied my approach from the internals of the pytest-catchlog package, so I can't take all the credit.

I've had a play with your branch; it looks great overall, I just have a few questions.

I like that you have preserved all the information about the log capture in a structured form. However, this is going to massively bloat the reports and affect their human readability. I've seen tests that can produce hundreds of log lines in just their setup phase, and that is going to dominate everything else.

I think a hook where you can transform a LogRecord into a JSON object of your choice could be quite useful, with your current implementation being the default. Something like pytest_json_logtojson or pytest_json_logjsonserializer, maybe. I appreciate this would cause you to lose the ability to reconstruct the LogRecord object.

Actually, it seems using a custom formatter to make the stdlib logging output JSON instead of opaque strings is not a new idea: https://github.com/madzak/python-json-logger However, format() is still expected to return a string (even if that string contains encoded JSON), so it's probably not appropriate here.
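
For reference, typical python-json-logger usage looks roughly like this (sketch from memory, untested):

import logging
from pythonjsonlogger import jsonlogger

handler = logging.StreamHandler()
# JsonFormatter still returns a JSON-encoded *string* per record, so the
# report would end up nesting text rather than real JSON objects.
handler.setFormatter(jsonlogger.JsonFormatter())
logging.getLogger().addHandler(handler)
logging.getLogger().warning("something happened")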

You only capture stdout and logging in the case of a failed test phase. I would like an option to capture everything regardless of outcome. There is still useful information in the output of a successful test, such as any form of subsystem profiling.

numirias commented 6 years ago

Thanks for taking the time to review!

I think a hook where you can transform a LogRecord into a JSON object of your choice could be quite useful, with your current implementation being the default.

Alternatively, I was thinking about an option to switch between the dict or a specified formatted string representation. But if you need more control, I'm fine with adding a hook, too.

I was also pondering if there is an elegant way to specify a generalized hook for modifying different sections of the report along the lines of pytest_json_modify(section, object) where the passed section can be "testitem", "teststage", "logitem", etc. Not sure if that's bad design though.

You only capture stdout and logging in the case of a failed test phase.

I didn't plan to exclude logging or stdout from successful tests. Could you give me a quick test case? I'm currently unable to reproduce that.

rmcgregor1990 commented 6 years ago

Turns out I can't reproduce it either. Sorry about that, I'm not sure what led me to that conclusion. It seems to capture everything I would want.

It's difficult to get a more granular hook system right. Multiplexing all the sections onto a single hook is fine. However, you want to make sure the user has access to all the relevant context about the section object's position within the report, as well as the object itself. I think this should probably be discussed in a separate issue, where we can outline the potential use cases for such hooks.

I think the current logging branch is OK to merge, with the appropriate update to the documentation. A command line argument to override the JSON log output with a logging format string could be useful, but I'd wait for someone to actually request it before implementing it.

The only situation where I'd want to remove structured information in favor of an opaque string is one where a human is viewing the JSON reports. In such a case you should probably be using pytest-html anyway.