traceback filtering hook

pytestbot commented 11 years ago

Originally reported by: Ronny Pfannschmidt (BitBucket: RonnyPfannschmidt, GitHub: RonnyPfannschmidt)

we need a way to filter the bits of a traceback that are framework code since in normal development they seem to distract from the real problem

i propose a hook+ soma pis to be able to replace sets of traceback entries by synthetic frames that will just output as note that framework frames have been skipped

and of course, the --fulltrace option would opt out of it

Bitbucket: https://bitbucket.org/pytest-dev/pytest/issue/283

nicoddemus commented 9 years ago

@RonnyPfannschmidt, __tracebackhide__ isn't enough for this cases?

RonnyPfannschmidt commented 9 years ago

@nicoddemus no, since Foreign frameworks do not use it

nicoddemus commented 9 years ago

HMmm I see, thanks. Could you post a more detailed example, including how the hook would work?

RonnyPfannschmidt commented 9 years ago

Think of sqlalchemy, django, pyramid

When extending those, they tend to add a lot of extra stack that's usually irrelevant,

I'm on tablet ATM so I avoid code writeup, however I don't get have any nice idea for the api

To me its a post copy py.code item

ringods commented 6 years ago

Is this still something which needs to be implemented in pytest?

RonnyPfannschmidt commented 6 years ago

yup

nicoddemus commented 6 years ago

To get the discussion rolling, how about this signature for a hook:

@pytest.hookspec(firstresult=True)
def pytest_filter_traceback_entry(filename, lineno, locals):
    """When pytest generates a traceback, this hook will be called for each traceback entry. If any
    hook returns `True` for any entry, that entry and all entries below it will be hidden from the final
    traceback that will be displayed to the users, except if `-vv` is given on the command-line. 
    """

This hook receives locals instead of the code or traceback entry object because I don't think we should be exposing those internal objects to users.

ringods commented 6 years ago

@nicoddemus I have a custom pytest plugin to run mamba tests. In the spec tests, I also use the expects library. Below you see the output with a forced error in my test suite, using the --tb=short output format.

../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/_pytest/runner.py:192: in __init__
    self.result = func()
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/_pytest/runner.py:178: in <lambda>
    return CallInfo(lambda: ihook(item=item, **kwds), when=when)
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/__init__.py:617: in __call__
    return self._hookexec(self, self._nonwrappers + self._wrappers, kwargs)
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/__init__.py:222: in _hookexec
    return self._inner_hookexec(hook, methods, kwargs)
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/__init__.py:216: in <lambda>
    firstresult=hook.spec_opts.get('firstresult'),
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/callers.py:201: in _multicall
    return outcome.get_result()
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/callers.py:76: in get_result
    raise ex[1].with_traceback(ex[2])
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/pluggy/callers.py:180: in _multicall
    res = hook_impl.function(*args)
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/_pytest/runner.py:109: in pytest_runtest_call
    item.runtest()
python-mamba/pytest_mamba/plugin.py:134: in runtest
    raise mamba_error.exception
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/mamba/example.py:43: in _execute_test
    self.test(execution_context)
spec/action_base_spec.py:44: in 00000007__it is ok--
    expect(rs).to(equal('value2'))
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/expects/expectations.py:19: in to
    self._assert(matcher)
../../../../.pyenv/versions/3.6.4/envs/metis/lib/python3.6/site-packages/expects/expectations.py:25: in _assert
    raise AssertionError(self._failure_message(matcher, reasons))
E   AssertionError:
E   expected: 'value' to equal 'value2'

For this example, could you tell me how many times the callback will be triggered and which arguments I will get for the above?

What I would like to see left over from the traceback above after filtering:

spec/action_base_spec.py:44: in 00000007__it is ok--
    expect(rs).to(equal('value2'))
E   AssertionError:
E   expected: 'value' to equal 'value2'

So everything from pytest itself filtered and everything from the expects library filtered.

RonnyPfannschmidt commented 6 years ago

@nicoddemus such a hook will not allow to fold framework bits

nicoddemus commented 6 years ago

such a hook will not allow to fold framework bits

I thought this would do the trick (simplified of course):

def pytest_filter_traceback_entry(filename, lineno, locals):
    import sqlalchemy
    return filename.startswith(os.path.dirname(sqlalquemy.__file__))

Or even exclude everything from inside site-packages:

def pytest_filter_traceback_entry(filename, lineno, locals):
    import site
    for path in site.getsitepackages():
        if filename.startswith(path):
            return True

RonnyPfannschmidt commented 6 years ago

@nicoddemus something displaying the trace-backs will not know that a framework section is being filtered, we would also be unable to show it again if needed

RonnyPfannschmidt commented 6 years ago

@nicoddemus i understand why just throwing them away in our pretty limited painful terminal print only ux makes sense, but anything more interactive will just be really bad if its crippled by the limits of our current bad ux

nicoddemus commented 6 years ago

something displaying the trace-backs will not know that a framework section is being filtered, we would also be unable to show it again if needed

Not sure I understand why those limitations are a given based on the hook proposed, after all the code which uses the hook itself might hide the tracebacks and keep them around for other purposes if needed.

i understand why just throwing them away in our pretty limited painful terminal print only ux makes sense, but anything more interactive will just be really bad if its crippled by the limits of our current bad ux

As I said, we might not just throw them away, but might set an attribute "hidden" or something like that that each UX can deal with appropriately.

Another idea: it might make sense to return a category instead, for example "third-party" or "builtin". This way a richer UI (like pytest-html) might decide to use that information to collapse sections and/or show them in different colors.

It seems like you have some requirements, but they are only in your head. It suggest you write down those requirements or at least some vague ideas to help the discussion.

MichaelKorn commented 3 years ago

I had the same problem and hope my solution can serve to show an implementation approach or at least a workaround. During the component and integration tests sometimes it is necessary to wait for the expected state, I use the tenacity package (active waiting by repeating the check with a time limit). If a test fails, the tenacity module generates a lot of noise in the pytest output and I filtered this out with a hook:

@pytest.hookimpl(hookwrapper=True, trylast=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()

    if report.when != 'call':
        return

    # remove noise (stacktrace) from the tenacity module if there is an exception during waiting
    try:
        reprentries = [x for x in report.longrepr.chain[0][0].reprentries[0:-1] if
                       "tenacity/__init__.py" not in x.reprfileloc.path and
                       "concurrent/futures/_base.py" not in x.reprfileloc.path and
                       "attempt in Wait" not in x.lines[-1]
                       ]
        reprentries.append(report.longrepr.chain[0][0].reprentries[-1])  # make sure to keep the last element of the chain
        report.longrepr.chain[0][0].reprentries = reprentries
    except AttributeError:
        pass

jettero commented 2 years ago

I'm working with tensorflow, so without a good way to do this, I get like 19 pages of stuff I really don't care about and only a couple lines of things I do. I really want that long format output with the local variables and stuff. I just couldn't care less about the 19 billion lines of tensorflow's inner workings when I'm really just having a minor dimensionality problem...

I did something similar to @MichaelKorn I think, although, I arguably did it in the wrong place based on a stackoverflow comment.

# idea lifted from https://stackoverflow.com/a/56813896/388466
HERE = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))

def pytest_exception_interact(node, call, report):
    def my_filter(tb_ent):
        if str(tb_ent.path).startswith(HERE):
            return True
        return False

    call.excinfo.traceback = call.excinfo.traceback.filter(my_filter)
    report.longrepr = call.excinfo.getrepr(tbfilter=True, style='long', funcargs=True, showlocals=True)

It'd be nice if there was a clean interface for this. I definitely understand the hesitation wrt to cutting out parts of the traceback. It's true there's no generalized way to show what developers actually want to see, but giving the developers some way to tell pytest what they want to see seems like a no brainer.

azmeuk commented 2 years ago

I think a whitelist strategy would be cheap and flexible enough. Set an option that lists the code directories for which the traceback should be displayed, and hide the rest. I think that by default most users will set their test directory and their library/webapp directory.

nicoddemus commented 2 years ago

I think a whitelist strategy would be cheap and flexible enough.

That's a great idea. 👍

We could have a new option for that, which is backwards compatible and probably simple to implement too.

brondsem commented 2 years ago

For reference, werkzeug's debugger differentiates library code from main/app code by checking sysconfig.get_paths(): https://github.com/pallets/werkzeug/blob/dbf78f674dda529e0ee5cb0fad7892a04e61bec1/src/werkzeug/debug/tbtools.py#L367

That maybe could be used here too, maybe as a default. User-configurable seems like a good idea too.

foxx commented 1 year ago

This could potentially be a good solution for reducing traceback noise in pytest when using decorators from upstream libraries, where you can't easily use __tracebackhide__. This is exascerbated by the lack of newlines between function calls on the stacktrace output, making it quite difficult to read.

It would be nice to have a way for filtering tracebacks based on arbitrary rules to make these less noisy, perhaps even a simple list of regex patterns to determine if a function should be removed from the stacktrace.

For example, when using beartype, you'll see the function is shown twice, once for the beartype decorator and once for the function method its wrapping. Although this isn't an upstream issue, I'll shamelessly /cc @leycec (beartype maintainer) for his exceptional knowledge (pun intended) on the internals of Python, as he may have already come across this with other exception handling libraries.

I'm not familiar enough with the internals of pytest to say how difficult this would be, but if a core dev could chime in with some hints that'd be super helpful. /cc @nicoddemus ?

<@beartype(XXX.Loader._coerce_complex_object) at 0x1e5d5420160>:30: in _coerce_complex_object 
    ???
        __beartype_args_len = 2
        __beartype_func = <function PSCliXmlLoader._coerce_complex_object at 0x000001E5D5420160>
        __beartype_get_violation = <function get_beartype_violation at 0x000001E5AE567910>
        __beartype_object_140715471122928 = <class 'xml.etree.ElementTree.Element'>
        __beartype_pith_0 = <Element 'Obj' at 0x000001E5D53EF1A0>
        args       = (<XXX.Loader object at 0x000001E5D540D720>, <Element 'Obj' at 0x000001E5D53EF1A0>)
        kwargs     = {}

XXX\XXX.py:92: in _coerce_complex_object
    tn_element = xml_find_exactly_one(element, './TN|./TNRef')
        child_nodes = {'DCT': [<Element 'DCT' at 0x000001E5D53EF920>], 'TN': [<Element 'TN' at 0x000001E5D53EF150>]}
        element    = <Element 'Obj' at 0x000001E5D53EF1A0>
        self       = <XXX.Loader object at 0x000001E5D540D720>
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _  

element = <Element 'Obj' at 0x000001E5D53EF1A0>, xpath = './TN|./TNRef'

    def xml_find_exactly_one(element: ET.Element, xpath: str):
        """."""
        found_elements = element.findall(xpath)
>       assert len(found_elements), f'expected 1 object, got {len(found_elements)}'
E       AssertionError: expected 1 object, got 0

element    = <Element 'Obj' at 0x000001E5D53EF1A0>
found_elements = []
xpath      = './TN|./TNRef'

XXX\XXX.py:25: AssertionError

leycec commented 1 year ago

@foxx: ...heh. No one will believe me, but @beartype's current traceback pollution is an improvement over the prior behaviour. We currently only add a single frame of pollution; we used to add about twenty billion frames of pollution, just like TensorFlow and company. Users openly wept back then.

pytest provides a hidden magical API enabling anyone to selectively exclude all calls to some callable from pytest tracebacks: __tracebackhide__. Thankfully, pytest implemented this API intelligently by inspecting the runtime call stack (rather than unintelligently performing static analysis with abstract syntax tree (AST) introspection). This means that decorators and other voluble frameworks like @beartype can exclude calls to arbitrary callables from pytest tracebacks by dynamically setting __tracebackhide__ = True in the bodies of those callables.

That said, doing so incurs a negligible performance penalty. We usually don't care about that, because this is Python. But we care about that in @beartype, because... @beartype. We really do measure and care about overhead denominated in microseconds. To reduce overhead, @beartype should avoid generating __tracebackhide__ = True assignments unless pytest is currently running tests. Specifically, @beartype will need to:

Reliably detect whether pytest is currently testing. Can this be done? To my limited knowledge, pytest provides no public API for deciding this. At least, it didn't use to. Please correct my stupidity, pytest devs, if indeed I am stupid.
If so, conditionally add __tracebackhide__ = True assignments to all type-checking wrapper functions generated by @beartype.

Theoretically, everybody else could do that too. If upstream packages can reliably detect when pytest is testing, this sorta becomes a shared upstream concern. I stroke my grizzled beard thoughtfully. :mage_man:

RonnyPfannschmidt commented 1 year ago

@leycec as the code also look at the globals in https://github.com/pytest-dev/pytest/blob/69f2855cc8896c67c499a250c2f08fcfd408f827/src/_pytest/_code/code.py#L287

beartype can probably just add the tracebackhide property to the globals of the check functions (presuming they go to a different module)

pytest currently doesn't provide a api to verify it is running (intentionally as people break the world with this all the time)

leycec commented 1 year ago

beartype can probably just add the tracebackhide property to the globals of the check functions

You... you're right! @beartype actually generates type-checking wrapper functions dynamically in-memory at runtime, which is how we hit ludicrous speeds. But I'm sure we can just add a fake '__tracebackhide__': True entry to the globals dict we pass when we generate those functions.

@RonnyPfannschmidt with the clever catch. I'm impressed and disappointed I didn't think of that. :sweat:

RonnyPfannschmidt commented 1 year ago

@leycec its not directly obvious from the code, its code i'm only familiar with since i want it gone since more than a decade now ;P

The-Compiler commented 1 year ago

For what it's worth, pytest does set a PYTEST_CURRENT_TEST environment variable, which you might be able to (ab)use to detect pytest running. Or you can check if pytest is in sys.modules, given that it's usually not imported outside of actually running tests.

pytest-dev / pytest

traceback filtering hook #283