zalando-zmon / opentracing-utils

Convenient utilities for adding OpenTracing support in your python projects
MIT License
20 stars 8 forks source link

break reference cycle in inspect_span_from_stack #44

Closed dneuhaeuser-zalando closed 3 years ago

dneuhaeuser-zalando commented 3 years ago

inspect_span_from_stack creates a reference cycle due to incorrect usage of inspect.currentframe (see the warning in the documentation https://docs.python.org/3/library/inspect.html#the-interpreter-stack).

This issue is inherited by other functions that call it directly or indirectly. Here is a script that gives you some indication of the impact:

import gc
from contextlib import contextmanager

from opentracing_utils import trace
from opentracing_utils.span import (
    get_new_span,
    get_parent_span,
    inspect_span_from_stack,
)

@trace()
def foo():
    pass

@contextmanager
def debug_gc():
    gc.collect()
    gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE)
    try:
        yield
    finally:
        gc.collect()
        gc.set_debug(0)

def main():
    print("@trace " + "=" * 40)
    with debug_gc():
        foo()
    print("=======" + "=" * 40)

    print("get_new_span " + "=" * 40)
    with debug_gc():
        get_new_span(lambda: None, (), {})
    print("=============" + "=" * 40)

    print("get_parent_span " + "=" * 40)
    with debug_gc():
        get_parent_span()
    print("================" + "=" * 40)

    print("inspect_span_from_stack " + "=" * 40)
    with debug_gc():
        inspect_span_from_stack()
    print("========================" + "=" * 40)

if __name__ == "__main__":
    main()

Output:

@trace ========================================
gc: collecting generation 2...
gc: objects in each generation: 7 0 45420
gc: objects in permanent generation: 0
gc: collectable <dict 0x109e10e40>
gc: collectable <dict 0x10a74ec80>
gc: collectable <dict 0x109e11ec0>
gc: collectable <frame 0x10b084b80>
gc: collectable <frame 0x109789e40>
gc: collectable <frame 0x7faa93dd3350>
gc: collectable <frame 0x7faa93de0050>
gc: done, 7 unreachable, 0 uncollectable, 0.0079s elapsed
===============================================
get_new_span ========================================
gc: collecting generation 2...
gc: objects in each generation: 6 0 45419
gc: objects in permanent generation: 0
gc: collectable <function 0x10b07b310>
gc: collectable <dict 0x1097be280>
gc: collectable <dict 0x1097302c0>
gc: collectable <frame 0x10b084b80>
gc: collectable <frame 0x109789e40>
gc: collectable <frame 0x7faa93dd3350>
gc: done, 6 unreachable, 0 uncollectable, 0.0104s elapsed
=====================================================
get_parent_span ========================================
gc: collecting generation 2...
gc: objects in each generation: 3 0 45419
gc: objects in permanent generation: 0
gc: collectable <dict 0x109e10e40>
gc: collectable <frame 0x10b084b80>
gc: collectable <frame 0x109789e40>
gc: done, 3 unreachable, 0 uncollectable, 0.0099s elapsed
========================================================
inspect_span_from_stack ========================================
gc: collecting generation 2...
gc: objects in each generation: 1 0 45419
gc: objects in permanent generation: 0
gc: collectable <frame 0x10b084b80>
gc: done, 1 unreachable, 0 uncollectable, 0.0108s elapsed
================================================================

In real world applications we are observing 10k objects collected per minute, which leads to hundreds of GC collections per minute. This in turn has a significant impact on CPU usage and latency.

This PR changes inspect_span_from_stack so that it uses inspect.currentframe in the way recommended by the Python documentation. It also adds a test to verify that this function doesn't create reference cycles.

codecov[bot] commented 3 years ago

Codecov Report

Merging #44 (8d178a7) into master (45699b0) will increase coverage by 0.00%. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #44   +/-   ##
=======================================
  Coverage   98.36%   98.37%           
=======================================
  Files           9        9           
  Lines         368      370    +2     
=======================================
+ Hits          362      364    +2     
  Misses          6        6           
Impacted Files Coverage Δ
opentracing_utils/span.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 45699b0...8d178a7. Read the comment docs.

dneuhaeuser-zalando commented 3 years ago

:+1:

mohabusama commented 3 years ago

👍

mohabusama commented 3 years ago

@vetinari Would be great if we can test and publish this fix.

dneuhaeuser-zalando commented 3 years ago

FYI We have patched opentracing_utils in one of our own applications and this is the impact:

Screenshot 2020-10-29 at 11 01 11
dneuhaeuser-zalando commented 3 years ago

@vetinari Could you take a look at this? We currently run a patched version and it would be nice to get rid of the patch.

pitr commented 3 years ago

👍

juheise commented 3 years ago

:+1:

vetinari commented 3 years ago

:+1:

pitr commented 3 years ago

v0.20.1 has been released - https://pypi.org/project/opentracing-utils/

Thanks you for following up on this.

dneuhaeuser-zalando commented 3 years ago

Thank You! :)