census-instrumentation / opencensus-python

A stats collection and distributed tracing framework
Apache License 2.0
669 stars 250 forks source link

AsyncTransport marks main thread as "is_exporter" disabling integrations like "requests" #806

Open aberres opened 5 years ago

aberres commented 5 years ago

Issue description

I stumbled upon this bug when trying to use the requests integration in combination with synchronous gunicorn workers.

After a lot of head scratching I noticed that the call function inside opencensus.ext.requests.trace.wrap_requests always runs into this branch:

def wrap_requests(requests_func):
    """Wrap the requests function to trace it."""
    def call(url, *args, **kwargs):
        # Check if request was sent from an exporter. If so, do not wrap.
        if execution_context.is_exporter():
            return requests_func(url, *args, **kwargs)

       [....]

The function thinks uns in the context of ann exporter thread - but it does not.

Further investigation revealed that the appearance of the bug it is related to asynchronous transports. See the following minimal snippet which can be executed as is:

from flask import Flask
from opencensus.common.transports.async_ import AsyncTransport
from opencensus.ext.stackdriver.trace_exporter import StackdriverExporter

app = Flask(__name__)

exporter = StackdriverExporter(
    # Enable or disable to trigger the assert below
    transport=AsyncTransport
)

import opencensus.common.runtime_context as runtime_context
assert not runtime_context.RuntimeContext.is_exporter

The assertion in the last line always fails when an AsyncTransport is used.

Checking opencensus.common.transports.async_._Worker.start reveals that opencensus.trace.execution_context.set_is_exporter is called in the main thread, not in the to be started one.

I guess the simple solution is to move the call into opencensus.common.transports.async_._Worker._thread_main.

Additional information

Interestingly the issue only surfaces with the gunicorn sync worker but not with guicorn gevent workers or flask run. Does not really make sense to me, but I had no time to further investigate yet.

Used versions

opencensus-context==0.1.1
opencensus-ext-flask==0.7.3
opencensus-ext-gevent==0.1.0
opencensus-ext-requests==0.7.2
opencensus-ext-sqlalchemy==0.1.2
opencensus-ext-stackdriver==0.7.2
opencensus-ext-zipkin==0.2.2
opencensus==0.7.5
IamJeffG commented 5 years ago

Interestingly the issue only surfaces with the gunicorn sync worker but not with guicorn gevent workers or flask run.

I see this exact same behavior: The requests integration traces correctly when I run my service locally using flask run, but it does not trace when the deployment is running gunicorn.

aberres commented 5 years ago

@IamJeffG If you want a quick workaround the following helps after setting up the AsyncTransport

import opencensus.common.runtime_context as runtime_context
runtime_context.RuntimeContext.is_exporter = False

If "fixes" the current thread but does not affect the already forked exporter thread.

I was planning to provide a pull request for this issue but did not yet find the time.