open-telemetry / opentelemetry-python-contrib

OpenTelemetry instrumentation for Python modules
https://opentelemetry.io
Apache License 2.0
664 stars 547 forks source link

Tracing from the grequests module doesn't have the correct context information #548

Open kamalhussain opened 3 years ago

kamalhussain commented 3 years ago

Describe your environment The grequests module is a wrapper around the requests module to parallelize requests. While using this module, the trace of the HTTP requests are not getting associated with the parent context. I tried with 4 parallel requests and all of them have different trace ids.

Steps to reproduce Please see the following code snippet to reproduce the problem.

import grequests

from opentelemetry import trace
from opentelemetry.instrumentation.requests import RequestsInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleSpanProcessor,
)
from opentelemetry.sdk.trace import sampling

tracer_provider = TracerProvider()
tracer_provider.sampler = sampling.ParentBased(sampling.ALWAYS_ON)

trace.set_tracer_provider(tracer_provider)
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)

RequestsInstrumentor().instrument()

urls = [
    'http://www.heroku.com',
    'http://www.facebook.com',
    'http://www.instagram.com',
    'http://www.google.com',
]

def hello():
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("example-request"):
        rs = (grequests.get(u) for u in urls)
        print(grequests.map(rs))

hello()

What is the expected behavior? Since requests are created within the parent's context, I expect to see all requests to have the same trace id.

What is the actual behavior?

{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0x4b1ec3b31c6450c0be8964c551b44c08",
        "span_id": "0x29c0d897f72a984a",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-06-23T19:21:42.221516Z",
    "end_time": "2021-06-23T19:21:42.485323Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.google.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0x69ca27f26d5c8b03f8d55be73cb48fc4",
        "span_id": "0x56115bec426ad4af",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-06-23T19:21:42.219740Z",
    "end_time": "2021-06-23T19:21:43.040657Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.facebook.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0x2276dda25bd767a001cd3e1c6c978684",
        "span_id": "0xcc7f68b04b3a6533",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-06-23T19:21:42.220744Z",
    "end_time": "2021-06-23T19:21:43.443545Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.instagram.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0x990ec0f55354eedd5fc20d44419d45f9",
        "span_id": "0x048729577c83d30b",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-06-23T19:21:42.205409Z",
    "end_time": "2021-06-23T19:21:43.943165Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.heroku.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
{
    "name": "example-request",
    "context": {
        "trace_id": "0x8c0e7cfeb2f35ef8ac583f5ca4177b44",
        "span_id": "0x3098a0cd8e7eea49",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": null,
    "start_time": "2021-06-23T19:21:42.196124Z",
    "end_time": "2021-06-23T19:21:43.946074Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {},
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}

Additional context Add any other context about the problem here.

srikanthccv commented 3 years ago

I suspect it's the gevent; which works well with thread locals but I don't know it's behaviour contextvars.

codeboten commented 3 years ago

So my understanding is that gevent (at least recent versions of it) do support contextvars by leveraging greenlets, but each greenlet is spawned with an empty context according to the docs:

A new greenlet’s context is initially empty, i.e., all ContextVars have their default values. This matches the behavior of a new thread, but differs from that of a new asyncio.Task, which inherits a copy of the context that was active when it was spawned

Some of the examples do show there's ways to get the context copied into the greenlet, but in the few tests that I ran I was unable to make this work. Will need some more investigation.

kamalhussain commented 3 years ago

@codeboten

I tried with ThreadPoolExecutor but see the same problem.

from opentelemetry import trace
from opentelemetry.instrumentation.requests import RequestsInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleSpanProcessor,
)
from opentelemetry.sdk.trace import sampling

tracer_provider = TracerProvider()
tracer_provider.sampler = sampling.ParentBased(sampling.ALWAYS_ON)

trace.set_tracer_provider(tracer_provider)
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)

RequestsInstrumentor().instrument()

urls = [
    'http://www.heroku.com',
    'http://www.facebook.com',
    'http://www.instagram.com',
    'http://www.google.com',
]

import concurrent.futures
import requests

def get_url(url):
    print(requests.get(url))

def hello():
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span("example-request"):
        with concurrent.futures.ThreadPoolExecutor() as pool:
            pool.map(get_url, urls)

hello()

Results:

python3 simple-req-grequests.py
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0x0c15959f35307ba822a99b08b29a8155",
        "span_id": "0x2de455fe4b74b721",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-07-06T15:58:18.836940Z",
    "end_time": "2021-07-06T15:58:19.120136Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.google.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
<Response [200]>
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0xad097925f370cade16dbcfc50b1b5456",
        "span_id": "0x3d0eb44182ea468f",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-07-06T15:58:18.836343Z",
    "end_time": "2021-07-06T15:58:19.439740Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.instagram.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
<Response [200]>
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0xadc164e925db3b74e213ce3cad6c3da1",
        "span_id": "0x30deeefdca704b4e",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-07-06T15:58:18.827078Z",
    "end_time": "2021-07-06T15:58:19.611694Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.heroku.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
<Response [200]>
{
    "name": "HTTP GET",
    "context": {
        "trace_id": "0xf5ada71bdad610a62dbf1d5178df9a7c",
        "span_id": "0xef12c6817ce62e72",
        "trace_state": "[]"
    },
    "kind": "SpanKind.CLIENT",
    "parent_id": null,
    "start_time": "2021-07-06T15:58:18.836636Z",
    "end_time": "2021-07-06T15:58:20.324545Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "http.method": "GET",
        "http.url": "http://www.facebook.com",
        "http.status_code": 200,
        "http.status_text": "OK"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
<Response [200]>
{
    "name": "example-request",
    "context": {
        "trace_id": "0x2563cc00750a278404f8fc3040094fc9",
        "span_id": "0x578ff34ec1f24f09",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": null,
    "start_time": "2021-07-06T15:58:18.824923Z",
    "end_time": "2021-07-06T15:58:20.327602Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {},
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "1.0.0",
        "service.name": "unknown_service"
    }
}
github-actions[bot] commented 2 years ago

This issue was marked stale due to lack of activity. It will be closed in 30 days.

pdavid31 commented 1 year ago

Is there any progress to this?

Unfortunately, I am seeing the same problem using

opentelemetry-instrumentation-requests 0.33b0
requests 2.26.0