open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0

Auto-instrumentation issue after 0.16b1 #1622

Closed markfink-splunk closed 3 years ago

markfink-splunk commented 3 years ago

Describe your environment Instrumenting this service: https://github.com/markfink-splunk/microservices-demo/tree/master/src/emailservice

The current commit is with 0.16b1 which works fine with the "opentelemetry-instrument" wrapper (see Dockerfile entrypoint).

When I upgrade to anything after 0.16b1 -- say 0.17b0, 0.18b1, 1.0.0rc1, etc -- I get no traces with the wrapper. I updated the environment variables (see below). I'm getting no traces for any instrumentation, particularly grpc but I also tried requests.

You can see the correct libraries added with requirements.in/txt.

My environment looks like this:

    export OTEL_TRACES_EXPORTER="otlp"
    export OTEL_EXPORTER_OTLP_INSECURE=true
    export OTEL_EXPORTER_OTLP_ENDPOINT="markf-0398:4317"
    export OTEL_RESOURCE_ATTRIBUTES="service.name=emailservice, environment=hipster_shop"

All is well with 0.16b1, just nothing after that. I'm using grpc_health_probe to test it.

What is the expected behavior? I should see grpc traces.

What is the actual behavior? What did you see instead? No traces. Nothing is being sent to the Otel Collector. I checked with ngrep. No connection attempt is made.

I've tried in vain to enable debug logging for this app. It looks like it should be supported by the Otel tracer, but when I set the logging level to debug, it produces no additional output. I set it in logger.py in the project and I also tried "OTEL_LOG_LEVEL".

Could someone with deeper skills look at this? This seems like a pretty straightforward app. Or perhaps explain how to enable debug logging.
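For anyone landing here with the same question, a minimal sketch of turning on debug output with the standard `logging` module. It assumes the OpenTelemetry packages name their loggers after their module paths (the `ERROR:opentelemetry.sdk.trace.export:` prefix seen later in this thread is consistent with that), and that this code runs early in the application. Anything logged by the `opentelemetry-instrument` wrapper before application code runs would not be affected by it.

```python
import logging

# Route log records of every level, DEBUG included, to stderr.
# basicConfig does nothing if the root logger already has handlers.
logging.basicConfig(level=logging.DEBUG)

# Assumption: OpenTelemetry loggers live under the "opentelemetry" name,
# so setting the level on the parent covers the child loggers too.
otel_logger = logging.getLogger("opentelemetry")
otel_logger.setLevel(logging.DEBUG)
```

If the project's own logger.py resets the root level or replaces handlers afterwards, that later configuration wins, which may be one reason a level change appears to have no effect.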

srikanthccv commented 3 years ago

This might be the same as https://github.com/open-telemetry/opentelemetry-python/issues/1577. Please install opentelemetry-distro==0.17b0 and let us know if that doesn't solve the issue.

srikanthccv commented 3 years ago

cc // @codeboten

markfink-splunk commented 3 years ago

That helped. I'm getting traces now. However, I am seeing other issues that I did not see before. Here's some context:

    opentelemetry-sdk==1.0.0rc1
    opentelemetry-exporter-otlp==1.0.0rc1
    opentelemetry-exporter-jaeger==1.0.0rc1
    opentelemetry-propagator-b3==1.0.0rc1
    opentelemetry-distro==0.18b0
    opentelemetry-instrumentation==0.18b0
    opentelemetry-instrumentation-grpc==0.18b0
    opentelemetry-instrumentation-jinja2==0.18b0

    export OTEL_TRACES_EXPORTER="otlp"
    export OTEL_EXPORTER_OTLP_INSECURE=true
    export OTEL_EXPORTER_OTLP_ENDPOINT="markf-0398:4317"
    export OTEL_EXPORTER_JAEGER_ENDPOINT="http://markf-0398:14268/api/traces"
    export OTEL_RESOURCE_ATTRIBUTES="service.name=emailservice, environment=hipster_shop"

I am toggling OTEL_TRACES_EXPORTER between "jaeger" and "otlp".

With the otlp exporter, I am getting this error, which is new since 0.16b1 (otlp worked fine with 0.16b1):

    Configuration of configurator failed
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py", line 74, in _load_configurators
        entry_point.load()().configure()  # type: ignore
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/configurator.py", line 50, in configure
        self._configure(**kwargs)
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 168, in _configure
        _initialize_components()
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 159, in _initialize_components
        exporter_names = _get_exporter_names()
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 73, in _get_exporter_names
        exporters.pop(EXPORTER_OTLP)
    TypeError: pop() takes no arguments (1 given)
    Failed to auto initialize opentelemetry
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py", line 84, in initialize
        _load_configurators()
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py", line 78, in _load_configurators
        raise exc
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/auto_instrumentation/sitecustomize.py", line 74, in _load_configurators
        entry_point.load()().configure()  # type: ignore
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/instrumentation/configurator.py", line 50, in configure
        self._configure(**kwargs)
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 168, in _configure
        _initialize_components()
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 159, in _initialize_components
        exporter_names = _get_exporter_names()
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/distro/__init__.py", line 73, in _get_exporter_names
        exporters.pop(EXPORTER_OTLP)
    TypeError: pop() takes no arguments (1 given)
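The message "pop() takes no arguments" is what Python raises when `pop` is called with an argument on a `set`, so it looks as though `exporters` is a set at that point. The sketch below reproduces that failure and shows `discard` as one way to drop a specific element. Only the name `EXPORTER_OTLP` comes from the traceback; its value, the replacement name, and the helper functions are hypothetical, and what the distro code actually intends at that line is a guess.

```python
EXPORTER_OTLP = "otlp"            # assumed value, for illustration only
EXPORTER_OTLP_SPAN = "otlp_span"  # hypothetical replacement name


def normalize_exporters(names):
    """Swap the generic OTLP name for a span-specific one (hypothetical)."""
    exporters = set(names)
    if EXPORTER_OTLP in exporters:
        # set.pop() accepts no argument and removes an arbitrary element;
        # discard() (or remove()) deletes one specific element.
        exporters.discard(EXPORTER_OTLP)
        exporters.add(EXPORTER_OTLP_SPAN)
    return exporters


def pop_with_argument_fails():
    """Return True if calling set.pop with an argument raises TypeError."""
    try:
        {EXPORTER_OTLP}.pop(EXPORTER_OTLP)
    except TypeError:
        return True
    return False
```

`list.pop` and `dict.pop` do accept an argument, which is why the same call would not fail this way on those types.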

So I switched over to the Jaeger exporter and found that it is not honoring the service.name set with OTEL_RESOURCE_ATTRIBUTES. service.name shows up as "unknown_service". This may not be a new issue since I did not test this previously. It is setting the environment tag though.
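As a rough sketch of what honoring the variable would involve: OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs, so the service name can be picked out of it once the pairs are split and trimmed. The function names below are hypothetical and this is not the SDK's or the Jaeger exporter's code. The "unknown_service" fallback mirrors the value observed above.

```python
def parse_resource_attributes(raw):
    """Parse 'k1=v1, k2=v2' into a dict, trimming surrounding whitespace."""
    attributes = {}
    for item in raw.split(","):
        if "=" not in item:
            continue  # skip empty or malformed entries
        key, value = item.split("=", 1)
        attributes[key.strip()] = value.strip()
    return attributes


def service_name_from(raw, default="unknown_service"):
    """Return the service.name attribute, or the default if it is absent."""
    return parse_resource_attributes(raw).get("service.name", default)
```

With the value used in this thread, `service_name_from("service.name=emailservice, environment=hipster_shop")` yields "emailservice". Note the space after the comma, which a parser has to strip for the second key to come out as "environment".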

Also Jaeger insists on Thrift/HTTP vs gRPC. The code appears to support gRPC but I cannot get it working with pure auto-instrumentation. It looks like I'd have to configure the exporter manually in code. In other words, if I set OTEL_EXPORTER_JAEGER_ENDPOINT to "http://markf-0398:14250", I get this:

    ERROR:opentelemetry.sdk.trace.export:Exception while exporting Span batch.
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 323, in _export_batch
        self.span_exporter.export(self.spans_list[:idx])  # type: ignore
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/exporter/jaeger/__init__.py", line 236, in export
        self._collector_http_client.submit(batch)
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/exporter/jaeger/send/thrift.py", line 110, in submit
        self.http_transport.flush()
      File "/usr/local/lib/python3.7/site-packages/thrift/transport/THttpClient.py", line 184, in flush
        self.__http_response = self.__http.getresponse()
      File "/usr/local/lib/python3.7/http/client.py", line 1369, in getresponse
        response.begin()
      File "/usr/local/lib/python3.7/http/client.py", line 310, in begin
        version, status, reason = self._read_status()
      File "/usr/local/lib/python3.7/http/client.py", line 292, in _read_status
        raise BadStatusLine(line)

srikanthccv commented 3 years ago

> TypeError: pop() takes no arguments (1 given)

Would you mind creating another issue for this? This is specific to OTLP.

> the Jaeger exporter and found that it is not honoring the service.name set with OTEL_RESOURCE_ATTRIBUTES.

Yes, as of now this is the case. There is already an issue for this: https://github.com/open-telemetry/opentelemetry-python/issues/1607.

> Also Jaeger insists on Thrift/HTTP vs gRPC. The code appears to support gRPC but I cannot get it working with pure auto-instrumentation.

I don't think there is any insistence. The Jaeger exporter sends traces to the agent (Thrift/UDP) by default, unless you set a collector endpoint, which can be used over either HTTP or gRPC. The exception you shared is hard to read. Could you please share the full stack trace (formatted)?

markfink-splunk commented 3 years ago

I opened another issue for the pop() error. Sorry about the formatting, I gave up trying to use "insert code"; it strips out the linefeeds no matter how I do it.

For the Jaeger issue, maybe the question should be: how do I configure it to use gRPC vs HTTP with just environment variables and the runtime wrapper? Here's what I'm seeing:

OTEL_EXPORTER_JAEGER_ENDPOINT="http://markf-0398:14268/api/traces" works fine. This is HTTP.

OTEL_EXPORTER_JAEGER_ENDPOINT="markf-0398:14250" or "http://markf-0398:14250" both fail with:

    ERROR:opentelemetry.sdk.trace.export:Exception while exporting Span batch.
    Traceback (most recent call last):
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/sdk/trace/export/__init__.py", line 323, in _export_batch
        self.span_exporter.export(self.spans_list[:idx])  # type: ignore
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/exporter/jaeger/__init__.py", line 236, in export
        self._collector_http_client.submit(batch)
      File "/usr/local/lib/python3.7/site-packages/opentelemetry/exporter/jaeger/send/thrift.py", line 110, in submit
        self.http_transport.flush()
      File "/usr/local/lib/python3.7/site-packages/thrift/transport/THttpClient.py", line 184, in flush
        self.__http_response = self.__http.getresponse()
      File "/usr/local/lib/python3.7/http/client.py", line 1369, in getresponse
        response.begin()
      File "/usr/local/lib/python3.7/http/client.py", line 310, in begin
        version, status, reason = self._read_status()
      File "/usr/local/lib/python3.7/http/client.py", line 292, in _read_status
        raise BadStatusLine(line)
    http.client.BadStatusLine: @

It looks like it is still using HTTP with a gRPC endpoint and therefore failing, which is not surprising, I guess. My thought was that using port 14250 would trigger it to use gRPC, because it worked that way with Java Otel up until Java removed support for Thrift/HTTP completely a few weeks ago.
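The BadStatusLine in the traceback fits that reading: an HTTP/1.x client expects the reply to begin with a line like "HTTP/1.1 200 OK", while a gRPC port speaks HTTP/2 and replies with binary frames. Below is a small sketch of that mismatch using only the standard library. The `FakeSocket` class and `parse_as_http1` helper are made up for the illustration, and the bytes are an empty HTTP/2 SETTINGS frame, assumed here as a stand-in for whatever the collector actually sent back.

```python
import http.client
import io


class FakeSocket:
    """Stand-in for a socket that replays a fixed byte string."""

    def __init__(self, payload):
        self._payload = payload

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._payload)


# Empty HTTP/2 SETTINGS frame: 3-byte length, type 0x04, flags, stream id.
EMPTY_SETTINGS_FRAME = b"\x00\x00\x00\x04\x00\x00\x00\x00\x00"


def parse_as_http1(payload):
    """Parse bytes the way http.client does; return the status or error name."""
    response = http.client.HTTPResponse(FakeSocket(payload))
    try:
        response.begin()
    except http.client.BadStatusLine:
        return "BadStatusLine"
    return response.status
```

Feeding `EMPTY_SETTINGS_FRAME` to `parse_as_http1` ends in BadStatusLine, while a well-formed HTTP/1.1 response parses normally, so the error points at a protocol mismatch rather than at the span data.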

I don't think that's how it ought to work, granted. It would be better to explicitly tell it to use gRPC.

In the end, this is not a burning issue for me, just bringing it up in the interest of achieving perfection.

seemk commented 3 years ago

I think the question here is choosing between the following Jaeger endpoints/formats:

I'll create a PR to expose Jaeger's transport format to env vars and improve the documentation a bit.

github-actions[bot] commented 3 years ago

This issue was marked stale due to lack of activity. It will be closed in 30 days.

codeboten commented 3 years ago

This was fixed by https://github.com/open-telemetry/opentelemetry-python/pull/1657