signalfx / splunk-otel-dotnet

Splunk Distribution of OpenTelemetry .NET
https://docs.splunk.com/Observability/gdi/get-data-in/application/otel-dotnet/get-started.html
Apache License 2.0

Google.Protobuf package version incompatibility #526

Closed ArtemiiUstiukhinFortum closed 1 month ago

ArtemiiUstiukhinFortum commented 1 month ago

Describe the bug: OpenTelemetry.AutoInstrumentation.Loader.Loader throws the exception Could not load file or assembly 'Google.Protobuf, Version=3.22.5.0 when the instrumented .NET app references a different version of the Google.Protobuf package.

To reproduce the behavior:

  1. Create a .NET application that references Google.Protobuf package version 3.19.1.
  2. Create a Docker container with the .NET application, utilizing the Splunk Distribution of OpenTelemetry for .NET (See example here)
  3. Set the OTEL_LOG_LEVEL environment variable to debug before starting your instrumented application.
  4. Run the Dockerized application and generate some activity.
  5. Collect the debug logs. On Linux, the instrumentation saves its logs in /var/log/opentelemetry/dotnet
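
For step 5, the collected logs can be searched rather than read by hand. The following is only a rough sketch: the helper names `find_load_failures` and `scan_log_dir` are hypothetical, and it assumes the log lines contain the standard .NET message fragment quoted in this report ("Could not load file or assembly '<name>, Version=<version>"). It makes no assumption about log file names and simply reads every file under the directory it is given.

```python
import re
from pathlib import Path

# Matches the assembly name and version in a .NET load-failure message,
# e.g. "Could not load file or assembly 'Google.Protobuf, Version=3.22.5.0".
LOAD_FAILURE = re.compile(
    r"Could not load file or assembly '(?P<name>[^,']+), Version=(?P<version>[\d.]+)"
)


def find_load_failures(log_text):
    """Return sorted unique (assembly, version) pairs named in load failures."""
    return sorted({(m["name"], m["version"]) for m in LOAD_FAILURE.finditer(log_text)})


def scan_log_dir(log_dir):
    """Scan every regular file under log_dir (hypothetical helper)."""
    found = set()
    for path in Path(log_dir).rglob("*"):
        if path.is_file():
            # errors="replace" keeps the scan going if a file is not valid UTF-8.
            found.update(find_load_failures(path.read_text(errors="replace")))
    return sorted(found)


if __name__ == "__main__":
    # Linux log location taken from the steps above.
    for name, version in scan_log_dir("/var/log/opentelemetry/dotnet"):
        print(f"failed to load {name} {version}")
```

Each reported pair names the assembly version the instrumentation tried to load, which can then be compared with the version the application references.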

Expected behavior: Splunk Distribution of OpenTelemetry .NET runs without errors.

Runtime environment:

OTEL ENV:

- name: SPLUNK_OTEL_AGENT
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: "http://$(SPLUNK_OTEL_AGENT):4318"
- name: OTEL_SERVICE_NAME
  value: "app-name"
- name: OTEL_RESOURCE_ATTRIBUTES
  value: "deployment.environment=${ENV}"
- name: OTEL_DOTNET_AUTO_HOME
  value: "$HOME/.splunk-otel-dotnet"
- name: OTEL_DOTNET_AUTO_METRICS_INSTRUMENTATION_ENABLED
  value: "false"

Solution: Simply upgrading Google.Protobuf to Version=3.22.5.0 fixes the issue. However, it would be nice to know about incompatible packages in advance, because the only way to diagnose the issue was to set the OTEL_LOG_LEVEL environment variable to debug and manually inspect the logs.
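
Until something like that exists, a project file can be checked before deployment. This is a minimal sketch, not part of the distribution: `parse_version` and `find_mismatches` are hypothetical helpers, the expected version (3.22.5) is the one from the error message in this report, and the code assumes an SDK-style .csproj with plain numeric `Version` attributes (no version ranges, wildcards, or centrally managed versions). It flags any version that differs, as described above; whether a newer version would also fail is not established here.

```python
import xml.etree.ElementTree as ET


def parse_version(text):
    """Turn '3.19.1' into (3, 19, 1, 0), padding to four parts for comparison."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


def find_mismatches(csproj_xml, expected):
    """Return {package: referenced_version} where the version differs from expected.

    expected maps package names to the version the instrumentation loads.
    """
    root = ET.fromstring(csproj_xml)
    mismatches = {}
    for ref in root.iter("PackageReference"):
        name = ref.get("Include")
        version = ref.get("Version")
        if name in expected and version is not None:
            if parse_version(version) != parse_version(expected[name]):
                mismatches[name] = version
    return mismatches


if __name__ == "__main__":
    csproj = """
    <Project Sdk="Microsoft.NET.Sdk">
      <ItemGroup>
        <PackageReference Include="Google.Protobuf" Version="3.19.1" />
      </ItemGroup>
    </Project>
    """
    print(find_mismatches(csproj, {"Google.Protobuf": "3.22.5"}))
```

The table of expected versions has to be maintained by hand for the instrumentation release in use, so this only covers packages you already know to watch.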

Kielek commented 1 month ago

@ArtemiiUstiukhinFortum, this is a known limitation. The easiest way to mitigate the issue is to always use the NuGet package. You can read about this in the Splunk documentation.

If you are interested in more technical details, you can always check the upstream documentation. Keep in mind that package names, script names, and configuration may vary between the two distributions.

For now, there is nothing we can do about this in general.

Please let me know if I can help you with other topics.