ansible-collections / community.general

Ansible Community General Collection
https://galaxy.ansible.com/ui/repo/published/community/general/
GNU General Public License v3.0

open telemetry callback error: Transient error StatusCode.UNAVAILABLE #7888

Closed mihai-satmarean closed 5 months ago

mihai-satmarean commented 9 months ago

Summary

We have configured Ansible with the OpenTelemetry plugin and the Grafana Cloud configuration suggested on the https://docs.ansible.com/ansible/latest/collections/community/general/opentelemetry_callback.html page, including the token.

Issue Type

Bug Report

Component Name

community.general.opentelemetry

Ansible Version

$ ansible --version
ansible [core 2.13.1]
  config file = None
  configured module search path = ['/REDACTED/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
  ansible collection location = /REDACTED/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.17 (main, Jun 15 2023, 07:46:17) [Clang 14.0.3 (clang-1403.0.22.14.1)]
  jinja version = 3.1.2
  libyaml = True

Community.general Version

$ ansible-galaxy collection list community.general

# /REDACTED/.ansible/collections/ansible_collections
Collection        Version
----------------- -------
community.general 8.2.0

Configuration

$ ansible-config dump --only-changed
CALLBACKS_ENABLED(/REDACTED/ansible.cfg) = ['community.general.opentelemetry']
CONFIG_FILE() = /REDACTED/ansible.cfg

OS / Environment

macOS, Docker container with Ubuntu

Steps to Reproduce

Steps to reproduce the problem:

1. We configured Ansible like here: https://docs.ansible.com/ansible/latest/collections/community/general/opentelemetry_callback.html
mainly adding this in the ansible.cfg:

[defaults]
callbacks_enabled = community.general.opentelemetry

[callback_opentelemetry]
enable_from_environment = ANSIBLE_OPENTELEMETRY_ENABLED

2. Created and configured the environment with the token and ID:

export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-gateway-prod-eu-west-2.grafana.net/otlp"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic REDACTED"

3. Ran playbooks and got this:

Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 16s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 32s.

We also have an open question: https://grafana.slack.com/archives/C038S86TZGF/p1705057683915289
and here: https://forum.ansible.com/t/anyone-managed-to-use-opentelemetry-from-awx-ansible/3005
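One way to rule out a simple environment mistake before running the playbook is to print what the OTLP variables resolve to. The following is a minimal sketch, not part of the callback plugin: the helper names `parse_otlp_headers` and `summarize_otlp_env` are hypothetical, and the only behavior assumed is the OTLP convention that an `http/...` exporter appends `/v1/traces` to `OTEL_EXPORTER_OTLP_ENDPOINT`. It reports header names only, so the token is not echoed.

```python
import os
from urllib.parse import urlsplit


def parse_otlp_headers(raw):
    """Parse 'k1=v1,k2=v2' into a dict, splitting each pair on the first '='.

    Splitting on the first '=' keeps base64 padding ('==') in the value intact.
    """
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"header entry without '=': {pair!r}")
        headers[key.strip()] = value.strip()
    return headers


def summarize_otlp_env(env=None):
    """Summarize the OTLP exporter variables (hypothetical helper)."""
    env = os.environ if env is None else env
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "")
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    headers = parse_otlp_headers(env.get("OTEL_EXPORTER_OTLP_HEADERS", ""))

    # For OTLP over HTTP, the traces path is appended to the base endpoint.
    if protocol.startswith("http"):
        traces_url = endpoint.rstrip("/") + "/v1/traces"
    else:
        traces_url = endpoint

    return {
        "protocol": protocol or "(unset)",
        "host": urlsplit(endpoint).hostname,
        "traces_url": traces_url,
        "header_names": sorted(headers),  # names only, never the values
    }
```

Calling `summarize_otlp_env()` in the same shell that runs `ansible-playbook` shows whether the variables are exported there at all, which is easy to get wrong inside a Docker container.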

Expected Results

Running Ansible playbooks should send traces to Grafana Cloud.

Actual Results


PLAY RECAP *********************************************************************
localhost                  : ok=9    changed=0    unreachable=0    failed=0    skipped=14   rescued=0    ignored=0

Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 1s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 2s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 4s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 8s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 16s.
Transient error StatusCode.UNAVAILABLE encountered while exporting traces to otlp-gateway-prod-eu-west-2.grafana.net, retrying in 32s.

ansibullbot commented 9 months ago

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.


ansibullbot commented 9 months ago

cc @v1v

v1v commented 7 months ago

This particular implementation is vendor-agnostic.

Would you mind trying the same settings, but using an OpenTelemetry Collector with some debug traces? In short, update OTEL_EXPORTER_OTLP_ENDPOINT to point to your collector. You can try that locally.
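If a full OpenTelemetry Collector is not at hand, a throwaway HTTP listener can at least show whether the exporter sends anything over OTLP/HTTP. The sketch below is a stand-in, not a collector: it only records the request path, content type, body size, and whether an Authorization header was present. The names `OTLPDebugHandler`, `start_debug_receiver`, and `received` are hypothetical; port 4318 is the conventional OTLP/HTTP port.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# One dict per request seen by the debug receiver.
received = []


class OTLPDebugHandler(BaseHTTPRequestHandler):
    """Accept any POST, record its metadata, and reply 200 with an empty body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        received.append({
            "path": self.path,
            "content_type": self.headers.get("Content-Type"),
            "has_authorization": "Authorization" in self.headers,
            "body_bytes": len(body),
        })
        # An empty export response serializes to zero bytes in protobuf.
        self.send_response(200)
        self.send_header("Content-Type", "application/x-protobuf")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        # Suppress the default per-request stderr logging.
        pass


def start_debug_receiver(host="127.0.0.1", port=4318):
    """Start the receiver in a daemon thread and return the server object."""
    server = HTTPServer((host, port), OTLPDebugHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

To try it, call `start_debug_receiver()` in a process that stays alive, set `OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"`, run a playbook, and inspect `received`. If nothing arrives, one possibility worth checking is that the exporter is using a different transport than expected: `StatusCode.UNAVAILABLE` follows gRPC's status-code naming, though that is only a hint, not a diagnosis.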

wilfriedroset commented 6 months ago

> Would you mind if I ask you to try the same settings but using an OpenTelemetry Collector with some debug traces? Pretty much update OTEL_EXPORTER_OTLP_ENDPOINT to point to your collector

I'm in the same situation, with OTEL_EXPORTER_OTLP_ENDPOINT correctly defined. I have other applications on the same network that successfully send their traces.