grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics
https://grafana.com/oss/beyla-ebpf/
Apache License 2.0

OTLP: 401 error sending trace to consumer #1181

Open nicolasauler opened 21 hours ago

nicolasauler commented 21 hours ago

When following https://grafana.com/docs/beyla/latest/quickstart/rust/ or https://grafana.com/docs/beyla/latest/quickstart/golang/

I can't send either metrics or traces to Grafana Cloud.

Also, is this quickstart up to date? Can we make it clearer which value we need to send in OTEL_EXPORTER_OTLP_HEADERS?

time=2024-09-18T20:36:28.799-03:00 level=ERROR msg="error sending trace to consumer" error="not retryable error: Permanent error: rpc error: code = Unauthenticated desc = error exporting items, request to https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/traces responded with HTTP Status Code 401"
time=2024-09-18T20:36:33.662-03:00 level=INFO msg="failed to upload metrics: failed to send metrics to https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/metrics: 401 Unauthorized"

When I add a Beyla connection at https://onic.grafana.net/connections/add-new-connection/beyla

the env variables are returned directly in the script:

OTEL_EXPORTER_OTLP_METRICS_ENDPOINT: "https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/metrics"
OTEL_EXPORTER_OTLP_METRICS_HEADERS: "Authorization=Basic <value1>"
BEYLA_OTEL_METRIC_FEATURES: "application_span,application_service_graph,application,application_process"
BEYLA_NETWORK_METRICS: "true"
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/traces"
OTEL_EXPORTER_OTLP_TRACES_HEADERS: "Authorization=Basic <value2>"

If we export those values, along with BEYLA_OPEN_PORT=8080 and BEYLA_TRACE_PRINTER=text, metrics now work with Grafana Cloud, yay!!!

But traces still output the following error:

time=2024-09-18T20:24:27.349-03:00 level=ERROR msg="error sending trace to consumer" error="not retryable error: Permanent error: rpc error: code = Unauthenticated desc = error exporting items, request to https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/traces responded with HTTP Status Code 401"

Why do traces not work? Am I missing some piece of the puzzle? Also, as mentioned, I attempted this with both Rust and Go, and I'm using Beyla v1.8.4-alpha from the GitHub releases.

edit: I also attempted all cases with and without OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf". At this point I'm trying all combinations of env variables, as I'm terribly confused about whether to follow the quickstart or the "beyla add connection guide".
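When several of these variables are exported at once, it can help to print which header value would apply to each signal. The sketch below assumes the precedence described in the OpenTelemetry exporter specification, where a signal-specific variable such as OTEL_EXPORTER_OTLP_TRACES_HEADERS takes priority over the generic OTEL_EXPORTER_OTLP_HEADERS. Whether Beyla resolves them in exactly this way is an assumption, and effective_otlp_headers is a hypothetical helper name, not part of Beyla or any OTel tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the headers that would apply to one signal,
# assuming signal-specific variables override the generic one.
# Usage: effective_otlp_headers TRACES   (or METRICS)
effective_otlp_headers() {
  specific="OTEL_EXPORTER_OTLP_${1}_HEADERS"
  # Indirect lookup of the signal-specific variable (empty if unset).
  eval "value=\${$specific:-}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
  else
    # Fall back to the generic variable (empty if unset).
    printf '%s\n' "${OTEL_EXPORTER_OTLP_HEADERS:-}"
  fi
}

# Show what each signal would use with the current environment.
for signal in TRACES METRICS; do
  printf '%s headers: %s\n' "$signal" "$(effective_otlp_headers "$signal")"
done
```

Running this in the same shell as Beyla shows at a glance whether a leftover per-signal value is shadowing the generic one.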

nicolasauler commented 21 hours ago

To aid reproducibility

I'm attempting to run all this from a nix flake:

# flake.nix
{
  description = "Reproducing beyla issues";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    rust-overlay.url = "github:oxalica/rust-overlay";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = {
    nixpkgs,
    rust-overlay,
    flake-utils,
    ...
  }:
    flake-utils.lib.eachDefaultSystem (
      system: let
        beyla = pkgs.stdenv.mkDerivation {
          pname = "grafana_beyla";
          version = "v1.8.4-alpha";

          src = pkgs.fetchzip {
            url = "https://github.com/grafana/beyla/releases/download/v1.8.4-alpha/beyla-linux-amd64-v1.8.4-alpha.tar.gz";
            hash = "sha256-hlXgm71bMhmP2QYBTMsU4sEQILpAxdGOJd1DUr6cdW8=";
            stripRoot = false;
          };

          installPhase = ''
            mkdir -p $out/bin
            cp $src/beyla $out/bin
          '';
        };
        rust = pkgs.rust-bin.selectLatestNightlyWith (toolchain: toolchain.default);
        overlays = [(import rust-overlay)];
        pkgs = import nixpkgs {
          inherit system overlays;
        };
      in
        with pkgs; {
          devShells.default = mkShell {
            buildInputs = [
              beyla
              go
              rust
            ];

            shellHook = ''
              ## quickstart guide stuff: https://grafana.com/docs/beyla/latest/quickstart/rust/
              export BEYLA_OPEN_PORT=8080
              export BEYLA_TRACE_PRINTER=text
              export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
              export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-gateway-prod-sa-east-1.grafana.net/otlp"
              export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <api_token?>"

              ## beyla connection steps: https://onic.grafana.net/connections/add-new-connection/beyla
              export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT="https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/metrics"
              export OTEL_EXPORTER_OTLP_METRICS_HEADERS="Authorization=Basic <value1_returned_in_script>"
              export BEYLA_OTEL_METRIC_FEATURES="application_span,application_service_graph,application,application_process"
              export BEYLA_NETWORK_METRICS="true"
              export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT="https://otlp-gateway-prod-sa-east-1.grafana.net/otlp/v1/traces"
              export OTEL_EXPORTER_OTLP_TRACES_HEADERS="Authorization=Basic <value2_returned_in_script>"
            '';
          };
        }
    );
}

This could come in handy when trying to reproduce the issue. To use it: 1- replace onic.grafana.net with your Grafana Cloud subdomain 2- replace the placeholders with your own values 3- run nix develop

ps: I'm aware no one has packaged beyla in nixpkgs yet. If no one has done it by the time I get beyla running successfully, I'll open the PR

nicolasauler commented 20 hours ago

Solution to quickstart

I got it to work. In the end, it was really just confusion over how to actually follow the quickstart. The minimal env variables needed are:

export BEYLA_OPEN_PORT=8080
export BEYLA_TRACE_PRINTER=text
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-gateway-prod-sa-east-1.grafana.net/otlp"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <token>"

but the token needs to be from https://grafana.com/orgs/<your_org>/stacks/<your_stack_id>/otlp-info.
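For reference, a Basic authorization value is the base64 encoding of "user:password". The sketch below assumes that, for the Grafana Cloud OTLP gateway, the user is the stack's instance ID and the password is the API token shown on the otlp-info page. That pairing is an assumption here, and make_basic_header is a hypothetical helper name. If the otlp-info page already gives you a ready-made header value, use that directly instead.

```shell
#!/bin/sh
# Hypothetical helper: build an OTLP headers value from a user and a secret.
# Assumes the gateway expects base64("<instance_id>:<api_token>").
make_basic_header() {
  # tr removes the line wrapping some base64 implementations add.
  encoded="$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
  printf 'Authorization=Basic %s' "$encoded"
}

# Placeholder values for illustration only; substitute your own.
INSTANCE_ID="${INSTANCE_ID:-user}"
API_TOKEN="${API_TOKEN:-pass}"

OTEL_EXPORTER_OTLP_HEADERS="$(make_basic_header "$INSTANCE_ID" "$API_TOKEN")"
export OTEL_EXPORTER_OTLP_HEADERS
```

Decoding the part after "Basic " with base64 -d is a quick way to check that an existing value really has the "id:token" shape.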

Maybe solution?

I think the reason my attempts at following https://onic.grafana.net/connections/add-new-connection/beyla didn't work was my Alloy config (or lack thereof).

Final remark

The thing I think we can improve is the quickstart documentation: it took me 2 days to figure out where to get this specific API token from. But I understand why we can't just provide a link, since it's customized (https://grafana.com/orgs/<your_org>/stacks/<your_stack_id>/otlp-info).

grcevski commented 18 hours ago

Thanks for all of this feedback! I think we need to improve our documentation. If you would like to contribute a PR, we'll gladly take it; if not, this write-up helps us a ton and we'll improve the docs.