elastic / apm-server

https://www.elastic.co/guide/en/apm/guide/current/index.html

APM returns a 404 when attempting to send OTLP data #6267

Closed MeijerM1 closed 3 years ago

MeijerM1 commented 3 years ago

APM Server version (7.15.0):

Description of the problem including expected versus actual behavior: The APM server returns a 404 error for the URL /opentelemetry.proto.collector.trace.v1.TraceService/Export when an application attempts to send OTLP data.

Complete error log:

{"log.level":"error","@timestamp":"2021-09-30T13:35:42.705Z","log.logger":"request","log.origin":{"file.name":"middleware/log_middleware.go","file.line":60},"message":"404 page not found","service.name":"apm-server","url.original":"/opentelemetry.proto.collector.trace.v1.TraceService/Export","http.request.method":"POST","user_agent.original":"grpc-dotnet/2.32.0.0","source.address":"10.100.16.123","http.request.id":"e8315ebe-0877-42cd-bb98-93d7d9385963","event.duration":105601,"http.response.status_code":404,"error.message":"404 page not found","ecs.version":"1.6.0"}

Steps to reproduce:

  1. Instrument a .NET Core 3.1 application using the following OpenTelemetry packages:
    <PackageReference Include="OpenTelemetry" Version="1.1.0" />
    <PackageReference Include="OpenTelemetry.Exporter.OpenTelemetryProtocol" Version="1.1.0" />
    <PackageReference Include="OpenTelemetry.Extensions.Hosting" Version="1.0.0-rc7" />
    <PackageReference Include="OpenTelemetry.Instrumentation.AspNetCore" Version="1.0.0-rc7" />
    <PackageReference Include="OpenTelemetry.Instrumentation.Http" Version="1.0.0-rc7" />
  2. Add the following configuration in the .NET Core app:
    services.AddOpenTelemetryTracing((builder) => builder
                .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyApplication"))
                .AddAspNetCoreInstrumentation()
                .AddHttpClientInstrumentation(options =>
                {
                    options.SetHttpFlavor = true;
                })
                .AddOtlpExporter(options =>
                {
                    options.Endpoint = new Uri("http://my-apm-server.my-namespace.svc:8200/");
                    options.Headers = "authorization=Bearer SecretToken";
                }));
  3. Run APM on Kubernetes with the following configuration:
    apiVersion: apm.k8s.elastic.co/v1
    kind: ApmServer
    metadata:
      name: apm-server-test
      namespace: logging
    spec:
      version: 7.15.0
      count: 1
      kibanaRef:
        name: my-kibana
      elasticsearchRef:
        name: my-elastic
      secureSettings:
      - secretName: apm-server-test-apm-token
      http:
        tls:
          selfSignedCertificate:
            disabled: true
      config:
        output:
          elasticsearch:
            hosts: ["http://my-elastic.my-namespace.svc:9200"]
            username: myUsername
            password: myPassword

Other relevant information

ECK Version: 1.7.1
Kubernetes: 1.21


axw commented 3 years ago

@MeijerM1 given that you're not using TLS, and you're using .NET Core 3.1, are you following https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol#special-case-when-using-insecure-channel?
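
The linked README section covers .NET Core 3.1 applications that export to a plain HTTP (non-TLS) endpoint: an AppContext switch has to be set before the OTLP exporter is added, otherwise the client does not speak HTTP/2 over the unencrypted connection. A minimal sketch, assuming the exporter configuration from the steps to reproduce above; the Startup class and the placement inside ConfigureServices are illustrative assumptions, not taken from the reporter's project:

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // .NET Core 3.1 only: allow HTTP/2 without TLS.
        // This must run before the OTLP exporter is added.
        AppContext.SetSwitch(
            "System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);

        services.AddOpenTelemetryTracing(builder => builder
            .SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("MyApplication"))
            .AddAspNetCoreInstrumentation()
            .AddOtlpExporter(options =>
            {
                // Plain http:// endpoint, matching the APM Server with TLS disabled.
                options.Endpoint = new Uri("http://my-apm-server.my-namespace.svc:8200/");
                options.Headers = "authorization=Bearer SecretToken";
            }));
    }
}
```

Setting the switch at the very start of Program.Main would presumably work as well; what matters is that it runs before the exporter is created.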

MeijerM1 commented 3 years ago

That solved our issue, thanks!

gbschenkel commented 3 years ago

I am having the same problem, but using the Java agent, with APM installed on OpenShift. Versions: OpenShift 4.9 || ECK 1.8.0 || APM 7.15.1 || opentelemetry-java 1.7.1

apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-dev
spec:
  version: 7.15.1
  count: 2
  config:
    monitoring:
      enable: true
  elasticsearchRef:
    name: elasticsearch-dev 
  kibanaRef:
    name: kibana-dev
  http:
    service:
      spec:
        type: LoadBalancer
  podTemplate:
    spec:
      containers:
        - name: apm-server
          resources:
            limits:
              cpu: '0.5'
              memory: 512Mi
            requests:
              cpu: '0.05'
              memory: 64Mi

The client was configured as instructed on https://www.elastic.co/guide/en/apm/get-started/current/open-telemetry-elastic.html (I will submit the code later since I don't have access right now).

I don't have much experience with OpenShift, so I can't say whether the problem is with the route or something else.

Using the Elastic protocol I am able to interact with APM over HTTP on port 80, but on 443 it doesn't work, maybe because of the self-signed certificate or the reencrypt policy on the OpenShift route (Kibana and Elasticsearch are working this way).

When trying to use OpenTelemetry, I start getting the error below:

{
  "log.level":"error",
  "@timestamp":"2021-11-08T18:29:26.536Z",
  "log.logger":"request",
  "log.origin": {
    "file.name":"middleware/log_middleware.go",
    "file.line":60
  },
  "message":"404 page not found",
  "service.name":"apm-server",
  "url.original":"/opentelemetry.proto.collector.trace.v1.TraceService/Export",
  "http.request.method":"POST",
  "user_agent.original":"grpc-go/1.41.0",
  "source.address":"10.7.105.10",
  "http.request.id":"1f75554c-e58e-4e0e-9cb5-9eb0663dbaa5",
  "event.duration":107114,
  "http.response.status_code":404,
  "error.message":"404 page not found",
  "ecs.version":"1.6.0"
}

I would like to know whether this is a bug or just a misconfiguration, and which part is causing the problem. Thanks

axw commented 3 years ago

@gbschenkel in future please open a new topic at discuss.elastic.co/c/observability/apm/58, rather than commenting on old issues.

The issue here is that OTLP/gRPC does not work with the LoadBalancer service type -- see https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/ for details, and some options of how to address this.

NicklasWallgren commented 10 months ago

@gbschenkel did you solve the issue? We have encountered the same problem.