jaegertracing / helm-charts

Helm Charts for Jaeger backend
Apache License 2.0

[Bug]:"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available" #443

Open navin-rai opened 1 year ago

navin-rai commented 1 year ago

What happened?

I am trying to use AWS Elasticsearch as the storage backend for Jaeger. I have set the endpoints as per the documentation, and I am using the latest version of the Jaeger Helm chart (apiVersion: v2, appVersion: 1.39.0). Below is the config I am using for Elasticsearch:

elasticsearch:
  scheme: https
  host: search-*****.us-east-1.es.amazonaws.com
  port: 443
  user: elastic
  usePassword: true
  password: *

Steps to reproduce

1. Add AWS ES endpoints
2. helm install jaeger

Expected behavior

Jaeger Collector and Jaeger Query should deploy properly on Kubernetes.

Relevant log output

2023/02/01 13:18:20 maxprocs: Leaving GOMAXPROCS=24: CPU quota undefined
{"level":"info","ts":1675257500.4474466,"caller":"flags/service.go:119","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1675257500.447506,"caller":"flags/service.go:125","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1675257500.4477003,"caller":"flags/admin.go:129","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1675257500.4477556,"caller":"flags/admin.go:143","msg":"Starting admin HTTP server","http-addr":":14269"}
{"level":"info","ts":1675257500.447806,"caller":"flags/admin.go:121","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
{"level":"fatal","ts":1675257506.1148498,"caller":"./main.go:82","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available","stacktrace":"main.main.func1\n\t./main.go:82\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t./main.go:155\nruntime.main\n\truntime/proc.go:250"}

Screenshot

No response

Additional context

No response

Jaeger backend version

No response

SDK

No response

Pipeline

No response

Storage backend

AWS Elasticsearch

Operating system

No response

Deployment model

No response

Deployment configs

No response

mehta-ankit commented 1 year ago

@navin-rai is your issue similar to this one: https://github.com/jaegertracing/helm-charts/issues/441 by any chance? I don't use OpenSearch, so I don't know what could go wrong when using it with Jaeger deployed from this Helm chart.

navin-rai commented 1 year ago

> @navin-rai is your issue similar to this one: #441 by any chance ? I don't use opensearch so I don't know what could go wrong with it when using it with jaeger deployed using this helm chart.

I tried the solution given in the PR, but it didn't work.

klubi commented 1 year ago

@navin-rai did you enable fine-grained access control on the OpenSearch domain? If yes, then proper credentials must be provided; if not, then you can't pass a username or password as environment variables. Another thing to check is AWS-level policies: can you confirm that pods running in your cluster are able to correctly resolve the OpenSearch address?
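One way to check the last point from inside the cluster is a small script run in a pod that has Python available. This is only a sketch, not part of the chart: the function name `check_endpoint` and the example host in the comment are placeholders. It reports separately whether the host resolves, whether a TCP connection succeeds, and whether a TLS handshake completes, which helps narrow down where "no Elasticsearch node available" comes from.

```python
import socket
import ssl


def check_endpoint(host: str, port: int = 443, timeout: float = 5.0,
                   use_tls: bool = True) -> dict:
    """Report whether host resolves, accepts TCP, and completes a TLS handshake."""
    result = {"dns_ok": False, "tcp_ok": False, "tls_ok": False, "error": None}
    try:
        # Step 1: DNS resolution (raises socket.gaierror, an OSError, on failure).
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns_ok"] = True

        # Step 2: plain TCP connection (security groups, firewalls, routing).
        with socket.create_connection((host, port), timeout=timeout) as sock:
            result["tcp_ok"] = True

            # Step 3: TLS handshake with normal certificate verification.
            if use_tls:
                context = ssl.create_default_context()
                with context.wrap_socket(sock, server_hostname=host):
                    result["tls_ok"] = True
    except (OSError, ssl.SSLError) as exc:
        # Record the first failure; the flags show how far the check got.
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result


# Example call (placeholder host, substitute your own domain endpoint):
# print(check_endpoint("search-example.us-east-1.es.amazonaws.com", 443))
```

If `dns_ok` is false, the problem is name resolution from the pod. If `tcp_ok` is false, it is more likely a network or access policy. A failure at `tls_ok` points at certificate or TLS settings. The script does not test credentials or AWS request signing.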

navin-rai commented 1 year ago

@klubi, hi. Here is what I am trying to do: I have created an AWS ES domain, but my Jaeger instance is not on AWS; it runs on-prem. The solution you provided gives me the manifest below for collector-deployment (similar for query-deployment):

Source: jaeger/templates/collector-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger-collector
  labels:
    helm.sh/chart: jaeger-0.67.0
    app.kubernetes.io/name: jaeger
    app.kubernetes.io/instance: jaeger
    app.kubernetes.io/version: "1.39.0"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: collector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: jaeger
      app.kubernetes.io/instance: jaeger
      app.kubernetes.io/component: collector
  template:
    metadata:
      annotations:
        checksum/config-env: dba5166ad9db9ba648c1032ebbd34dcd0d085b50023b839ef5c68ca1db93a563
      labels:
        app.kubernetes.io/name: jaeger
        app.kubernetes.io/instance: jaeger
        app.kubernetes.io/component: collector
    spec:
      securityContext: {}
      serviceAccountName: jaeger-collector
      containers:

navin-rai commented 1 year ago

@klubi I am not sure. Is it possible to use an AWS secret key and access key instead?

klubi commented 1 year ago

No, that's a completely different mechanism. My PR has not been merged yet, so you can't use it yet. What you can do to test your case is remove the lines below from the generated manifest.

- name: ES_USERNAME
  value: elastic
- name: ES_PASSWORD
  valueFrom:
    secretKeyRef:
      name: jaeger-elasticsearch

Also, you'd have to add the following to your collector values:

cmdlineParams:
  es.tls.enabled: true
  es.tls.skip-host-verify: true
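For anyone who wants to script the manual edit instead of deleting lines by hand, here is a rough sketch of the two changes as plain Python over already-parsed data. The function names are made up for illustration, and the sample `env` list is hypothetical (the manifest posted above is cut off before the container spec). It also assumes the chart turns each `cmdlineParams` entry into a `--key=value` container argument, which is worth checking against the rendered manifest.

```python
from typing import Any

# Environment variables that carry basic-auth credentials in the manifest above.
ES_CREDENTIAL_VARS = {"ES_USERNAME", "ES_PASSWORD"}


def strip_es_credentials(env: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return the container env list without the ES username/password entries."""
    return [entry for entry in env if entry.get("name") not in ES_CREDENTIAL_VARS]


def cmdline_params_to_args(params: dict[str, Any]) -> list[str]:
    """Render cmdlineParams as --key=value arguments (assumed chart behavior)."""
    args = []
    for key, value in params.items():
        if isinstance(value, bool):
            value = str(value).lower()  # YAML true/false rather than Python True/False
        args.append(f"--{key}={value}")
    return args


# Hypothetical env list, shaped like a parsed Deployment container spec.
env = [
    {"name": "ES_SERVER_URLS", "value": "https://search-example:443"},
    {"name": "ES_USERNAME", "value": "elastic"},
    {"name": "ES_PASSWORD",
     "valueFrom": {"secretKeyRef": {"name": "jaeger-elasticsearch"}}},
]

print(strip_es_credentials(env))
print(cmdline_params_to_args({"es.tls.enabled": True,
                              "es.tls.skip-host-verify": True}))
```

The first call leaves only the `ES_SERVER_URLS` entry, and the second shows the flags the collector would be started with under that assumption.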