jaegertracing / jaeger-operator

Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
https://www.jaegertracing.io/docs/latest/operator/
Apache License 2.0

failed to create primary Elasticsearch client with health check timeout - no Elasticsearch node available #496

Closed vishnuhd closed 5 years ago

vishnuhd commented 5 years ago

Hi,

I am trying to set up Elasticsearch storage with the Jaeger operator, but it keeps failing the initial health check.

Jaeger operator version - v1.12.1
K8s version - v1.12.8
Elasticsearch operator version - v0.8.1
Elasticsearch version - v7.2.0

The operators have been deployed and are running. When the Jaeger instance is deployed with the es backend, the collector and query pods fail with:

$ kubectl logs my-jaeger-collector-8477b87bfd-sdcq4
{"level":"info","ts":1562160343.721835,"caller":"flags/service.go:113","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1562160343.7220967,"caller":"flags/admin.go:108","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1562160343.7221494,"caller":"flags/admin.go:114","msg":"Starting admin HTTP server","http-port":14269}
{"level":"info","ts":1562160343.7221606,"caller":"flags/admin.go:100","msg":"Admin server started","http-port":14269,"health-status":"unavailable"}
{"level":"fatal","ts":1562160348.7503304,"caller":"collector/main.go:87","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: Head http://quickstart-es:9200: EOF: no Elasticsearch node available","errorVerbose":"no Elasticsearch node available\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.init.ializers\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:88\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:188\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nhealth check timeout: Head http://quickstart-es:9200: EOF
$ kubectl get po
NAME                                  READY     STATUS             RESTARTS   AGE
my-jaeger-agent-daemonset-pxbnn       1/1       Running            0          5m
my-jaeger-agent-daemonset-v4x47       1/1       Running            0          5m
my-jaeger-agent-daemonset-v8fc9       1/1       Running            0          5m
my-jaeger-collector-f9d9656db-bwcb6   0/1       CrashLoopBackOff   5          5m
my-jaeger-query-74576c764-fxj7h       1/2       CrashLoopBackOff   5          5m

Jaeger operator logs :

$ kubectl logs jaeger-operator-5ddcb7c446-ksj9m -n observability
time="2019-07-03T12:01:17Z" level=info msg=Versions arch=amd64 jaeger-operator=1.12.1 operator-sdk=v0.8.1 os=linux version=go1.12.5
time="2019-07-03T12:01:18Z" level=info msg="Auto-detected the platform" platform=kubernetes
time="2019-07-03T12:01:18Z" level=info msg="Automatically adjusted the 'es-provision' flag" es-provision=false
time="2019-07-03T13:14:22Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:09:22.662415944 +0000 UTC" instance=my-jaeger namespace=default
time="2019-07-03T13:19:23Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:14:23.743258079 +0000 UTC" instance=my-jaeger namespace=default
time="2019-07-03T13:24:24Z" level=error msg="failed to apply the changes" error="timed out waiting for the condition" execution="2019-07-03 13:19:24.779557298 +0000 UTC" instance=my-jaeger namespace=default

My spec file looks like this:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: my-jaeger
spec:
  strategy: production
  ui:
    options:
      dependencies:
        menuEnabled: false
      tracking:
        gaID: UA-000000-2
      menu:
        - label: "About Jaeger"
          items:
            - label: "Documentation"
              url: "https://www.jaegertracing.io/docs/latest"
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://quickstart-es:9200
    secretName: mysecret
  ingress:
    enabled: false
  agent:
    strategy: DaemonSet
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""

The ES service has been created and is running:

$ kubectl get svc quickstart-es
NAME            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
quickstart-es   ClusterIP   100.66.174.252   <none>        9200/TCP   4h

Elasticsearch cluster is in the same k8s cluster and namespace (I think) :

$ curl -u "elastic:$PASSWORD" -k "https://localhost:9200"
{
  "name" : "quickstart-es-96dchsxw4f",
  "cluster_name" : "quickstart",
  "cluster_uuid" : "someuuid",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

I have also tried giving the full DNS name in the Jaeger config, http://quickstart-es.default.svc.cluster.local:9200, and even the cluster IP.

My secret file looks like this (it was generated using $(kubectl get secret quickstart-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode) and the username elastic):

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  ES_USERNAME: ZWxhc3RpYw==
  ES_PASSWORD: somepasswordinbase64
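
For reference, values under a Secret's data field are base64-encoded, so the ES_USERNAME value above is simply elastic. A minimal Go sketch of that encoding follows; secretValue is a hypothetical helper name used only for illustration. The second call shows how a stray trailing newline, which is easy to pick up when piping through a shell, produces a different value.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// secretValue encodes a plain-text credential the way a Secret's data field expects.
// The function name is made up for this example.
func secretValue(plain string) string {
	return base64.StdEncoding.EncodeToString([]byte(plain))
}

func main() {
	fmt.Println(secretValue("elastic"))   // ZWxhc3RpYw==
	fmt.Println(secretValue("elastic\n")) // ZWxhc3RpYwo= (the newline changes the value)
}
```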

There are some more error logs in the ES cluster pod:

{"type": "server", "timestamp": "2019-07-03T13:47:18,707+0000", "level": "WARN", "component": "o.e.h.AbstractHttpServerTransport", "cluster.name": "quickstart", "node.name": "quickstart-es-96dchsxw4f", "cluster.uuid": "someuuid", "node.id": "someid",  "message": "caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=0.0.0.0/0.0.0.0:9200, remoteAddress=/xx.xx.x.xx:51134}" ,
"stacktrace": ["io.netty.handler.codec.DecoderException: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 48454144202f2048545450....",
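
The hex string in that exception can be read directly. A small Go sketch, with decodeRejectedBytes as a made-up helper name, decodes the bytes that Elasticsearch rejected:

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// decodeRejectedBytes turns the hex dump from a NotSslRecordException message
// into text. Trailing dots (the log's own truncation marker) are dropped first.
func decodeRejectedBytes(dump string) (string, error) {
	raw, err := hex.DecodeString(strings.TrimRight(dump, "."))
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	text, err := decodeRejectedBytes("48454144202f2048545450....")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", text) // prints "HEAD / HTTP"
}
```

The payload is the start of a plain-text HTTP request, which is consistent with the collector sending http:// traffic to a port that expects TLS.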

Can anyone please help with this case? Thanks.

vishnuhd commented 5 years ago

[Update]

The issue might be with the SSL connection needed to reach Elasticsearch. However, neither operator has docs that help with this.

After switching to https, the logs say:

x509: certificate signed by unknown authority

vishnuhd commented 5 years ago

Figured it out; it would be something like this:

spec:
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://${ES_URL}
        tls: "true"
        tls.ca: /tls/tls.ca
        tls.cert: /tls/tls.crt
        tls.key: /tls/tls.key
(…)
  volumeMounts:
    - name: es-tls
      mountPath: /tls
  volumes:
    - name: es-tls
      secret:
        secretName: es-tls

secat commented 5 years ago

@vishnuhd

It looks like you are using the Elastic Cloud on Kubernetes operator.

Have you been able to generate the secret es-tls?

secat commented 5 years ago

I could not figure out what to use to generate the es-tls secret.

I have this error:

{"level":"fatal","ts":1562261263.8572745,"caller":"collector/main.go:87","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: tls: private key does not match public key","errorVerbose":"tls: private key does not match public key\nfailed to create primary Elasticsearch client\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:82\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:86\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:177\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337","stacktrace":"main.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:87\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:177\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200"}

secat commented 5 years ago

@vishnuhd I am currently using:

secat commented 5 years ago

The elasticsearch operator generates the following secrets:

NAME                               TYPE                                  DATA   AGE
quickstart-ca                      Opaque                                1      4h43m
quickstart-ca-private-key          Opaque                                1      4h43m
quickstart-elastic-user            Opaque                                1      4h43m
quickstart-es                      Opaque                                2      4h41m
quickstart-es-7sc6vtf2xx-certs     Opaque                                3      4h43m
quickstart-es-7sc6vtf2xx-config    Opaque                                1      4h43m
quickstart-es-ca                   Opaque                                1      4h43m
quickstart-es-roles-users          Opaque                                3      4h43m
quickstart-extrafiles              Opaque                                1      4h43m
quickstart-internal-users          Opaque                                3      4h43m
quickstart-kibana-user             Opaque                                1      4h43m
quickstart-secure-settings         Opaque                                0      4h43m

I am using the contents of the quickstart-es-7sc6vtf2xx-certs secret (i.e. ca.pem and cert.pem) to generate tls.ca and tls.crt respectively, and the quickstart-ca-private-key secret (i.e. private.key) to generate tls.key.

However it is not working right now:

{"level":"info","ts":1562264781.8849409,"caller":"flags/service.go:113","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1562264781.88511,"caller":"flags/admin.go:108","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1562264781.8851607,"caller":"flags/admin.go:114","msg":"Starting admin HTTP server","http-port":14269}
{"level":"info","ts":1562264781.8852055,"caller":"flags/admin.go:100","msg":"Admin server started","http-port":14269,"health-status":"unavailable"}
{"level":"fatal","ts":1562264781.888368,"caller":"collector/main.go:87","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: tls: private key does not match public key","errorVerbose":"tls: private key does not match public key\nfailed to create primary Elasticsearch client\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:82\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:86\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:177\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337","stacktrace":"main.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:87\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:177\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200"}

More specifically "failed to create primary Elasticsearch client: tls: private key does not match public key".
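
That message is what Go's crypto/tls returns when a certificate and a private key do not belong together. Since tls.crt here comes from cert.pem while tls.key comes from quickstart-ca-private-key, the key presumably belongs to the CA certificate rather than to the node certificate. The sketch below reproduces the situation with throwaway ECDSA keys; newKeyAndCert and checkPair are hypothetical helper names, and nothing in it reads the real ECK secrets.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// newKeyAndCert returns a fresh ECDSA key and a self-signed certificate for it,
// both PEM-encoded. It stands in for a cert/key pair issued by some operator.
func newKeyAndCert(commonName string) (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// checkPair returns nil only if keyPEM is the private key for the certificate in certPEM.
func checkPair(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

func main() {
	nodeCert, nodeKey, err := newKeyAndCert("node")
	if err != nil {
		panic(err)
	}
	_, caKey, err := newKeyAndCert("ca")
	if err != nil {
		panic(err)
	}
	fmt.Println("node cert + node key:", checkPair(nodeCert, nodeKey))
	fmt.Println("node cert + CA key:  ", checkPair(nodeCert, caKey))
}
```

Running the same kind of check against the files that go into a secret, before mounting them, would show whether tls.crt and tls.key are a pair.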

Here is what my es-tls.yaml secret currently looks like:

apiVersion: v1
kind: Secret
metadata:
  name: es-tls
  namespace: observability
type: Opaque
stringData:
  tls.ca: |-
    -----BEGIN CERTIFICATE-----
    MIIDLDCCAhSgAwIBAgIQOIIPo2IGgAmK/fDePZiaFzANBgkqhkiG9w0BAQsFADAw
    MRkwFwYDVQQLExAyazZmdnZrcDhxbXR0Z3ZoMRMwEQYDVQQDEwpxdWlja3N0YXJ0
    MB4XDTE5MDcwNDEzNDU0MFoXDTIwMDcwMzEzNDY0MFowMDEZMBcGA1UECxMQMms2
    ZnZ2a3A4cW10dGd2aDETMBEGA1UEAxMKcXVpY2tzdGFydDCCASIwDQYJKoZIhvcN
    AQEBBQADggEPADCCAQoCggEBANBQ+0fnI1RLApmRHdLLhhF/E/LjWQIqrttZF79g
    UQD914BKxk/ufI98QkePySvpti+Z87hMRKJhWgJyY3OgyGEYVlHFJrti37EtL2Dv
    bryrMvA8d+C5TcqTB4hMmT5cIsL2GMm3YgE1RZk/YxS9D/B6rBradn9o7D9i7WC9
    U4HvxiwmPjnExendxnyr9GCPj6lOIG+CBhM0BsPiWSZAPtPX9yC2GvqAbqeGNcuw
    4RyKY3qWtkmeMom6SSuxUu5ujMiRguRftYfAaQc9KAwoKkClV34+FdT02kAuwpcy
    K7kDcqp3cq0hA+PsWk0cFmwa5uisV6CoPN72iOcb00Smq80CAwEAAaNCMEAwDgYD
    VR0PAQH/BAQDAgKEMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjAPBgNV
    HRMBAf8EBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IBAQBkN8yYJkNqsvwoLA/w6qYa
    Iz6qAGAU1DAIHz1ScKBzLteIgVbQ/p/QsmNTJEGaGkLw+dwM+pyR4LqIs3xyiNIZ
    4M0qENRTK04Z59/xjvcj220znkJs/l8O0WE5LoGke6flOHtDfjz93y3sjnoQU9Du
    DW+Ywv2GEbr8GGoXJGVIN3xRG2NK2fSUWLmDPnXHm1O0J7n0VtoB1qbAVtqfU93a
    vkDvyw8PJ46ByaVYtVfDrhohVsArIk/ipFEMTKGlBEO2Bt03gOjsdrYWBybMImqc
    hthOsIPMtAgnpyUncwAf8FDq6CS9WnFnk/aH9FZrvkQ4TmnvgpK9g3aB3AY8tJvs
    -----END CERTIFICATE-----
  tls.crt: |-
    -----BEGIN CERTIFICATE-----
    MIIEvjCCA6agAwIBAgIQT0snr35ZZpnpD0hr9seuhTANBgkqhkiG9w0BAQsFADAw
    MRkwFwYDVQQLExAyazZmdnZrcDhxbXR0Z3ZoMRMwEQYDVQQDEwpxdWlja3N0YXJ0
    MB4XDTE5MDcwNDEzMzczMloXDTIwMDcwMzEzNDczMlowZzETMBEGA1UECxMKcXVp
    Y2tzdGFydDFQME4GA1UEAxNHcXVpY2tzdGFydC1lcy03c2M2dnRmMnh4Lm5vZGUu
    cXVpY2tzdGFydC5vYnNlcnZhYmlsaXR5LmVzLmNsdXN0ZXIubG9jYWwwggEiMA0G
    CSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDbqAwqEZQ/Jou3lpsPpj6kh5kviMXa
    Y2B55xr1DoAfqzDQ3dtgHxwmQ99dztnv05bqH6y1mx6kMVaBMSKLKSIgCZXMO9W7
    PDnm6Jg/4craNmaQcuSe+spxQPHf242hl89T2888DVknrY1mU8nddCEkzHKYu1qX
    blc5pMfHVkuIC24ZTrNfKnaNtGocxQqQmpHD1FS5UZCiKudoM2wGMsqgBgwjLrw4
    JCzIfK8n9TARp1/ZZ//OF4q4bpog3ki88kGVQ5jTONZrXuk7FtP95lhUV9v3n67s
    OuIzlU2vpj01Zl+NBOPEFdWFH4R3s+8KKm1MoZ19EFbzRG45YDrfcHsFAgMBAAGj
    ggGbMIIBlzAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsG
    AQUFBwMCMIIBZAYDVR0RBIIBWzCCAVegUAYDVQQDoEkMR3F1aWNrc3RhcnQtZXMt
    N3NjNnZ0ZjJ4eC5ub2RlLnF1aWNrc3RhcnQub2JzZXJ2YWJpbGl0eS5lcy5jbHVz
    dGVyLmxvY2FsgkdxdWlja3N0YXJ0LWVzLTdzYzZ2dGYyeHgubm9kZS5xdWlja3N0
    YXJ0Lm9ic2VydmFiaWxpdHkuZXMuY2x1c3Rlci5sb2NhbIIYcXVpY2tzdGFydC1l
    cy03c2M2dnRmMnh4hwQKAQGjhwR/AAABhwQKZCoTgg1xdWlja3N0YXJ0LWVzgi1x
    dWlja3N0YXJ0LWVzLm9ic2VydmFiaWxpdHkuc3ZjLmNsdXN0ZXIubG9jYWyCF3F1
    aWNrc3RhcnQtZXMtZGlzY292ZXJ5gjdxdWlja3N0YXJ0LWVzLWRpc2NvdmVyeS5v
    YnNlcnZhYmlsaXR5LnN2Yy5jbHVzdGVyLmxvY2FsMA0GCSqGSIb3DQEBCwUAA4IB
    AQBt/qQGHkC8aJxSph+zfn+3fhvcj/g/7bH4IfWQq+zYtQ5uTbiSmVSg/t2zVvWO
    oR2CSwCCAfVllVY9zjDt+we//dbGFlWTaceXGodTU/yK+unVAi6mX9E5FyTtguRD
    ZdMukV1qF4CYkYYmbOo2LnmUNGe1hL7An0OEgnDo+8ZoTU5F1n7D1xSVmrc914xI
    /G+yx5Eiw0rctcADEQawUTPUjcZWq6CvQVf2B5+X6K33BURqxlQDGFhsJFFuD2io
    ZkQM9sNw8BZhhgch18AL26100HvxCLdF86ys3YArRloTw97k3Y7F0Nl40tQ5PnRW
    q1mIaCjxwhk3Gy1dTCfERyv8
    -----END CERTIFICATE-----
  tls.key: |-
    -----BEGIN RSA PRIVATE KEY-----
    MIIEpAIBAAKCAQEA0FD7R+cjVEsCmZEd0suGEX8T8uNZAiqu21kXv2BRAP3XgErG
    T+58j3xCR4/JK+m2L5nzuExEomFaAnJjc6DIYRhWUcUmu2LfsS0vYO9uvKsy8Dx3
    4LlNypMHiEyZPlwiwvYYybdiATVFmT9jFL0P8HqsGtp2f2jsP2LtYL1Tge/GLCY+
    OcTF6d3GfKv0YI+PqU4gb4IGEzQGw+JZJkA+09f3ILYa+oBup4Y1y7DhHIpjepa2
    SZ4yibpJK7FS7m6MyJGC5F+1h8BpBz0oDCgqQKVXfj4V1PTaQC7ClzIruQNyqndy
    rSED4+xaTRwWbBrm6KxXoKg83vaI5xvTRKarzQIDAQABAoIBAQCmh8aJcY59qUVX
    zHmh9Q+lVwh0iCi0obiNI4jndbDr8QFgzuYAKi+raPN3T8vLbhc1sIX0VAweH2Mc
    R6OXYPYvIIyI6+mNrXoTooKYpG/LJbUf9ccDgD9e7PD9lfZ/spobby7butz/CD4u
    R00G3Cks3nRNN025hwAtoAER7+gdG0axvkjocHbxIswYkViW8mE4d0hnzzl31Gdj
    iFgNZYySaMiStv5PObAE8p3+xWAINMmL2gNyEid7YsLg7uC0XWzy6IYnUuGI1DaI
    NqkJmroCO06oeKWvtLXO8U8lCaH6XpJdTbm+ARvy3JKkAoEwxwXfQoOxiIG068Xe
    3dUWwNihAoGBAPCYAp+vY0Lp/LxHWGVQg5S5vJXNHAYSwuTeItjJpbZeAvDtXWfq
    Qoc7Xv9WeNqWzg/VnAd+M05qSkoO6zv7gaP/s9xYC/nfe+Y8MZfyWY0/msT8YQNi
    tCQLjMa/+pZEJqL3nQYiUyR4CxH1L2U9qTsJXpsV3vK+EBTIezueRZX1AoGBAN2n
    2wjUV/KB9Z/3wuSqOvnmkgBhQ/DeVL8imVEVM5pAavil+OXZ1538zh2VpIwMZ2iH
    yRhFR6Islx7AvuiTkCVFXMmvwLpzqF3FrdT8sMi7ahOFs+YmYraWoyzzmRmvg/Mh
    THJHs2xZLW3FGqlqaHjcyQnbvzt46RkeSCZhvb95AoGAcWWkLvFyXnJ8fZ0+65m0
    OuAEI7Ll13L2Svrr/7OjGD5dMoMd+EFwk96G2uA93AEiJFJw1RNFSVtNonQ/qSjU
    pKB7fIo/MsmD0zNhyJUgYjOtVdUCQJ8/+pE7C94mVLbQYxVD/EUnXNP7m74tVZFn
    dvzmi0AWseClIbaQZrwlXhUCgYBWqEdT/mCb6P80mVLSv1LrXJ98EorTYrjTOR2j
    u5w/FCw+JfVXN4G6vJmAq353WmobTerq2DsXRkOWvFhm6ToTuDh8iX/Z5VnPv3ck
    q94ZvFvOYhlhQ2SYafBFpL8Ycawuo7gVfb7B/2NpZQP1dCqABiF6/zSWdcD8FwCy
    MMhUUQKBgQC75HNhRboeBUHTFpK56hnVqhCOAd+XnM9pNsP0fv/t6NdJ82kwY5bU
    TSuC82lJY1LNTvelt7NNrBH1QP67udU8iL+62Ewivst/3ng6uTIYeL3fQpsWeZPd
    WSf1BAL28xFlRkwhQK/k1ZuIxpcNq0GyypDsKDw4L8kGhX6ONLxhPA==
    -----END RSA PRIVATE KEY-----

vishnuhd commented 5 years ago

Hi @secat, I was not able to configure the TLS secret either; it kept failing with bad certificate errors. I have had my plate full these days and couldn't look into it much deeper, so I disabled TLS on Elasticsearch for the time being. Here are a few suggestions I got from Gitter:

Let us know if you were able to resolve this and how. Thanks.

jpkrohling commented 5 years ago

> I assume you generated the certs by yourself

When I said that, it wasn't clear that es-operator was involved. In that case, it should be generating the certs for you. If you are using OpenShift and have the es-operator running, the Jaeger Operator can take care of provisioning the ES cluster for you.

secat commented 5 years ago

The certs were auto-generated by the ElasticSearch Operator (the Elastic Cloud on Kubernetes (ECK) version).

However, I will try to connect everything without using TLS.

secat commented 5 years ago

I just realized the TLS can't be disabled on the ElasticSearch Operator...

vishnuhd commented 5 years ago

> I just realized the TLS can't be disabled on the ElasticSearch Operator...

Yes, I had to install Elasticsearch manually via kube manifests. You could use Helm instead, if you prefer.

secat commented 5 years ago

@jpkrohling I am not using OpenShift; I am using Kubernetes on GKE.

secat commented 5 years ago

@jpkrohling I tried the option tls.skip-host-verify: "true" and the collector is still verifying the certificates...

$ kubectl --namespace=observability logs local-jaeger-tracing-collector-57f86dc65d-fl9gj -f
2019/07/05 15:06:27 maxprocs: Leaving GOMAXPROCS=6: CPU quota undefined
{"level":"info","ts":1562339187.077162,"caller":"flags/service.go:115","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1562339187.0772986,"caller":"flags/admin.go:108","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1562339187.077337,"caller":"flags/admin.go:114","msg":"Starting admin HTTP server","http-port":14269}
{"level":"info","ts":1562339187.0773468,"caller":"flags/admin.go:100","msg":"Admin server started","http-port":14269,"health-status":"unavailable"}
{"level":"fatal","ts":1562339192.1214948,"caller":"collector/main.go:89","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: Head https://quickstart-es-http.observability.svc:9200: x509: certificate is valid for quickstart-es-lqtqnnhx57.node.quickstart.observability.es.local, not quickstart-es-http.observability.svc: no Elasticsearch node available","errorVerbose":"no Elasticsearch node available\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.init.ializers\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:88\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:188\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nhealth check timeout: Head https://quickstart-es-http.observability.svc:9200: x509: certificate is valid for quickstart-es-lqtqnnhx57.node.quickstart.observability.es.local, not quickstart-es-http.observability.svc\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.(*Client).startupHealthcheck\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:1116\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.NewClient\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:244\ngithub.com/jaegertracing/jaeger/pkg/es/config.(*Configuration).NewClient\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/pkg/es/config/config.go:100\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:80\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:88\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nfailed to create primary Elasticsearch client\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:82\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:88\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337","stacktrace":"main.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:89\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200"}

Is it possible to skip tls verification?
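
The x509 error in the log above is a hostname check: the host in the server URL has to appear among the certificate's subject alternative names. The following Go sketch uses throwaway self-signed certificates to show the check failing and then passing; certFor is a hypothetical helper, and the certificates are generated locally rather than taken from the ECK secrets.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// certFor builds a throwaway self-signed certificate whose DNS SANs are dnsNames.
func certFor(dnsNames ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	const nodeName = "quickstart-es-lqtqnnhx57.node.quickstart.observability.es.local"
	const serviceName = "quickstart-es-http.observability.svc"

	nodeOnly, err := certFor(nodeName)
	if err != nil {
		panic(err)
	}
	withService, err := certFor(nodeName, serviceName)
	if err != nil {
		panic(err)
	}

	// A nil result means the name is covered by the certificate.
	fmt.Println("node name only:  ", nodeOnly.VerifyHostname(serviceName))
	fmt.Println("with service SAN:", withService.VerifyHostname(serviceName))
}
```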

Here is the Jaeger CR:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: local-jaeger-tracing
  namespace: observability
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.observability.svc.cluster.local:9200
        tls.skip-host-verify: "true"
    secretName: quickstart-es
  ingress:
    enabled: false
  agent:
    strategy: Sidecar

Here is part of the collector deployment generated by the Jaeger Operator:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    prometheus.io/port: "14269"
    prometheus.io/scrape: "true"
    sidecar.istio.io/inject: "false"
  creationTimestamp: "2019-07-05T15:21:22Z"
  generation: 1
  labels:
    app: jaeger
    app.kubernetes.io/component: collector
    app.kubernetes.io/instance: local-jaeger-tracing
    app.kubernetes.io/managed-by: jaeger-operator
    app.kubernetes.io/name: local-jaeger-tracing-collector
    app.kubernetes.io/part-of: jaeger
  name: local-jaeger-tracing-collector
  namespace: observability
  ownerReferences:
  - apiVersion: jaegertracing.io/v1
    controller: true
    kind: Jaeger
    name: local-jaeger-tracing
    uid: f6e47569-9f37-11e9-97cd-00155d25521a
  resourceVersion: "24889"
  selfLink: /apis/extensions/v1beta1/namespaces/observability/deployments/local-jaeger-tracing-collector
  uid: 8b9291d9-9f38-11e9-97cd-00155d25521a
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: jaeger
      app.kubernetes.io/component: collector
      app.kubernetes.io/instance: local-jaeger-tracing
      app.kubernetes.io/managed-by: jaeger-operator
      app.kubernetes.io/name: local-jaeger-tracing-collector
      app.kubernetes.io/part-of: jaeger
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io/port: "14269"
        prometheus.io/scrape: "true"
        sidecar.istio.io/inject: "false"
      creationTimestamp: null
      labels:
        app: jaeger
        app.kubernetes.io/component: collector
        app.kubernetes.io/instance: local-jaeger-tracing
        app.kubernetes.io/managed-by: jaeger-operator
        app.kubernetes.io/name: local-jaeger-tracing-collector
        app.kubernetes.io/part-of: jaeger
    spec:
      containers:
      - args:
        - --es.server-urls=https://quickstart-es-http.observability.svc.cluster.local:9200
        - --es.tls.skip-host-verify=true
        - --sampling.strategies-file=/etc/jaeger/sampling/sampling.json
        env:
        - name: SPAN_STORAGE_TYPE
          value: elasticsearch
        - name: COLLECTOR_ZIPKIN_HTTP_PORT
          value: "9411"
        envFrom:
        - secretRef:
            name: quickstart-es
        image: jaegertracing/jaeger-collector:1.13
        imagePullPolicy: IfNotPresent
        name: jaeger-collector
        ports:
        - containerPort: 9411
          name: zipkin
          protocol: TCP
        - containerPort: 14267
          name: c-tchan-trft
          protocol: TCP
        - containerPort: 14268
          name: c-binary-trft
          protocol: TCP
        - containerPort: 14269
          name: admin-http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /
            port: 14269
            scheme: HTTP
          initialDelaySeconds: 1
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/jaeger/sampling
          name: local-jaeger-tracing-sampling-configuration-volume
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: local-jaeger-tracing
      serviceAccountName: local-jaeger-tracing
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: sampling
            path: sampling.json
          name: local-jaeger-tracing-sampling-configuration
        name: local-jaeger-tracing-sampling-configuration-volume

secat commented 5 years ago

I was able to update Elasticsearch with subject alternative names (SANs):

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: quickstart
  namespace: observability
spec:
  version: 6.8.0
  http:
    tls:
      selfSignedCertificate:
        subjectAltNames:
        - dns: quickstart-es-http.observability.svc.cluster.local
        - dns: quickstart-es-http
  nodes:
  - nodeCount: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true

However, it is still not working.

Error: x509: certificate signed by unknown authority

With this Jaeger CR:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: local-jaeger-tracing
  namespace: observability
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.observability.svc.cluster.local:9200
        tls: "true"
        tls.ca: /certs/tls.ca
        tls.cert: /certs/tls.crt
        tls.key: /certs/tls.key
    secretName: quickstart-es
  ingress:
    enabled: false
  agent:
    strategy: Sidecar
  volumeMounts:
  - name: es-tls
    mountPath: /certs
  volumes:
  - name: es-tls
    secret:
      secretName: es-tls

I have these error logs in the collector:

2019/07/05 20:02:21 maxprocs: Leaving GOMAXPROCS=6: CPU quota undefined
{"level":"info","ts":1562356941.834654,"caller":"flags/service.go:115","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1562356941.8347921,"caller":"flags/admin.go:108","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1562356941.8348365,"caller":"flags/admin.go:114","msg":"Starting admin HTTP server","http-port":14269}
{"level":"info","ts":1562356941.834907,"caller":"flags/admin.go:100","msg":"Admin server started","http-port":14269,"health-status":"unavailable"}
{"level":"fatal","ts":1562356946.874876,"caller":"collector/main.go:89","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: Head https://quickstart-es-http.observability.svc.cluster.local:9200: x509: certificate signed by unknown authority: no Elasticsearch node available","errorVerbose":"no Elasticsearch node available\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.init.ializers\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:88\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:188\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nhealth check timeout: Head https://quickstart-es-http.observability.svc.cluster.local:9200: x509: certificate signed by unknown authority\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.(*Client).startupHealthcheck\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:1116\ngithub.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic%2ev5.NewClient\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/gopkg.in/olivere/elastic.v5/client.go:244\ngithub.com/jaegertracing/jaeger/pkg/es/config.(*Configuration).NewClient\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/pkg/es/config/config.go:100\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:80\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:88\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/gith
ub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/asm_amd64.s:1337\nfailed to create primary Elasticsearch client\ngithub.com/jaegertracing/jaeger/plugin/storage/es.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/es/factory.go:82\ngithub.com/jaegertracing/jaeger/plugin/storage.(*Factory).Initialize\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/plugin/storage/factory.go:107\nmain.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:88\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200\nruntime.goexit\n\t/home/travis/.gimme/versions/go1.
12.1.linux.amd64/src/runtime/asm_amd64.s:1337","stacktrace":"main.main.func1\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:89\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:762\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).ExecuteC\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:852\ngithub.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra.(*Command).Execute\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/github.com/spf13/cobra/command.go:800\nmain.main\n\t/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/main.go:180\nruntime.main\n\t/home/travis/.gimme/versions/go1.12.1.linux.amd64/src/runtime/proc.go:200"}

Here is the generated collector deployment:

{
  "kind": "Deployment",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "local-jaeger-tracing-collector",
    "namespace": "observability",
    "selfLink": "/apis/extensions/v1beta1/namespaces/observability/deployments/local-jaeger-tracing-collector",
    "uid": "bb83f466-9f5f-11e9-97cd-00155d25521a",
    "resourceVersion": "59575",
    "generation": 1,
    "creationTimestamp": "2019-07-05T20:01:52Z",
    "labels": {
      "app": "jaeger",
      "app.kubernetes.io/component": "collector",
      "app.kubernetes.io/instance": "local-jaeger-tracing",
      "app.kubernetes.io/managed-by": "jaeger-operator",
      "app.kubernetes.io/name": "local-jaeger-tracing-collector",
      "app.kubernetes.io/part-of": "jaeger"
    },
    "annotations": {
      "deployment.kubernetes.io/revision": "1",
      "prometheus.io/port": "14269",
      "prometheus.io/scrape": "true",
      "sidecar.istio.io/inject": "false"
    },
    "ownerReferences": [
      {
        "apiVersion": "jaegertracing.io/v1",
        "kind": "Jaeger",
        "name": "local-jaeger-tracing",
        "uid": "abea51ca-9f5f-11e9-97cd-00155d25521a",
        "controller": true
      }
    ]
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "app": "jaeger",
        "app.kubernetes.io/component": "collector",
        "app.kubernetes.io/instance": "local-jaeger-tracing",
        "app.kubernetes.io/managed-by": "jaeger-operator",
        "app.kubernetes.io/name": "local-jaeger-tracing-collector",
        "app.kubernetes.io/part-of": "jaeger"
      }
    },
    "template": {
      "metadata": {
        "creationTimestamp": null,
        "labels": {
          "app": "jaeger",
          "app.kubernetes.io/component": "collector",
          "app.kubernetes.io/instance": "local-jaeger-tracing",
          "app.kubernetes.io/managed-by": "jaeger-operator",
          "app.kubernetes.io/name": "local-jaeger-tracing-collector",
          "app.kubernetes.io/part-of": "jaeger"
        },
        "annotations": {
          "prometheus.io/port": "14269",
          "prometheus.io/scrape": "true",
          "sidecar.istio.io/inject": "false"
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "es-tls",
            "secret": {
              "secretName": "es-tls",
              "defaultMode": 420
            }
          },
          {
            "name": "local-jaeger-tracing-sampling-configuration-volume",
            "configMap": {
              "name": "local-jaeger-tracing-sampling-configuration",
              "items": [
                {
                  "key": "sampling",
                  "path": "sampling.json"
                }
              ],
              "defaultMode": 420
            }
          }
        ],
        "containers": [
          {
            "name": "jaeger-collector",
            "image": "jaegertracing/jaeger-collector:1.13",
            "args": [
              "--es.server-urls=https://quickstart-es-http.observability.svc.cluster.local:9200",
              "--es.tls.ca=/certs/tls.ca",
              "--es.tls.cert=/certs/tls.crt",
              "--es.tls.key=/certs/tls.key",
              "--es.tls=true",
              "--sampling.strategies-file=/etc/jaeger/sampling/sampling.json"
            ],
            "ports": [
              {
                "name": "zipkin",
                "containerPort": 9411,
                "protocol": "TCP"
              },
              {
                "name": "c-tchan-trft",
                "containerPort": 14267,
                "protocol": "TCP"
              },
              {
                "name": "c-binary-trft",
                "containerPort": 14268,
                "protocol": "TCP"
              },
              {
                "name": "admin-http",
                "containerPort": 14269,
                "protocol": "TCP"
              }
            ],
            "envFrom": [
              {
                "secretRef": {
                  "name": "quickstart-es"
                }
              }
            ],
            "env": [
              {
                "name": "SPAN_STORAGE_TYPE",
                "value": "elasticsearch"
              },
              {
                "name": "COLLECTOR_ZIPKIN_HTTP_PORT",
                "value": "9411"
              }
            ],
            "resources": {},
            "volumeMounts": [
              {
                "name": "es-tls",
                "mountPath": "/certs"
              },
              {
                "name": "local-jaeger-tracing-sampling-configuration-volume",
                "readOnly": true,
                "mountPath": "/etc/jaeger/sampling"
              }
            ],
            "readinessProbe": {
              "httpGet": {
                "path": "/",
                "port": 14269,
                "scheme": "HTTP"
              },
              "initialDelaySeconds": 1,
              "timeoutSeconds": 1,
              "periodSeconds": 10,
              "successThreshold": 1,
              "failureThreshold": 3
            },
            "terminationMessagePath": "/dev/termination-log",
            "terminationMessagePolicy": "File",
            "imagePullPolicy": "IfNotPresent"
          }
        ],
        "restartPolicy": "Always",
        "terminationGracePeriodSeconds": 30,
        "dnsPolicy": "ClusterFirst",
        "serviceAccountName": "local-jaeger-tracing",
        "serviceAccount": "local-jaeger-tracing",
        "securityContext": {},
        "schedulerName": "default-scheduler"
      }
    },
    "strategy": {
      "type": "RollingUpdate",
      "rollingUpdate": {
        "maxUnavailable": "25%",
        "maxSurge": "25%"
      }
    },
    "revisionHistoryLimit": 10,
    "progressDeadlineSeconds": 600
  },
  "status": {
    "observedGeneration": 1,
    "replicas": 1,
    "updatedReplicas": 1,
    "unavailableReplicas": 1,
    "conditions": [
      {
        "type": "Available",
        "status": "False",
        "lastUpdateTime": "2019-07-05T20:01:52Z",
        "lastTransitionTime": "2019-07-05T20:01:52Z",
        "reason": "MinimumReplicasUnavailable",
        "message": "Deployment does not have minimum availability."
      },
      {
        "type": "Progressing",
        "status": "True",
        "lastUpdateTime": "2019-07-05T20:01:52Z",
        "lastTransitionTime": "2019-07-05T20:01:52Z",
        "reason": "ReplicaSetUpdated",
        "message": "ReplicaSet \"local-jaeger-tracing-collector-6bdd6ff87\" is progressing."
      }
    ]
  }
}

vishnuhd commented 5 years ago

@secat, were you able to resolve it?

secat commented 5 years ago

@vishnuhd I followed the instructions here to add alternative SANs: https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-accessing-elastic-services.html#k8s-request-elasticsearch-endpoint

However, it is still not working as of this morning. It seems client TLS is not enabled even with --es.tls=true...
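
To test the CA file independently of Jaeger, the HEAD request shown in the collector log can be repeated from any pod that mounts the same secret. The following is a rough Python sketch, not Jaeger's own client code; `build_context` and `probe` are names made up for this example.

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def build_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying TLS context.

    If ca_file is given, only that PEM bundle is trusted; otherwise the
    system default CA store is used.
    """
    # create_default_context turns on certificate and hostname verification.
    return ssl.create_default_context(cafile=ca_file)


def probe(url: str, ca_file: Optional[str] = None, timeout: float = 5.0) -> str:
    """Send a HEAD request to url and describe what happened."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(
            request, timeout=timeout, context=build_context(ca_file)
        ) as response:
            return f"ok: HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # An HTTP error status (for example 401) means the TLS handshake worked.
        return f"tls ok: HTTP {err.code}"
    except urllib.error.URLError as err:
        if isinstance(err.reason, ssl.SSLCertVerificationError):
            return f"certificate rejected: {err.reason.verify_message}"
        return f"connection failed: {err.reason}"
```

Calling `probe("https://quickstart-es-http.observability.svc.cluster.local:9200", "/certs/tls.ca")` from inside the cluster, with the URL and CA path taken from the CR above, should then report either a certificate problem or an HTTP status. A 401 would mean the handshake itself succeeded and only the credentials are missing.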

secat commented 5 years ago

@vishnuhd I finally figured out how it works! I installed Delve inside the jaeger-collector container and ran a remote debugging session to work out the right configuration.

Here is my jaeger custom resource manifest:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: local-jaeger-tracing
  namespace: observability
spec:
  strategy: allInOne
  allInOne:
    image: docker.io/jaegertracing/all-in-one:1.13
    options:
      collector:
        zipkin:
          http-port: "9411"
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.observability.svc.cluster.local:9200
        tls.ca: /etc/ssl/certs/tls.crt
    secretName: quickstart-es
  ingress:
    enabled: false
  agent:
    strategy: Sidecar
  volumeMounts:
  - name: es-tls
    mountPath: /etc/ssl/certs
  volumes:
  - name: es-tls
    secret:
      secretName: quickstart-es-http-certs-public

NOTE: I have also successfully run Jaeger with the production strategy.

I am using the Elasticsearch operator from Elastic, built from the master branch (probably the future v0.9.0). The operator generates the secret named quickstart-es-http-certs-public.

vishnuhd commented 5 years ago

@secat Thanks, I will give it a try and get back to you here.

secat commented 5 years ago

@vishnuhd Here is a simplified set of deployment manifests:

ElasticSearch:

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: quickstart
  namespace: observability
spec:
  version: 6.8.0
  nodes:
  - nodeCount: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true

Kibana:

apiVersion: kibana.k8s.elastic.co/v1alpha1
kind: Kibana
metadata:
  name: quickstart
  namespace: observability
spec:
  version: 6.8.0
  nodeCount: 1
  elasticsearchRef:
    name: quickstart
  http:
    tls:
      selfSignedCertificate:
        disabled: true

Jaeger:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: local-jaeger-tracing
  namespace: observability
spec:
  strategy: allInOne
  allInOne:
    image: docker.io/jaegertracing/all-in-one:1.13
    options:
      collector:
        zipkin:
          http-port: "9411"
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://quickstart-es-http.observability.svc:9200
        tls.ca: /etc/ssl/certs/tls.crt
    secretName: quickstart-es
  ingress:
    enabled: false
  agent:
    strategy: Sidecar
  volumeMounts:
  - name: es-tls
    mountPath: /etc/ssl/certs
  volumes:
  - name: es-tls
    secret:
      secretName: quickstart-es-http-certs-public

Example of quickstart-es secret manifest generated manually:

apiVersion: v1
kind: Secret
metadata:
  name: quickstart-es
  namespace: observability
type: Opaque
data:
  ES_USERNAME: ZWxhc3RpYw==
  ES_PASSWORD: Nm5qa21rbnZ0Z3AyZnZsaHF3OWtoN3J4 # 6njkmknvtgp2fvlhqw9kh7rx
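
The values under data are the base64 encoding of the plain-text credentials. As a small illustration (a Python sketch; `secret_data` is a hypothetical helper, not part of either operator), they can be generated without a stray trailing newline like this:

```python
import base64
from typing import Dict


def secret_data(username: str, password: str) -> Dict[str, str]:
    """Build the data map of the Secret from plain-text credentials."""

    def encode(value: str) -> str:
        # Encode exactly the given characters; no newline is appended.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {"ES_USERNAME": encode(username), "ES_PASSWORD": encode(password)}


# Credentials taken from the manifest above.
print(secret_data("elastic", "6njkmknvtgp2fvlhqw9kh7rx"))
```

This should reproduce the two encoded values shown in the manifest.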

vishnuhd commented 5 years ago

@secat Great ! Thanks. It's more than enough.

vishnuhd commented 5 years ago

The ElasticSearch Operator generates the secret resource named quickstart-es-http-certs-public

@secat, Elastic operator version 0.8.1 doesn't create a secret with that name for me. It creates:

Secrets
======
observability    quickstart-ca                                    Opaque                                1         12m
observability    quickstart-ca-private-key                        Opaque                                1         12m
observability    quickstart-elastic-user                          Opaque                                1         12m
observability    quickstart-es-2f5spqgbvh-certs                   Opaque                                3         12m
observability    quickstart-es-2f5spqgbvh-config                  Opaque                                1         12m
observability    quickstart-es-ca                                 Opaque                                1         12m
observability    quickstart-es-roles-users                        Opaque                                3         12m
observability    quickstart-extrafiles                            Opaque                                1         12m
observability    quickstart-internal-users                        Opaque                                3         12m
observability    quickstart-kibana-user                           Opaque                                1         12m
observability    quickstart-secure-settings                       Opaque                                0         12m

The quickstart-es-2f5spqgbvh-certs secret contains the following keys, but not the tls.crt that is needed:

Data
====
ca.pem:    1164 bytes
cert.pem:  2872 bytes
csr.pem:   585 bytes

And quickstart-ca contains :

Data
====
ca.pem:  1164 bytes

Is the needed cert secret a feature of the master branch? And can we use quickstart-ca to make it work?

secat commented 5 years ago

I was using the master branch (future v0.9.0). The quickstart-ca secret should work if the entire certificate chain is present. You should also check the SANs in the certificate.
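
A quick way to see whether a PEM file such as ca.pem holds a single certificate or a whole chain is to count its CERTIFICATE blocks. Below is a minimal Python sketch; `split_pem_bundle` and `count_certificates` are made-up helper names, and the code only splits the text without validating signatures or issuer order.

```python
import re
from typing import List

# One PEM certificate block, matched non-greedily so that adjacent
# blocks in a bundle are returned separately.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(text: str) -> List[str]:
    """Return every PEM certificate block found in text, in file order."""
    return PEM_BLOCK.findall(text)


def count_certificates(path: str) -> int:
    """Count the certificates stored in the PEM file at path."""
    with open(path, "r", encoding="ascii") as handle:
        return len(split_pem_bundle(handle.read()))
```

A result of 1 for ca.pem would mean the file contains only one certificate, so any intermediate certificates would have to come from somewhere else.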

vishnuhd commented 5 years ago

@secat, I am having a hard time setting up TLS for Jaeger with the ES backend. Here is what I did:

  1. Deployed the Jaeger operator from master.
  2. Deployed the Elasticsearch operator from https://github.com/elastic/cloud-on-k8s (v0.8.1).
  3. Deployed the ES cluster:
    apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
    kind: Elasticsearch
    metadata:
      name: quickstart
      namespace: observability
    spec:
      version: 6.8.0
      nodes:
      - nodeCount: 1
        config:
          node.master: true
          node.data: true
          node.ingest: true
  4. Deployed the Jaeger instance:
    apiVersion: jaegertracing.io/v1
    kind: Jaeger
    metadata:
      name: my-jaeger
      namespace: observability
    spec:
      strategy: production
      ui:
        options:
          dependencies:
            menuEnabled: false
          tracking:
            gaID: UA-000000-2
          menu:
            - label: "About Jaeger"
              items:
                - label: "Documentation"
                  url: "https://www.jaegertracing.io/docs/latest"
      storage:
        type: elasticsearch
        options:
          es:
            server-urls: https://quickstart-es.observability.svc.cluster.local:9200
            tls.ca: /etc/ssl/certs/ca.pem
        secretName: mysecret
      ingress:
        enabled: false
      agent:
        strategy: DaemonSet
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ""
      volumeMounts:
      - name: es-tls
        mountPath: /etc/ssl/certs
      volumes:
      - name: es-tls
        secret:
          secretName: quickstart-ca

    The quickstart-ca secret is created by the ES operator and contains ca.pem.

  5. Created the secret with the ES user and password:
    apiVersion: v1
    kind: Secret
    metadata:
      name: mysecret
      namespace: observability
    type: Opaque
    data:
      ES_USERNAME: ZWxhc3RpYwo=       #Base64
      ES_PASSWORD: ejJuczdwNWJrYm5wOHF0c3dkbmJ4bjd4Cg==      #Base64

    But both the collector and query pods fail with these logs:

    {"level":"fatal","ts":1562781276.0102606,"caller":"collector/main.go:89","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available","errorVerbose":"no Elasticsearch node available\ngithub.com/jaegertracing

    The ES cluster node logs show:

    [2019-07-10T17:03:30,644][WARN ][o.e.h.n.Netty4HttpServerTransport] [quickstart-es-2f5spqgbvh] caught exception while handling client http traffic, closing connection [id: 0xd76cf43a, L:0.0.0.0/0.0.0.0:9200 ! R:/100.96.2.77:52448]
    io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: unknown_ca
    ...
    [2019-07-10T11:58:17,030][WARN ][o.e.h.n.Netty4HttpServerTransport] [quickstart-es-2f5spqgbvh] caught exception while handling client http traffic, closing connection [id: 0xab9df0e2, L:0.0.0.0/0.0.0.0:9200 ! R:/100.96.2.70:50728]
    io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
    ...

    However, when I run curl from inside the cluster, the connection succeeds:

    curl --cacert ca.pem -u elastic:$PW https://quickstart-es.observability.svc.cluster.local:9200
    {
      "name" : "quickstart-es-2f5spqgbvh",
      "cluster_name" : "quickstart",
      "cluster_uuid" : "u3BrsbGNStG4yCz28VDhGA",
      "version" : {
        "number" : "6.8.0",
        "build_flavor" : "default",
        "build_type" : "docker",
        "build_hash" : "65b6179",
        "build_date" : "2019-05-15T20:06:13.172855Z",
        "build_snapshot" : false,
        "lucene_version" : "7.7.0",
        "minimum_wire_compatibility_version" : "5.6.0",
        "minimum_index_compatibility_version" : "5.0.0"
      },
      "tagline" : "You Know, for Search"
    }

    Can you please shed some light on this?
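
One detail in the Secret from step 5 may be worth ruling out. Decoding the two values shows that each ends in a newline, which is typically what piping `echo` without `-n` into `base64` produces. That would not explain the TLS alerts in the ES logs, but it could plausibly cause authentication failures once the handshake succeeds. The Python sketch below flags such values; `values_with_stray_whitespace` is a hypothetical helper written for this example.

```python
import base64
from typing import Dict, List


def values_with_stray_whitespace(data: Dict[str, str]) -> List[str]:
    """Return the keys whose decoded value starts or ends with whitespace."""
    flagged = []
    for key, encoded in data.items():
        decoded = base64.b64decode(encoded).decode("utf-8")
        if decoded != decoded.strip():
            flagged.append(key)
    return flagged


# Values copied from the mysecret manifest in step 5.
MYSECRET = {
    "ES_USERNAME": "ZWxhc3RpYwo=",
    "ES_PASSWORD": "ejJuczdwNWJrYm5wOHF0c3dkbmJ4bjd4Cg==",
}

print(values_with_stray_whitespace(MYSECRET))  # ['ES_USERNAME', 'ES_PASSWORD']
```

Re-encoding the credentials with `echo -n` (or any tool that does not append a newline) would remove the extra character.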

secat commented 5 years ago

@vishnuhd

Can you output the content of the certificate?

For example, the content of tls.crt from the quickstart-es-http-certs-public secret is:

$ openssl x509 -in tls.crt -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            48:f9:41:c4:5d:f4:d9:00:de:18:d3:87:66:bf:3c:6e
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: OU=quickstart, CN=http-dqfq5swzbdlblw6j
        Validity
            Not Before: Jul  8 14:59:56 2019 GMT
            Not After : Jul  7 15:09:56 2020 GMT
        Subject: OU=quickstart, CN=quickstart.observability.es.local
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:d9:9f:b7:e8:ca:43:2e:ac:fa:d7:f0:a8:fe:fd:
                    dc:2a:13:c2:09:33:cb:94:9c:a4:9d:16:4e:56:21:
                    f5:6b:01:46:85:e4:0b:90:3c:2b:6a:14:d6:ab:f5:
                    27:37:b9:35:89:e2:a3:62:d1:8f:60:d9:cf:f2:b4:
                    20:72:50:8d:c5:53:bb:e0:25:3f:32:0e:ab:ea:eb:
                    0c:84:81:c5:5e:54:7c:26:28:ec:6e:1f:a6:13:6a:
                    b4:f6:32:09:f7:dc:f2:25:f0:3e:0a:65:88:06:f6:
                    1b:51:a0:c1:20:d8:52:24:fe:32:d1:83:77:8e:18:
                    f0:9d:7e:f8:44:64:00:56:99:6b:cf:28:95:cd:e2:
                    46:ac:0f:0c:05:2e:8a:0c:3a:9b:c6:f2:44:dc:8a:
                    c8:d8:78:5b:50:73:d7:d8:20:4c:12:09:27:e5:3f:
                    fb:ab:4e:e4:1f:e3:f3:4e:e6:20:80:0d:8d:70:9d:
                    71:4f:a6:03:7f:33:42:0e:27:0f:e4:b1:6c:7c:b5:
                    2b:6e:22:e8:4b:c6:91:bb:2a:a9:76:e2:73:2d:fe:
                    1d:46:dc:51:95:a2:4f:31:2f:82:66:3c:e7:af:32:
                    b4:b1:da:1e:4c:59:38:a3:31:1c:62:40:c8:42:6a:
                    9d:b0:21:d6:63:c9:d4:a6:bd:69:c9:ac:29:66:5b:
                    da:e7
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Subject Alternative Name:
                DNS:quickstart.observability.es.local, DNS:quickstart-es-http.observability.svc, DNS:quickstart-es-http.observability
    Signature Algorithm: sha256WithRSAEncryption
         89:9e:81:6d:24:5c:81:81:83:7d:ea:40:2d:ab:fb:58:bd:ae:
         1e:bb:01:8d:a0:8c:8d:fa:4c:ba:9b:6e:97:59:32:1c:00:0e:
         3a:b4:aa:0f:6d:0d:7f:82:42:87:cd:e8:fb:06:2d:0b:62:5f:
         94:3a:9c:b2:a8:62:0d:2e:f5:01:f0:6a:03:94:98:33:90:43:
         69:c4:f0:8f:0b:52:d6:1e:22:02:df:cd:50:fd:48:ef:e8:ba:
         bf:8c:a8:56:41:58:9b:f4:60:10:85:8d:1a:c7:c6:07:f2:38:
         a0:8b:05:05:50:77:ea:fa:28:99:7f:95:00:eb:da:ba:fe:55:
         2c:ee:07:50:35:45:fa:0d:ac:ae:5a:20:9f:59:cc:05:6f:ee:
         6d:83:92:50:7a:72:ac:04:75:d5:0e:86:a4:8b:70:a5:ef:96:
         25:4e:32:c3:a6:51:fb:44:6b:f3:30:d0:93:b1:97:51:61:17:
         37:ca:2c:8f:19:2b:1f:4b:a1:65:4c:b6:27:12:c4:3c:ff:31:
         37:f3:56:ca:62:bb:e6:54:a5:67:f9:d8:ef:05:2d:d2:d3:4f:
         55:0f:97:68:70:84:0c:7d:92:09:c8:d8:e1:5e:b7:97:ee:d5:
         77:07:c0:93:73:76:36:ba:ec:f7:f6:89:bd:3d:96:a7:9b:ce:
         8d:c9:03:de
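
The Subject Alternative Name list is what the host in server-urls has to match. As a simplified illustration in Python (not the verification code that Jaeger or Go actually runs; `dns_name_matches` and `url_covered_by_sans` are hypothetical helpers, and only exact names and a left-most `*` label are handled), the SAN entries above can be compared with the URLs used earlier in this thread:

```python
from typing import Iterable
from urllib.parse import urlsplit


def dns_name_matches(hostname: str, pattern: str) -> bool:
    """Compare one hostname with one DNS SAN entry, label by label."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        # A wildcard covers exactly one left-most label.
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def url_covered_by_sans(url: str, dns_sans: Iterable[str]) -> bool:
    """Return True if the host part of url matches any DNS SAN entry."""
    hostname = urlsplit(url).hostname or ""
    return any(dns_name_matches(hostname, san) for san in dns_sans)


# DNS entries copied from the certificate output above.
SANS = [
    "quickstart.observability.es.local",
    "quickstart-es-http.observability.svc",
    "quickstart-es-http.observability",
]

print(url_covered_by_sans("https://quickstart-es-http.observability.svc:9200", SANS))
print(url_covered_by_sans("https://quickstart-es-http.observability.svc.cluster.local:9200", SANS))
```

With this SAN list the first URL is covered and the second is not, so the name in server-urls should be one of the listed entries unless extra SANs are added to the Elasticsearch resource.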

vishnuhd commented 5 years ago

@secat The content of ca.pem from quickstart-ca is :

$ openssl x509 -in ca.pem -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            43:cd:40:7a:7d:3a:1d:7e:b5:41:13:d3:2c:c2:b1:a2
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: OU=xlzjchb86bg2zrzg, CN=quickstart
        Validity
            Not Before: Jul 11 08:19:13 2019 GMT
            Not After : Jul 10 08:20:13 2020 GMT
        Subject: OU=xlzjchb86bg2zrzg, CN=quickstart
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:c9:e5:b1:70:ab:ba:fe:15:73:79:70:f2:eb:34:
                    4c:ef:de:53:f3:4d:cd:a9:11:f4:0b:2c:c1:5e:68:
                    17:c1:43:68:6d:ee:89:08:b1:ef:a7:a9:27:26:69:
                    85:50:35:82:21:b6:f7:4d:1b:b6:3b:00:a3:81:d8:
                    df:9a:44:68:f7:29:74:d6:92:a9:0b:34:25:dc:b5:
                    7b:f8:91:3c:51:8a:4e:46:cf:a2:f8:fb:c6:f9:77:
                    22:1b:5d:37:10:eb:60:b5:34:af:08:23:bd:16:c2:
                    e4:1b:42:9c:3d:93:41:a1:14:28:fb:ed:e1:9f:a4:
                    50:08:4b:ec:bb:d5:88:5c:61:93:4e:2b:ae:f5:30:
                    ce:1c:dc:37:4d:5f:c9:3d:90:ed:6c:42:5f:55:54:
                    19:b3:4d:aa:c8:bc:7c:09:c7:07:15:2e:c1:dd:e3:
                    28:3b:0b:55:cb:7f:79:1c:30:29:95:c8:d9:65:a7:
                    b4:6b:3c:f5:99:ec:1a:23:e8:ef:78:43:11:2e:bc:
                    47:52:69:0f:d5:c4:17:0b:eb:b8:1c:7b:0e:03:b9:
                    78:60:a5:0a:40:d2:4f:ac:c8:73:f1:4e:26:b1:53:
                    9c:37:60:32:6a:7b:79:93:d8:8d:6e:99:e4:40:c4:
                    e9:18:04:f6:06:51:4d:bc:f7:05:fa:e7:c1:23:68:
                    ff:75
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Certificate Sign
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:TRUE
    Signature Algorithm: sha256WithRSAEncryption
         ab:a0:af:9a:99:4b:3c:6a:aa:78:26:3f:e1:29:2e:7b:6a:14:
         6a:ce:dc:62:41:70:d9:a6:8f:18:ca:9e:72:6e:8a:bb:73:9d:
         d9:b0:ff:ff:c4:6d:65:4e:05:bf:e2:5d:65:92:33:94:01:ca:
         01:16:4b:91:a8:2b:85:d9:7a:b3:8b:3c:9c:c7:fe:9c:14:3c:
         f5:62:a9:19:94:3f:12:e7:06:a2:55:f9:fe:18:25:ce:fe:b8:
         e6:32:04:6b:87:d7:a3:4e:dd:3e:d7:23:03:7b:d1:8b:58:ff:
         26:1e:0b:84:cc:dc:c6:ed:89:c6:87:96:d0:9e:fa:ad:5c:d6:
         37:39:1f:74:fb:18:ef:f5:1e:fc:21:e8:ba:ed:ee:ff:76:a7:
         e4:57:87:50:40:3f:32:ec:9c:fc:cd:68:d9:b6:b1:f6:e5:18:
         de:1c:04:d2:cc:40:20:f2:94:c9:a8:02:af:b8:df:2d:86:39:
         1f:01:55:b9:fa:87:0d:8e:ee:cc:a8:db:a7:09:d5:b2:f8:c3:
         8e:b0:da:ba:89:dc:56:3d:43:d1:e8:82:0d:98:80:7a:49:64:
         e5:73:10:09:08:cf:28:24:f4:cd:80:9f:0a:b0:ae:c3:9a:e1:
         3e:0c:0d:a6:78:1f:e0:33:0a:4d:94:c4:cb:30:ed:5f:fc:07:
         81:d0:6e:f1
-----BEGIN CERTIFICATE-----
MIIDLDCCAhSgAwIBAgIQQ81Aen06HX61QRPTLMKxojANBgkqhkiG9w0BAQsFADAw
MRkwFwYDVQQLExB4bHpqY2hiODZiZzJ6cnpnMRMwEQYDVQQDEwpxdWlja3N0YXJ0
MB4XDTE5MDcxMTA4MTkxM1oXDTIwMDcxMDA4MjAxM1owMDEZMBcGA1UECxMQeGx6
amNoYjg2YmcyenJ6ZzETMBEGA1UEAxMKcXVpY2tzdGFydDCCASIwDQYJKoZIhvcN
AQEBBQADggEPADCCAQoCggEBAMnlsXCruv4Vc3lw8us0TO/eU/NNzakR9AsswV5o
F8FDaG3uiQix76epJyZphVA1giG2900btjsAo4HY35pEaPcpdNaSqQs0Jdy1e/iR
PFGKTkbPovj7xvl3IhtdNxDrYLU0rwgjvRbC5BtCnD2TQaEUKPvt4Z+kUAhL7LvV
iFxhk04rrvUwzhzcN01fyT2Q7WxCX1VUGbNNqsi8fAnHBxUuwd3jKDsLVct/eRww
KZXI2WWntGs89ZnsGiPo73hDES68R1JpD9XEFwvruBx7DgO5eGClCkDST6zIc/FO
JrFTnDdgMmp7eZPYjW6Z5EDE6RgE9gZRTbz3BfrnwSNo/3UCAwEAAaNCMEAwDgYD
VR0PAQH/BAQDAgKEMB0GA1UdJQQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjAPBgNV
HRMBAf8EBTADAQH/MA0GCSqGSIb3DQEBCwUAA4IBAQCroK+amUs8aqp4Jj/hKS57
ahRqztxiQXDZpo8Yyp5yboq7c53ZsP//xG1lTgW/4l1lkjOUAcoBFkuRqCuF2Xqz
izycx/6cFDz1YqkZlD8S5waiVfn+GCXO/rjmMgRrh9ejTt0+1yMDe9GLWP8mHguE
zNzG7YnGh5bQnvqtXNY3OR90+xjv9R78Iei67e7/dqfkV4dQQD8y7Jz8zWjZtrH2
5RjeHATSzEAg8pTJqAKvuN8thjkfAVW5+ocNju7MqNunCdWy+MOOsNq6idxWPUPR
6IINmIB6SWTlcxAJCM8oJPTNgJ8KsK7DmuE+DA2meB/gMwpNlMTLMO1f/AeB0G7x
-----END CERTIFICATE-----

secat commented 5 years ago

There is no X509v3 Subject Alternative Name section in the certificate...

vishnuhd commented 5 years ago

How do I include it? The certificate is generated by the ES operator.

secat commented 5 years ago

You should try to add SAN in the elasticsearch custom resource (see https://www.elastic.co/guide/en/cloud-on-k8s/master/k8s-accessing-elastic-services.html#k8s-static-ip-custom-domain)

omerfsen commented 4 years ago

I had the same issue with 1.18 (right now the Helm chart is at 1.18, while the other installation types are at 1.19). I installed the 1.18 Helm chart and then manually changed the image of the jaeger-ingester and jaeger-query deployments to 1.19.0, and it magically worked ;) By the way, I only set the CA, like this:

....
        tls:
          enabled: yes
          ca: /es/certificates/ca.crt
...

herbguo commented 3 years ago

Hello, I have the same issue with the jaeger-collector component. Were you able to solve the problem?

{"level":"fatal","ts":1603192251.3967118,"caller":"command-line-arguments/main.go:70","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available","stacktrace":"main.main.func1\n\tcommand-line-arguments/main.go:70\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v0.0.3/command.go:762\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v0.0.3/command.go:852\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v0.0.3/command.go:800\nmain.main\n\tcommand-line-arguments/main.go:129\nruntime.main\n\truntime/proc.go:204"}

priyavj08 commented 3 years ago

@secat @vishnuhd I have deployed Elasticsearch using the Red Hat operator on OpenShift Container Platform 4.6. In my deployment, which is in the openshift-logging namespace, these are the secrets I see:


builder-token-kp24f                        kubernetes.io/service-account-token   4      6d20h
builder-token-m45rb                        kubernetes.io/service-account-token   4      6d20h
cluster-logging-operator-dockercfg-fhz87   kubernetes.io/dockercfg               1      6d20h
cluster-logging-operator-token-jpxwr       kubernetes.io/service-account-token   4      6d20h
cluster-logging-operator-token-w847d       kubernetes.io/service-account-token   4      6d20h
curator                                    Opaque                                6      5d17h
curator-dockercfg-cq8l6                    kubernetes.io/dockercfg               1      5d17h
curator-token-74pjx                        kubernetes.io/service-account-token   4      5d17h
curator-token-t76bd                        kubernetes.io/service-account-token   4      5d17h
default-dockercfg-wrgcb                    kubernetes.io/dockercfg               1      6d20h
default-token-nr599                        kubernetes.io/service-account-token   4      6d20h
default-token-ptmdj                        kubernetes.io/service-account-token   4      6d20h
deployer-dockercfg-jb7v8                   kubernetes.io/dockercfg               1      6d20h
deployer-token-t8qhb                       kubernetes.io/service-account-token   4      6d20h
deployer-token-tk8j7                       kubernetes.io/service-account-token   4      6d20h
elasticsearch                              Opaque                                7      5d17h
elasticsearch-dockercfg-drsg5              kubernetes.io/dockercfg               1      5d17h
elasticsearch-metrics                      kubernetes.io/tls                     2      5d17h
elasticsearch-token-4tb4f                  kubernetes.io/service-account-token   4      5d17h
elasticsearch-token-7jdln                  kubernetes.io/service-account-token   4      5d17h
fluentd                                    Opaque                                3      5d17h
fluentd-metrics                            kubernetes.io/tls                     2      5d17h
istio.builder                              istio.io/key-and-cert                 3      6d20h
istio.cluster-logging-operator             istio.io/key-and-cert                 3      6d20h
istio.curator                              istio.io/key-and-cert                 3      5d17h
istio.default                              istio.io/key-and-cert                 3      6d20h
istio.deployer                             istio.io/key-and-cert                 3      6d20h
istio.elasticsearch                        istio.io/key-and-cert                 3      5d17h
istio.kibana                               istio.io/key-and-cert                 3      5d17h
istio.logcollector                         istio.io/key-and-cert                 3      5d17h
kibana                                     Opaque                                3      5d17h
kibana-dockercfg-bbz5b                     kubernetes.io/dockercfg               1      5d17h
kibana-proxy                               Opaque                                3      5d17h
kibana-token-kqtx5                         kubernetes.io/service-account-token   4      5d17h
kibana-token-r5w4l                         kubernetes.io/service-account-token   4      5d17h
logcollector-dockercfg-28g9x               kubernetes.io/dockercfg               1      5d17h
logcollector-token-999t6                   kubernetes.io/service-account-token   4      5d17h
logcollector-token-pt27f                   kubernetes.io/service-account-token   4      5d17h
master-certs                               Opaque                                2      5d17h

Here is my Jaeger config:

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: afjaeger
spec:
  strategy: production
  collector:
    maxReplicas: 1
    resources:
      limits:
        cpu: 400m
        memory: 512Mi
  query:
    resources:
      limits:
        cpu: 256m
        memory: 128Mi
  agent:
    strategy: DaemonSet
    resources:
      limits:
        cpu: 256m
        memory: 128Mi
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      resources:
        requests:
          cpu: 500m
          memory: 1Gi
        limits:
          cpu: 1
          memory: 2Gi
    options:
      es:
        server-urls: https://elasticsearch.openshift-logging.svc.cluster.local:9200
        tls.ca: /es/certificates/tls.crt
    secretName: jaeger-secret-ka
  volumeMounts:
    - name: certificates
      mountPath: /es/certificates/
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: test
```

I tried to use the admin-ca from the elasticsearch secret.

It looks like you both got it to work. I would appreciate your help on this.

Thanks,
Priya

jpkrohling commented 3 years ago
jpkrohling commented 3 years ago

@priyavj08, you should probably not use the same ES instance that is used for the cluster logging. As you are using it in OpenShift, the operator is able to provision an ES cluster for you.

jpkrohling commented 3 years ago

P.S.: you should probably send your inquiries to Red Hat support, as we have no SLA for answering questions here in the upstream community.

priyavj08 commented 3 years ago

@jpkrohling thanks for your reply. The ES cluster provisioned by the Jaeger operator works fine for me, but I am required to use the existing Elasticsearch instance.

According to the documentation at https://www.jaegertracing.io/docs/1.15/operator/#elasticsearch-storage, it is enough to pass the server certificate, but in my case it is not clear which cert to use.

As you suggested, I will also file a ticket with Red Hat. If you have any ideas, please let me know.

Thanks

jpkrohling commented 3 years ago

In your CR, you should probably omit the elasticsearch node, as those properties are for the auto-provisioning. About the certs, you'd need to specify the client cert, in addition to the service-ca, as there's mutual TLS auth in place.
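
As a rough sketch of that advice (not a supported or verified setup), the CR could look something like the following. The secret name `es-client-certs` and the file names inside it are hypothetical: they assume you have copied the CA, client certificate, and client key into a secret in the same namespace as the Jaeger instance. It also assumes a Jaeger version that supports the `es.tls.cert` and `es.tls.key` options.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: afjaeger
spec:
  strategy: production
  storage:
    type: elasticsearch
    # No "elasticsearch:" node here: those properties only apply to
    # auto-provisioning by the operator.
    options:
      es:
        server-urls: https://elasticsearch.openshift-logging.svc.cluster.local:9200
        tls:
          enabled: true
          ca: /es/certificates/ca.crt     # CA that signed the ES server certificate
          cert: /es/certificates/tls.crt  # client certificate for mutual TLS
          key: /es/certificates/tls.key   # private key for the client certificate
  volumeMounts:
    - name: certificates
      mountPath: /es/certificates/
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: es-client-certs  # hypothetical secret holding ca.crt, tls.crt, tls.key
```

If the keys in your secret are named differently, the `items` list of the secret volume can map each key to the file name expected above.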

@objectiser can correct me if I'm wrong, but keep in mind that using the existing logging ES instance is unsupported and should be avoided. The reason is that you don't want your workload-specific data (traces) to influence the logging mechanism for your platform (OpenShift cluster).

n1vgabay commented 2 years ago

Hey guys, I have read all the comments here but still can't figure out how to get the collector and query to use TLS when communicating with Elasticsearch, which I use as my DB.

For your feedback: is there anything I'm doing wrong? Is the URL of Elasticsearch inside the cluster OK? I stored a .p12 file in a secret; should I use that for the Jaeger deployments instead of a PEM file?

My Jaeger configuration looks like this:

Any help will be appreciated; I would love to understand, for future cases, how I should debug services like this.

The pictures show all the values used to configure the Jaeger Helm chart.

The error is exactly like the one people mentioned earlier in this thread:

{"level":"info","ts":1640082555.3463783,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1640082555.3464534,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1640082555.3470142,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1640082555.347095,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":14269"}
{"level":"info","ts":1640082555.3471355,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
{"level":"fatal","ts":1640082560.4366167,"caller":"command-line-arguments/main.go:75","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: Head \"https://elasticsearch-master:9200\": x509: certificate signed by unknown authority: no Elasticsearch node available","stacktrace":"main.main.func1\n\tcommand-line-arguments/main.go:75\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v0.0.7/command.go:838\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v0.0.7/command.go:943\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v0.0.7/command.go:883\nmain.main\n\tcommand-line-arguments/main.go:137\nruntime.main\n\truntime/proc.go:204"}

Any help will be appreciated.

Niv

nvivo commented 11 months ago

It's 2023 and I've been struggling with this error for the entire day, so I'm documenting here what I did. This answer is up there somewhere, but I didn't see it at first, so here we go:

After many attempts to set this up, it turned out the issue was not TLS but Elasticsearch authentication, which I had configured incorrectly. More precisely, I was creating an API key instead of using the password generated by ECK.

After creating the Elasticsearch 7 cluster with ECK, there is a secret named something like clustername-es-elastic-user. Copy the password from it into a new secret with the keys ES_USERNAME and ES_PASSWORD:

kubectl create secret generic jaeger-es-credentials --from-literal=ES_PASSWORD=XXXXX --from-literal=ES_USERNAME=elastic
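
If you prefer to keep this in a manifest, the same secret can be sketched declaratively as below. This is just my equivalent of the command above: `XXXXX` is still a placeholder for the real password, and the secret is assumed to live in the same namespace as the Jaeger instance.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: jaeger-es-credentials
type: Opaque
# stringData takes plain-text values; Kubernetes base64-encodes them on write.
stringData:
  ES_USERNAME: elastic
  ES_PASSWORD: XXXXX  # placeholder: the password from clustername-es-elastic-user
```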

Then, just point to the secret and it should work:

    secretName: jaeger-es-credentials
    options:
      es:
        server-urls: https://elastic7-es-http:9200
        tls:
          ca: /es/certificates/ca.crt

khsriharikota commented 2 months ago

This is an issue with the TLS certificate setup. Follow these steps to resolve it.

Generate a new secret for Jaeger in the namespace by using the existing elastic user secret.

PASSWORD=$(kubectl get secret <secret-name> -n elastic-system -o=jsonpath='{.data.elastic}' | base64 --decode)
kubectl create secret generic jaeger-es-secret -n observability --from-literal=ES_PASSWORD="${PASSWORD}" --from-literal=ES_USERNAME=elastic

Check that the new secret was created:

kubectl get secrets -n observability

Copy the existing TLS CA (Elastic public certificates) for use with Jaeger:

kubectl get secret trace-es-http-certs-public -n elastic-system -o yaml > trace-es-http-certs-public.yaml

Open the trace-es-http-certs-public.yaml file and remove the namespace, uid, resourceVersion, and any other unnecessary fields.
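
For reference, a trimmed file might look roughly like this. It is a sketch, not actual output: the base64 values are placeholders, and the key names assume the usual layout of the ECK public certificates secret (`ca.crt` and `tls.crt`), which you should confirm against your own export.

```yaml
apiVersion: v1
kind: Secret
metadata:
  # namespace, uid, resourceVersion, creationTimestamp, ownerReferences,
  # and similar server-set fields have been removed.
  name: trace-es-http-certs-public
type: Opaque
data:
  ca.crt: <base64-encoded CA certificate>       # placeholder
  tls.crt: <base64-encoded server certificate>  # placeholder
```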

Then apply it in the Jaeger namespace:

kubectl apply -f trace-es-http-certs-public.yaml -n observability

Finally, reference both secrets in the Jaeger custom resource:

spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: https://trace-es-http.elastic-system.svc:9200
        tls:
          enabled: true
          ca: /es/certificates/ca.crt
    secretName: jaeger-es-secret
  volumeMounts:
    - name: certificates
      mountPath: /es/certificates/
      readOnly: true
  volumes:
    - name: certificates
      secret:
        secretName: trace-es-http-certs-public