elastic / cloud-on-k8s

Elastic Cloud on Kubernetes

Traffic splitting, elasticsearchRef.serviceName, and certificate validation #6165

Open barkbay opened 1 year ago

barkbay commented 1 year ago

The traffic splitting documentation does not mention that referencing a custom Service using elasticsearchRef.serviceName does not update the HTTP certificate exposed by the Elasticsearch nodes.

It works for Kibana as documented because the verificationMode is set to certificate by the Kibana controller:

func elasticsearchTLSSettings(esAssocConf commonv1.AssociationConf) map[string]interface{} {
    cfg := map[string]interface{}{
        ElasticsearchSslVerificationMode: "certificate",
    }

But in the case of Beats, and maybe other stack applications (see https://github.com/elastic/cloud-on-k8s/issues/4812 about this inconsistency), a full validation is done, which leads to a validation failure:

{
    "log.level": "error",
    "@timestamp": "2022-11-10T08:43:27.961Z",
    "log.logger": "publisher_pipeline_output",
    "log.origin": {
        "file.name": "pipeline/client_worker.go",
        "file.line": 150
    },
    "message": "Failed to connect to backoff(elasticsearch(https://my-ingest-nodes.default.svc:9200)): Get \"https://my-ingest-nodes.default.svc:9200\": x509: certificate is valid for my-es-http.default.es.local, my-es-http, my-es-http.default.svc, my-es-http.default, my-es-internal-http.default.svc, my-es-internal-http.default, *.my-es-master.default.svc, *.my-es-data.default.svc, *.my-es-ingest-1.default.svc, *.my-es-ingest-2.default.svc, not my-ingest-nodes.default.svc",
    "service.name": "filebeat",
    "ecs.version": "1.6.0"
}
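The failure above can be reproduced outside of Beats with Go's standard `crypto/x509` package. This is a minimal sketch, not the Beats code path: `newSelfSignedCert` is a hypothetical helper, and the SAN list is a shortened copy of the one in the log. It only illustrates that full validation compares the dialed hostname against the SAN entries, so a custom Service name missing from the certificate is rejected.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCert (hypothetical helper) returns a parsed self-signed
// certificate whose SAN extension contains the given DNS names.
func newSelfSignedCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "my-es-http"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Shortened version of the SAN list reported in the Filebeat error.
	cert, err := newSelfSignedCert([]string{
		"my-es-http.default.svc",
		"*.my-es-ingest-1.default.svc",
	})
	if err != nil {
		panic(err)
	}
	// The custom Service name is not in the SAN list: an error is returned.
	fmt.Println(cert.VerifyHostname("my-ingest-nodes.default.svc"))
	// The default HTTP Service name is in the SAN list: no error.
	fmt.Println(cert.VerifyHostname("my-es-http.default.svc"))
}
```

Adding `my-ingest-nodes.default.svc` to the slice passed to `newSelfSignedCert` makes the first check succeed, which mirrors the effect of the SAN workaround.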

I think we should at least document the workaround of adding the expected hostname in the SAN extension:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: my-es
spec:
  version: x.x.x
  nodeSets: {}
  http:
    tls:
      selfSignedCertificate:
        subjectAltNames:
          - dns: my-ingest-nodes.default.svc

As a further improvement, could the Elasticsearch controller automatically add the expected hostname to the SAN extension?
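To make the suggestion concrete, here is a rough sketch of how the extra SAN entries could be derived from a referenced Service. `serviceSANs` is a hypothetical function, not part of the operator, and the two name forms it returns are an assumption modelled on the `my-es-http.default.svc` and `my-es-http.default` entries visible in the log above.

```go
package main

import "fmt"

// serviceSANs (hypothetical) returns the DNS names a controller might add to
// the HTTP certificate for a Service referenced through
// elasticsearchRef.serviceName. The name forms are an assumption based on the
// existing SAN entries shown in the error log.
func serviceSANs(serviceName, namespace string) []string {
	return []string{
		fmt.Sprintf("%s.%s.svc", serviceName, namespace),
		fmt.Sprintf("%s.%s", serviceName, namespace),
	}
}

func main() {
	for _, name := range serviceSANs("my-ingest-nodes", "default") {
		fmt.Println(name)
	}
}
```

The open question for a real implementation is how the Elasticsearch controller would learn which custom Services are referenced by associated resources, which this sketch does not address.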

kunisen commented 1 year ago

Thanks @barkbay for raising the issue! Do you think it makes sense for the ECK operator to emit a warning if a user specifies a headless Service in the SAN section?

barkbay commented 1 year ago

I'm not sure it is easy to detect that a DNS name refers to a headless Service. As mentioned in the k8s documentation, the DNS record for a Service may end with the cluster domain name, which is not something we can detect in the operator (at least not today). We could still try to parse the name and detect whether the first segments match the name of one of the headless Services, but that seems a bit involved for a small improvement. We would also have to decide whether to do the same for Pod DNS records.
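For illustration, the name parsing described above might look roughly like the sketch below. `matchesHeadlessService` is a hypothetical function, and the headless Service names are assumed from the wildcard SAN entries in the log. It sidesteps the unknown cluster domain by looking for a `<service>.<namespace>.svc` sequence anywhere in the name, so it also matches Pod and wildcard records, but it deliberately ignores short forms such as `<service>.<namespace>`.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesHeadlessService (hypothetical) reports whether dnsName contains a
// "<service>.<namespace>.svc" label sequence where <service> is one of the
// given headless Service names. Any trailing cluster domain is ignored, and
// short forms without the "svc" label are not detected.
func matchesHeadlessService(dnsName, namespace string, headless map[string]bool) bool {
	name := strings.TrimSuffix(strings.ToLower(dnsName), ".")
	labels := strings.Split(name, ".")
	for i := 0; i+2 < len(labels); i++ {
		if labels[i+1] == namespace && labels[i+2] == "svc" && headless[labels[i]] {
			return true
		}
	}
	return false
}

func main() {
	// Assumed headless Service names, taken from the SAN list in the log.
	headless := map[string]bool{"my-es-master": true, "my-es-data": true}

	for _, san := range []string{
		"my-es-master.default.svc",
		"*.my-es-data.default.svc.cluster.local",
		"my-ingest-nodes.default.svc",
	} {
		fmt.Printf("%s -> %v\n", san, matchesHeadlessService(san, "default", headless))
	}
}
```

Even this small version needs the list of headless Services and the namespace as inputs, which gives a sense of why the check may be more involved than the benefit justifies.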

I would first update our documentation to explain why we consider using them deprecated.