opensearch-project / opensearch-k8s-operator

OpenSearch Kubernetes Operator
Apache License 2.0

Feature: Support cross cluster search #428

Open rhysxevans opened 1 year ago

rhysxevans commented 1 year ago

Hi

I am not entirely sure if I am doing something wrong, but here we go.

Can we get cross cluster search enabled/supported via the operator?

At present, I think the main issue I am having is around the certificates and the per-cluster CA, but I could be wrong. My understanding is that if the nodes trusted multiple CAs, or the CA was optionally "global" to the operator, that would solve my issue, although the configuration may also need some work.

My use case is that each team has individual clusters, optionally with dashboarding, and there is also an overarching governance function that would have access to the data in all the clusters. Obviously the other option is a single dashboard service, using RBAC to secure access to the relevant clusters.

Any thoughts as to whether this would be a possibility?

I will say I am not a coder, but am willing to help test etc where I can

Thanks

swoehrl-mw commented 1 year ago

Hi @rhysxevans. I have never dealt with the cross-cluster search feature so I don't really know how it works internally. So I'm not sure what would be needed to support it in the operator. Your guess about the certificates and CA sounds good though. The only way to currently support this would probably be to provide your own certificates with your own CA.

Supporting cross-cluster search is something we can do in the operator, but IMO it's not a priority, so this will have to wait for a contributor who knows what needs to be implemented.

rhysxevans commented 1 year ago

Hi @swoehrl-mw

Thanks for the response.

I have managed to get this working, obviously not through the operator, but listing process below, for info

1) Create a single CA (not strictly necessary, but easier, as it provides a trust path across all clusters)
2) For each cluster, create the relevant certificates as per your docs (note the admin private key needs to be in PKCS#8 format). In my testing I used a single cert per cluster for transport; calling that out here in case per-node certs don't work for some reason
3) Ensure the nodes in the clusters have the "remote_cluster_client" OpenSearch role
4) The cross-cluster search cluster must have each cluster's certificate DN in its nodesDn setting, along with its own
5) Each cluster to be searched should have its own cert DN and the cross-cluster search cluster's cert DN in its nodesDn
6) On the clusters to be searched (I had 2), index a test document:

kubectl port-forward pod/first-cluster-masters-0 9200:9200 -n opensearch-first-cluster

curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://localhost:9200/books/_doc/1' -d '{"Dracula the First": "Bram Stoker"}'

kubectl port-forward pod/second-cluster-masters-0 9200:9200 -n opensearch-second-cluster

curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' 'https://localhost:9200/books/_doc/1' -d '{"Dracula the Second": "Bram Stoker"}'

7) On the cross-cluster search cluster

PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "first-cluster": {
          "mode": "sniff",
          "seeds": ["first-cluster.opensearch-first-cluster.svc.cluster.local:9300"],
          "skip_unavailable": true
        },
        "second-cluster": {
          "mode": "sniff",
          "seeds": ["second-cluster.opensearch-second-cluster.svc.cluster.local:9300"],
          "skip_unavailable": true
        }
      }
    }
  }
}

GET _remote/info # should show that the remote clusters are connected

GET _cluster/settings

GET first-cluster:books/_search?pretty

GET second-cluster:books/_search?pretty
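
Step 7 can also be driven with curl through a port-forward, the same way as step 6. A minimal sketch, assuming the demo `admin:admin` credentials used above; the function names are made up for illustration, and the pod and namespace names in the usage comments are hypothetical ones patterned on the names in this thread:

```shell
# Print the JSON body that registers one remote cluster in sniff mode.
# $1 = remote cluster alias (also its service name), $2 = its Kubernetes namespace.
remote_cluster_body() {
  printf '{"persistent":{"cluster":{"remote":{"%s":{"mode":"sniff","seeds":["%s.%s.svc.cluster.local:9300"],"skip_unavailable":true}}}}}\n' "$1" "$1" "$2"
}

# Send that body to the cluster reachable on localhost:9200 (via kubectl port-forward).
# -k skips certificate verification, as in the curl commands above.
register_remote_cluster() {
  remote_cluster_body "$1" "$2" |
    curl -XPUT -k -H 'Content-Type: application/json' -u 'admin:admin' \
      'https://localhost:9200/_cluster/settings' -d @-
}

# Usage (hypothetical pod and namespace names for the search cluster):
#   kubectl port-forward pod/cross-cluster-search-masters-0 9200:9200 -n opensearch-cross-cluster-search
#   register_remote_cluster first-cluster opensearch-first-cluster
#   register_remote_cluster second-cluster opensearch-second-cluster
#   curl -k -u 'admin:admin' 'https://localhost:9200/first-cluster:books/_search?pretty'
```

Each call sends only its own remote's settings, so registering the second cluster should leave the first one in place.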

partymaker-py commented 1 year ago

Hi @rhysxevans

Can you please show how to correctly mount already-generated certs into the container and make several OpenSearch clusters trust them? It would help me and other people a lot.

Thanks

rhysxevans commented 12 months ago

Hi

The documentation can be found here https://github.com/Opster/opensearch-k8s-operator/blob/main/docs/userguide/main.md#tls amongst others.

Note that all the certs for the various clusters must either be signed by the same CA, which must itself be trusted, or, if you use multiple CAs, each CA must be trusted across all the required pods.

I provision my infra using terraform so some code below (and note it is by no means perfect)

CA cert creation - Operator install

main.tf

module "ca" {
  source   = "../certificates/ca"
  private_key_algorithm  = "RSA"
  private_key_rsa_bits   = 2048
  validity_period_hours  = 43800
  ca_common_name         = "opensearch-operator-ca.${var.cluster_name}"
  organization_name      = "opensearch-${var.cluster_name}"
  #ca_public_key_path     = "opensearch-operator-ca.${var.cluster_name}.crt"
  depends_on = [
    helm_release.opensearch_operator
  ]
}

output "ca_key_algorithm" {
  value = module.ca.ca_key_algorithm
}

output "ca_private_key_pem" {
  value = module.ca.ca_private_key_pem
}

output "ca_cert_pem" {
  value = module.ca.ca_cert_pem
}

certificates\ca\vars.tf

variable "private_key_algorithm" {
  description = "The name of the algorithm to use for private keys. Must be one of: RSA or ECDSA."
  default     = "RSA"
}

variable "private_key_rsa_bits" {
  description = "The size of the generated RSA key in bits. Should only be used if var.private_key_algorithm is RSA."
  default     = 2048
}

variable "private_key_ecdsa_curve" {
  description = "The name of the elliptic curve to use. Should only be used if var.private_key_algorithm is ECDSA. Must be one of P224, P256, P384 or P521."
  default     = "P256"
}

variable "validity_period_hours" {
  description = "The number of hours after initial issuing that the certificate will become invalid."
  default     = 8760   # 8760 = 1 year, 43800 = 5 years, 87600 = 10 years
}

variable "ca_common_name" {
  description = "The common name to use in the subject of the CA certificate (e.g. acme.co cert)."
  default     = "example.com"
}

variable "organization_name" {
  description = "The name of the organization to associate with the certificates (e.g. Acme Co)."
  default     = "Example Organization"
}

variable "ca_allowed_uses" {
  description = "List of keywords from RFC5280 describing a use that is permitted for the CA certificate. For more info and the list of keywords, see https://www.terraform.io/docs/providers/tls/r/self_signed_cert.html#allowed_uses."
  type        = list(string)

  default = [
    "cert_signing",
    "key_encipherment",
    "digital_signature",
  ]
}

certificates\ca\main.tf

resource "tls_private_key" "ca" {
  algorithm   = var.private_key_algorithm
  rsa_bits    = var.private_key_rsa_bits
  ecdsa_curve = var.private_key_ecdsa_curve
}

resource "tls_self_signed_cert" "ca" {
  private_key_pem       = tls_private_key.ca.private_key_pem
  is_ca_certificate     = true
  validity_period_hours = var.validity_period_hours
  allowed_uses          = var.ca_allowed_uses

  subject {
    common_name  = var.ca_common_name
    organization = var.organization_name
  }
}

certificates\ca\outputs.tf

output "ca_key_algorithm" {
  value = tls_private_key.ca.algorithm
}

output "ca_private_key_pem" {
  value = tls_private_key.ca.private_key_pem
}

output "ca_cert_pem" {
  value = tls_self_signed_cert.ca.cert_pem
}

Cert creation - Cluster creation

main.tf

module "node_transport" {
  for_each = var.opensearch_config
  source   = "../certificates/leaf"
  private_key_algorithm  = "RSA"
  private_key_rsa_bits   = 2048
  validity_period_hours  = 8760
  common_name            = "opensearch-${each.key}-transport"
  organization_name      = "opensearch-${var.cluster_name}-${each.key}"
  dns_names              = ["${each.key}",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-discovery",
                            "${each.key}-bootstrap-0",
                            "${each.key}-bootstrap-0.${each.key}",
                            "${each.key}-bootstrap-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-bootstrap-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-bootstrap-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-0",
                            "${each.key}-masters-0.${each.key}",
                            "${each.key}-masters-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-0.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-1",
                            "${each.key}-masters-1.${each.key}",
                            "${each.key}-masters-1.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-1.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-1.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-2",
                            "${each.key}-masters-2.${each.key}",
                            "${each.key}-masters-2.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-2.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-2.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-3",
                            "${each.key}-masters-3.${each.key}",
                            "${each.key}-masters-3.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-3.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-3.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-4",
                            "${each.key}-masters-4.${each.key}",
                            "${each.key}-masters-4.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-4.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-4.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-masters-5",
                            "${each.key}-masters-5.${each.key}",
                            "${each.key}-masters-5.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-masters-5.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-masters-5.${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local"
                           ]
  ca_key_algorithm       = var.ca_key_algorithm
  ca_private_key_pem     = var.ca_private_key_pem
  ca_cert_pem            = var.ca_cert_pem
}

resource "kubernetes_secret_v1" "node_transport" {
  for_each = var.opensearch_config
  metadata {
    name      = "${each.key}-opensearch-node-transport-cert"
    namespace = kubernetes_namespace_v1.this[each.key].metadata[0].name
  }

  data = {
    "ca.crt"   = var.ca_cert_pem
    "tls.key"  = module.node_transport[each.key].private_key_pem
    "tls.crt"  = module.node_transport[each.key].cert_pem
  }

  type = "kubernetes.io/tls"
}

module "node_http" {
  for_each = var.opensearch_config
  source   = "../certificates/leaf"
  private_key_algorithm  = "RSA"
  private_key_rsa_bits   = 2048
  validity_period_hours  = 8760
  common_name            = "opensearch-${each.key}-http"
  organization_name      = "opensearch-${var.cluster_name}-${each.key}"
  dns_names              = ["${each.key}",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                            "${each.key}-discovery",
                           ]
  ca_key_algorithm       = var.ca_key_algorithm
  ca_private_key_pem     = var.ca_private_key_pem
  ca_cert_pem            = var.ca_cert_pem
}

resource "kubernetes_secret_v1" "node_http" {
  for_each = var.opensearch_config
  metadata {
    name      = "${each.key}-opensearch-node-http-cert"
    namespace = kubernetes_namespace_v1.this[each.key].metadata[0].name
  }

  data = {
    "ca.crt"   = var.ca_cert_pem
    "tls.key"  = module.node_http[each.key].private_key_pem
    "tls.crt" = module.node_http[each.key].cert_pem
  }

  type = "kubernetes.io/tls"
}

module "node_admin" {
  for_each = var.opensearch_config
  source   = "../certificates/leaf"
  private_key_algorithm  = "RSA"
  private_key_rsa_bits   = 2048
  validity_period_hours  = 8760
  common_name            = "opensearch-${each.key}-admin"
  organization_name      = "opensearch-${var.cluster_name}-${each.key}"
  dns_names              = []
  ca_key_algorithm       = var.ca_key_algorithm
  ca_private_key_pem     = var.ca_private_key_pem
  ca_cert_pem            = var.ca_cert_pem
}

resource "kubernetes_secret_v1" "node_admin" {
  for_each = var.opensearch_config
  metadata {
    name      = "${each.key}-opensearch-admin-cert"
    namespace = kubernetes_namespace_v1.this[each.key].metadata[0].name
  }

  data = {
    "ca.crt"   = var.ca_cert_pem
    "tls.key"  = module.node_admin[each.key].private_key_pem_pkcs8
    "tls.crt"  = module.node_admin[each.key].cert_pem
  }

  type = "kubernetes.io/tls"
}

module "dashboards" {
  for_each = var.opensearch_config
  source   = "../certificates/leaf"
  private_key_algorithm  = "RSA"
  private_key_rsa_bits   = 2048
  validity_period_hours  = 8760
  common_name            = "opensearch-${each.key}-dashboards"
  organization_name      = "opensearch-${var.cluster_name}-${each.key}"
  dns_names              = ["${each.key}-dashboards",
                            "${each.key}-dashboards.${kubernetes_namespace_v1.this[each.key].metadata[0].name}",
                            "${each.key}-dashboards.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc",
                            "${each.key}-dashboards.${kubernetes_namespace_v1.this[each.key].metadata[0].name}.svc.cluster.local",
                           ]
  ca_key_algorithm       = var.ca_key_algorithm
  ca_private_key_pem     = var.ca_private_key_pem
  ca_cert_pem            = var.ca_cert_pem
}

resource "kubernetes_secret_v1" "dashboards" {
  for_each = var.opensearch_config
  metadata {
    name      = "${each.key}-opensearch-dashboards-cert"
    namespace = kubernetes_namespace_v1.this[each.key].metadata[0].name
  }

  data = {
    "ca.crt"  = var.ca_cert_pem
    "tls.key" = module.dashboards[each.key].private_key_pem
    "tls.crt" = module.dashboards[each.key].cert_pem
  }

  type = "kubernetes.io/tls"
}

certificates\leaf\vars.tf

variable "private_key_algorithm" {
  description = "The name of the algorithm to use for private keys. Must be one of: RSA or ECDSA."
  default     = "RSA"
}

variable "private_key_rsa_bits" {
  description = "The size of the generated RSA key in bits. Should only be used if var.private_key_algorithm is RSA."
  default     = 2048
}

variable "private_key_ecdsa_curve" {
  description = "The name of the elliptic curve to use. Should only be used if var.private_key_algorithm is ECDSA. Must be one of P224, P256, P384 or P521."
  default     = "P224"
}

variable "validity_period_hours" {
  description = "The number of hours after initial issuing that the certificate will become invalid."
  default     = 8760
}

variable "organization_name" {
  description = "The name of the organization to associate with the certificates (e.g. Acme Co)."
  default     = "Example Organization"
}

variable "common_name" {
  description = "The common name to use in the subject of the certificate (e.g. acme.co cert)."
}

variable "dns_names" {
  description = "List of DNS names for which the certificate will be valid (e.g. foo.example.com)."
  type        = list(string)
}

variable "ca_key_algorithm" {
  description = "The name of Algorithm used for CA key"
}

variable "ca_private_key_pem" {
  description = "Private key pem of CA"
}

variable "allowed_uses" {
  description = "List of keywords from RFC5280 describing a use that is permitted for the issued certificate. For more info and the list of keywords, see https://www.terraform.io/docs/providers/tls/r/self_signed_cert.html#allowed_uses."
  type        = list(string)

  default = [
    "key_encipherment",
    "digital_signature",
    "server_auth",
    "client_auth",
  ]
}

variable "ca_cert_pem" {
  description = "Cert PEM of CA"
}

certificates\leaf\outputs.tf

output "private_key_pem" {
  value = tls_private_key.cert.private_key_pem
}

output "private_key_pem_pkcs8" {
  value = tls_private_key.cert.private_key_pem_pkcs8
}

output "cert_pem" {
  value = tls_locally_signed_cert.cert.cert_pem
}

certificates\leaf\main.tf

resource "tls_private_key" "cert" {
  algorithm   = var.private_key_algorithm
  ecdsa_curve = var.private_key_ecdsa_curve
  rsa_bits    = var.private_key_rsa_bits
}

resource "tls_cert_request" "cert" {
  private_key_pem = tls_private_key.cert.private_key_pem

  dns_names = var.dns_names

  subject {
    common_name  = var.common_name
    organization = var.organization_name
  }
}

resource "tls_locally_signed_cert" "cert" {
  cert_request_pem = tls_cert_request.cert.cert_request_pem

  ca_private_key_pem = var.ca_private_key_pem
  ca_cert_pem        = var.ca_cert_pem

  validity_period_hours = var.validity_period_hours
  allowed_uses          = var.allowed_uses
}

Within my nodepool definition I have

      "security" = {
        "config" = {
          "adminCredentialsSecret" = {
            "name" = "${each.key}-opensearch-admin-password"
          }
          "adminSecret" = {
            "name" = "${each.key}-opensearch-admin-cert"
          }
          "securityConfigSecret" = {
            "name" = "${each.key}-securityconfig"
          }
        }
        "tls" = {
          "http" = {
            "generate" = false
            "secret" = {
              "name" = "${each.key}-opensearch-node-http-cert"
            }
          }
          "transport" = {
            "generate" = false
            "perNode" = false
            "secret" = {
              "name" = "${each.key}-opensearch-node-transport-cert"
            }
            "nodesDn" = (
              length(var.remote_clusters) > 0
              ? concat(
                  [for s in var.remote_clusters : format("CN=opensearch-%s-transport,O=opensearch-${var.cluster_name}-%s", s, s)],
                  ["CN=opensearch-${each.key}-transport,O=opensearch-${var.cluster_name}-${each.key}"]
                )
              : [
                  "CN=opensearch-${each.key}-transport,O=opensearch-${var.cluster_name}-${each.key}",
                  "CN=opensearch-cross-cluster-search-transport,O=opensearch-${var.cluster_name}-cross-cluster-search"
                ]
            )
            "adminDn" = ["CN=opensearch-${each.key}-admin,O=opensearch-${var.cluster_name}-${each.key}"]
          }
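
The `nodesDn` expression above packs two cases into one value: a cluster with remote clusters configured (the search cluster) trusts every remote's transport DN plus its own, while any other cluster trusts its own DN plus the search cluster's. A sketch of the same logic as a small shell helper, which may be handy for checking the expected DN strings against the actual certificates. The function names and the `prod` environment name are invented for illustration:

```shell
# Print the transport-certificate DN pattern used in this thread.
# $1 = environment name (var.cluster_name above), $2 = OpenSearch cluster key.
transport_dn() {
  printf 'CN=opensearch-%s-transport,O=opensearch-%s-%s\n' "$2" "$1" "$2"
}

# Print the nodesDn entries for one cluster, one DN per line.
# $1 = environment name, $2 = this cluster's key, remaining args = remote cluster keys.
nodes_dn() {
  env_name="$1"
  self="$2"
  shift 2
  if [ "$#" -gt 0 ]; then
    # Search cluster: every remote cluster's DN, then its own.
    for remote in "$@"; do
      transport_dn "$env_name" "$remote"
    done
    transport_dn "$env_name" "$self"
  else
    # Searched cluster: its own DN, then the search cluster's DN.
    transport_dn "$env_name" "$self"
    transport_dn "$env_name" cross-cluster-search
  fi
}

# Usage:
#   nodes_dn prod cross-cluster-search first-cluster second-cluster
#   nodes_dn prod first-cluster
```

These strings have to match the subject of the transport certificates; `openssl x509 -noout -subject -nameopt RFC2253 -in tls.crt` prints a certificate's subject in this comma-separated form for comparison.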

Within my dashboards config I have

        "tls" = {
          "enable" = true
          "generate" = false
          "secret" = {
            "name" = "${each.key}-opensearch-dashboards-cert"
          }
        }

partymaker-py commented 12 months ago

@rhysxevans Thank you a lot!

daviian commented 4 months ago

I think this feature would make it so much easier to handle cross-cluster replication.

As far as I understand, it would be sufficient if the operator allowed adding the CA from another cluster to OpenSearch's trusted CAs, e.g. by concatenating the CA certificates into a single PEM file to use in plugins.security.ssl.transport.pemtrustedcas_filepath

Additionally, it would need to be possible to configure nodesDn even when cert generation is enabled, which cannot be done as of now. I would imagine a combination of the generated nodesDn and an additional list that can be specified in the OpenSearchCluster resource definition.

With these two features it should be possible to use the autogeneration of certificates per cluster and also configure other clusters to trust the generated CA, including the node certificates, by setting nodesDn accordingly.
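
The concatenation step itself could look roughly like this in shell. This is only a sketch: the function name and file names are hypothetical, and getting the resulting bundle into the pods is exactly the part the operator does not offer today.

```shell
# Concatenate several PEM CA certificates into one trust bundle.
# $1 = output file, remaining args = input CA certificate files.
build_ca_bundle() {
  out="$1"
  shift
  : > "$out"
  for ca in "$@"; do
    # Refuse inputs that do not look like PEM certificates.
    if ! grep -q -e '-----BEGIN CERTIFICATE-----' "$ca"; then
      echo "not a PEM certificate: $ca" >&2
      return 1
    fi
    cat "$ca" >> "$out"
    # Add a newline if the file lacks a trailing one, so PEM blocks do not run together.
    [ -z "$(tail -c 1 "$ca")" ] || printf '\n' >> "$out"
  done
}

# Usage (secret and namespace names are placeholders):
#   kubectl get secret <secret> -n <namespace> -o jsonpath='{.data.ca\.crt}' | base64 -d > first-ca.crt
#   build_ca_bundle trusted-cas.pem first-ca.crt second-ca.crt
```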