Open bmaguireibm opened 11 months ago
Hi @bmaguireibm. Following the stack trace in your error log, it looks like the operator cannot decode the PEM data of the CA certificate. Can you please verify that the provided cert is actually valid (for example by checking it with openssl)?
You can also use the following small Go program to read the cert the same way the operator does:
package main

import (
	"crypto/x509"
	"encoding/pem"
	"fmt"
	"os"
)

func main() {
	// Read the CA certificate exported from the secret.
	data, err := os.ReadFile("ca.crt")
	if err != nil {
		fmt.Printf("Could not open file: %s\n", err)
		return
	}

	// Decode the first PEM block; a nil block means no PEM data was found.
	block, _ := pem.Decode(data)
	if block == nil {
		fmt.Printf("Could not decode as PEM data\n")
		return
	}

	// Parse the DER bytes of that block as an X.509 certificate.
	caCert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		fmt.Printf("Could not parse certificate: %s\n", err)
		return
	}
	fmt.Printf("Certificate has subject '%s'\n", caCert.Subject)
}
Just run it with go run main.go (assuming you placed the code in main.go and the cert from the secret in ca.crt).
Regardless of this, even in the case of invalid data the operator should not crash but should provide a proper error message and continue, so this is a bug either way.
I've got the same error as @bmaguireibm. The output from the Go program shows the following:
Certificate has subject 'CN=os,OU=OS,O=OS,L=OS,ST=OS,C=NL,1.2.840.113549.1.9.1=#13026f73'
The certificate looks to be valid.
It does provide a proper message just before it crashes:
kubectl -n opensearch logs -f opensearch-operator-controller-manager-6bd4fcb57f-9znlk operator-controller-manager | grep '^{'|jq 'select(.level=="error")'
{
"level": "error",
"ts": "2024-09-13T09:32:00.027Z",
"msg": "Failed to create admin certificate",
"controller": "opensearchcluster",
"controllerGroup": "opensearch.opster.io",
"controllerKind": "OpenSearchCluster",
"OpenSearchCluster": {
"name": "os-noprd",
"namespace": "opensearch"
},
"namespace": "opensearch",
"name": "os-noprd",
"reconcileID": "ec29fbb0-e2be-410e-a1cd-4ee45cc457cb",
"interface": "transport",
"error": "tls: failed to find any PEM data in key input",
"stacktrace": "github.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*TLSReconciler).createAdminSecret\n\t/workspace/pkg/reconcilers/tls.go:224\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*TLSReconciler).handleAdminCertificate\n\t/workspace/pkg/reconcilers/tls.go:116\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/pkg/reconcilers.(*TLSReconciler).Reconcile\n\t/workspace/pkg/reconcilers/tls.go:77\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).reconcilePhaseRunning\n\t/workspace/controllers/opensearchController.go:328\ngithub.com/Opster/opensearch-k8s-operator/opensearch-operator/controllers.(*OpenSearchClusterReconciler).Reconcile\n\t/workspace/controllers/opensearchController.go:143\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:118\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:314\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:265\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.15.0/pkg/internal/controller/controller.go:226"
}
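The error string in this log is the one Go's crypto/tls.X509KeyPair returns when the key input contains no PEM block, so the check on ca.crt alone would not catch it. The following is a minimal sketch, not the operator's code: it generates a throwaway self-signed pair in memory and shows that a valid certificate combined with an empty key is rejected. The helper names selfSigned and checkKeyPair are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"time"
)

// selfSigned returns a throwaway self-signed certificate and its private key,
// both PEM-encoded. It only exists to make this sketch runnable without files.
func selfSigned() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "example"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

// checkKeyPair reports whether the certificate and key load as a TLS key pair.
func checkKeyPair(certPEM, keyPEM []byte) error {
	_, err := tls.X509KeyPair(certPEM, keyPEM)
	return err
}

func main() {
	certPEM, keyPEM, err := selfSigned()
	if err != nil {
		fmt.Println("could not generate test pair:", err)
		return
	}
	fmt.Println("valid pair:", checkKeyPair(certPEM, keyPEM))
	fmt.Println("empty key: ", checkKeyPair(certPEM, nil))
}
```

To test real data, the same checkKeyPair call could be fed the contents of the certificate and key exported from the secrets instead of the generated pair.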
I didn't look at the code, but it seems like the admin cert depends on the HTTP CA somehow? edit: from the transport TLS section of the docs: "If you provide your own node certificates you must also provide an admin cert that the operator can use for managing the cluster:". Should that be in the HTTP TLS section instead? Is the admin cert used to communicate over HTTP or over transport?
edit2: from the same log: "admin cert is not signed by CA, recreating"
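Whether an admin cert chains to a given CA can also be checked outside the operator. The following is a rough sketch using only the Go standard library; it is not how the operator performs the check. It creates two throwaway CAs and an admin cert signed by one of them, then verifies the cert against each. The names newCert and signedBy and the common names are hypothetical.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert creates a certificate for cn. With a nil parent it is a self-signed
// CA; otherwise it is a leaf certificate signed by parent using parentKey.
func newCert(cn string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(time.Now().UnixNano()),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	signer, signerKey := parent, parentKey
	if parent == nil {
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
		tmpl.KeyUsage = x509.KeyUsageCertSign
		signer, signerKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// signedBy returns nil if cert chains to ca, and an error otherwise.
func signedBy(cert, ca *x509.Certificate) error {
	roots := x509.NewCertPool()
	roots.AddCert(ca)
	_, err := cert.Verify(x509.VerifyOptions{
		Roots:     roots,
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	return err
}

func main() {
	httpCA, httpKey, err := newCert("http-ca", nil, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	transportCA, _, err := newCert("transport-ca", nil, nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	admin, _, err := newCert("admin", httpCA, httpKey)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("against http CA:     ", signedBy(admin, httpCA))
	fmt.Println("against transport CA:", signedBy(admin, transportCA))
}
```

For real certificates, the parsed admin cert and CA cert from the secrets would replace the generated ones.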
Hi @gfdsa. Which CA the admin cert must be signed by (or is created from) depends on the OpenSearch version: for 2.x the HTTP CA is used, for 1.x the transport CA. This relates to a change in OpenSearch where admin interaction (e.g. to update the securityconfig) is handled via the HTTPS port in 2.x and no longer via the transport port.
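The version rule described here can be written down as a small lookup. This is only an illustration of the rule as stated in this thread, not the operator's actual implementation; the function name adminCAInterface and its return values are hypothetical, and versions other than 1.x and 2.x are deliberately rejected because the thread does not cover them.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// adminCAInterface returns which CA ("http" or "transport") the admin cert is
// expected to be signed by for the given OpenSearch version string.
// Hypothetical helper: it encodes only the 1.x and 2.x rule from this thread.
func adminCAInterface(version string) (string, error) {
	majorStr, _, _ := strings.Cut(version, ".")
	major, err := strconv.Atoi(majorStr)
	if err != nil {
		return "", fmt.Errorf("invalid version %q: %w", version, err)
	}
	switch major {
	case 1:
		return "transport", nil
	case 2:
		return "http", nil
	default:
		return "", fmt.Errorf("version %q is not covered by this sketch", version)
	}
}

func main() {
	for _, v := range []string{"1.3.0", "2.11.1"} {
		ca, err := adminCAInterface(v)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("OpenSearch %s: admin cert signed by the %s CA\n", v, ca)
	}
}
```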
Ok, so the documentation is lagging behind the changes. I got my cluster running last week by creating the admin cert from our CA, but I had to remove all the secrets containing certs to make it go smoothly.
I've never tried a situation with old existing certs and a new custom CA, so it is very possible that the operator cannot completely handle that. It is not really one of the core use cases.
Hi, thanks for the great operator. I believe I've hit a bug when trying to provide my own certificates for the external http api. Below are the details of the error, any help is greatly appreciated.
Kubernetes version: v1.26.6
opensearch-operator version: 2.4.0
platform: AKS
Expected behaviour: I was trying to provide a TLS certificate for the HTTP API. The secret is generated by vault secret operator, but ultimately this produces a Kubernetes tls secret in PEM format with tls.key and tls.crt. I also provide a separate secret for ca.crt. Both secrets are generated in the same namespace and appear to be valid PEM-formatted certs with the correct keys. I expect the cluster to be created using this cert for its HTTP API on port 9200.
Actual behaviour: The operator-controller-manager goes into CrashLoopBackOff with the following error in the logs.
My cluster config is as follows: