Open Arau opened 2 years ago
Yeah, I noticed that too while going through the etcd controller code. The SAN is set to the secret name here: https://github.com/storageos/etcd-cluster-operator/blob/main/controllers/etcdcluster_controller.go#L240
I found that odd, but my cluster worked fine even with the SAN looking wrong.
It should be something along the lines of:
fmt.Sprintf("*.%s.%s", cluster.Name, cluster.Namespace)
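For illustration, the suggestion above can be wrapped in a small helper. This is only a sketch: wildcardSAN is a hypothetical name, not a function from the operator, and the cluster name and namespace values are assumed to match the storageos-etcd.storageos-etcd address that appears elsewhere in this thread.

```go
package main

import "fmt"

// wildcardSAN is a hypothetical helper (not part of the operator) that
// builds a wildcard DNS SAN from the etcd cluster name and namespace,
// following the fmt.Sprintf pattern suggested above.
func wildcardSAN(clusterName, namespace string) string {
	return fmt.Sprintf("*.%s.%s", clusterName, namespace)
}

func main() {
	// Assumed values: cluster "storageos-etcd" in namespace "storageos-etcd".
	fmt.Println(wildcardSAN("storageos-etcd", "storageos-etcd"))
	// prints "*.storageos-etcd.storageos-etcd"
}
```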
Hey @Arau -
I can see the symptoms you describe (log lines seem to lead to this code) and I can confirm that the certificate SAN is storageos-etcd-secret for the client certificate, but this doesn't seem to impact functionality of the Ondat cluster. It works regardless, as @aeroniero33 notes.
I'm curious about the Ondat cluster not being able to connect - do you see any other log lines, perhaps in the API manager or scheduler? Are any pods NotReady?
Hi,
I installed the charts with the umbrella chart and I see the issue: the Ondat pods can't connect to etcd. The etcd logs show "error":"remote error: tls: bad certificate". The node pods cannot start at all.
Then I installed the etcd chart first and the ondat-operator second. The result is the same. I fixed it in the cluster by copying the contents of the secret storageos-etcd-client into storageos-etcd-secret, while keeping the file names in storageos-etcd-secret as expected by the CP. The secret storageos-etcd-client has the right alternative names.
After that and a restart of the node pods, the cluster started successfully.
In my values I put:
kvBackend:
  address: 'https://storageos-etcd.storageos-etcd:2379'
I wonder whether the tests didn't have the https:// prefix.
I've been using etcd without the https:// prefix and it's been working fine for me!
Another thought - did you uninstall/reinstall on the same cluster? I've noticed that the storageos namespace with the associated storageos-etcd-secret is not necessarily deleted when helm uninstall is run. If that secret persists through an uninstall-reinstall, or the associated pods on the etcd or storageos side do, there'll be a mismatch as:
We've pushed a new version, please re-test and let me know whether that resolves the issue!
@Arau could you please review this issue? Thank you :)
The Ondat cluster can't connect to etcd due to a
This is happening because the certificate in the storageos-etcd-secret has the following SAN definition:
The DNS field storageos-etcd-secret should match the DNS name: *.storageos-etcd.storageos-etcd
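To make the difference between the two DNS values concrete, here is a minimal Go sketch using the standard library's x509 hostname matching. The certificates are built in memory purely for illustration, and the pod hostname storageos-etcd-0.storageos-etcd.storageos-etcd and the helper name sanCovers are assumptions, not taken from the operator or from this cluster. It shows how Go matches a DNS SAN against a hostname; whether etcd applies exactly this check to the client certificate is not established in this thread.

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// sanCovers is a hypothetical helper that reports whether a certificate
// carrying the given DNS SANs passes Go's standard hostname check for host.
func sanCovers(dnsNames []string, host string) bool {
	cert := &x509.Certificate{DNSNames: dnsNames}
	return cert.VerifyHostname(host) == nil
}

func main() {
	// Assumed hostname of an etcd pod behind a headless service.
	host := "storageos-etcd-0.storageos-etcd.storageos-etcd"

	// SAN equal to the secret name, as reported in this issue.
	fmt.Println(sanCovers([]string{"storageos-etcd-secret"}, host)) // false

	// Wildcard SAN proposed in this issue.
	fmt.Println(sanCovers([]string{"*.storageos-etcd.storageos-etcd"}, host)) // true
}
```

Note that a wildcard only stands in for the left-most label, so *.storageos-etcd.storageos-etcd would not match the bare service name storageos-etcd.storageos-etcd in this check.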