GoogleCloudPlatform / flink-on-k8s-operator

[DEPRECATED] Kubernetes operator for managing the lifecycle of Apache Flink and Beam applications.
Apache License 2.0

Unable to deploy Flink session cluster with the default certificate in Kubernetes version 1.19 #397

Open vinaykw opened 3 years ago

vinaykw commented 3 years ago

We are deploying a Flink session cluster on Kubernetes 1.19. The Flink operator deployed successfully, but when we apply the Helm chart for the Flink session cluster, we get this error:

Error: Internal error occurred: failed calling webhook "mflinkcluster.flinkoperator.k8s.io": Post "https://flink-operator-webhook-service.flink-operator-system.svc:443/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

The error goes away when we add the environment variable below to the kubeapiserver.yaml file:

env:

Is there a better solution to this problem?

kbrewk commented 3 years ago

As I understand it, the Go libraries used by K8s 1.19 now require certificates to list their domain names in the SAN (Subject Alternative Name) section instead of relying on the domain in the CN field.
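One way to confirm whether a given certificate is affected is to look for the SAN extension in its text dump. The helper below is only a sketch: `has_san` is a hypothetical name, and the certificate path in the usage comment is a placeholder.

```shell
#!/bin/sh
# has_san FILE (hypothetical helper): succeeds if the PEM certificate in FILE
# contains an X509v3 Subject Alternative Name extension.
has_san() {
  openssl x509 -in "$1" -noout -text | grep -q "X509v3 Subject Alternative Name"
}

# Example usage (server-cert.pem is a placeholder path):
#   has_san server-cert.pem && echo "SANs present" || echo "CN only"
```

A certificate for which this check fails identifies its host only through the CN field, which is the situation the x509 error above describes.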

I was able to get around this issue by editing the generate-cert.yaml file in the Helm chart and adding these parameters to the openssl x509 signing command: -extensions v3_req -extfile "${tmpdir}/csr.conf"

With that change the generated cert file included the SAN section, and the error went away for me.

So, I changed the following lines in the generate-cert.yaml file from this:

openssl req -new -key ${tmpdir}/server-key.pem -subj "/CN=${service}.${namespace}.svc" -config ${tmpdir}/csr.conf \
  | openssl x509 -days 3650 -req -CA ca.crt -CAkey ca.key -CAcreateserial -out ${tmpdir}/server-cert.pem

to this:

openssl req -new -key ${tmpdir}/server-key.pem -subj "/CN=${service}.${namespace}.svc" -config ${tmpdir}/csr.conf \
  | openssl x509 -days 3650 -req -CA ca.crt -CAkey ca.key -CAcreateserial -extensions v3_req -extfile "${tmpdir}/csr.conf" -out ${tmpdir}/server-cert.pem

and then re-deployed the Flink Operator.
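The extra flags matter because `openssl x509 -req` does not copy extensions from the CSR into the signed certificate unless it is given `-extfile` and `-extensions`. The script below is a standalone sketch of the same flow that can be run outside the chart. The service and namespace names are taken from the webhook URL in the error message. The contents of csr.conf are an assumption, since the chart's actual file is not shown in this thread.

```shell
#!/bin/sh
set -eu

# Names taken from the webhook URL in the error message above.
service="flink-operator-webhook-service"
namespace="flink-operator-system"
tmpdir="$(mktemp -d)"

# Assumed csr.conf layout (not copied from the chart): a v3_req section
# that lists the service DNS names as Subject Alternative Names.
cat > "${tmpdir}/csr.conf" <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[v3_req]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = @alt_names
[alt_names]
DNS.1 = ${service}
DNS.2 = ${service}.${namespace}
DNS.3 = ${service}.${namespace}.svc
EOF

# Self-signed CA used only for this sketch.
openssl req -nodes -new -x509 -keyout "${tmpdir}/ca.key" -out "${tmpdir}/ca.crt" \
  -subj "/CN=Admission Controller Webhook CA" 2>/dev/null

# Server key, CSR, and signing step with the SAN extensions applied.
openssl genrsa -out "${tmpdir}/server-key.pem" 2048 2>/dev/null
openssl req -new -key "${tmpdir}/server-key.pem" \
  -subj "/CN=${service}.${namespace}.svc" -config "${tmpdir}/csr.conf" \
  | openssl x509 -req -days 3650 \
      -CA "${tmpdir}/ca.crt" -CAkey "${tmpdir}/ca.key" -CAcreateserial \
      -extensions v3_req -extfile "${tmpdir}/csr.conf" \
      -out "${tmpdir}/server-cert.pem" 2>/dev/null

# Print the SAN section of the signed certificate; grep fails (and the
# script exits non-zero) if the extension is missing.
san_text="$(openssl x509 -in "${tmpdir}/server-cert.pem" -noout -text \
  | grep -A1 "Subject Alternative Name")"
echo "${san_text}"
```

Removing the `-extensions v3_req -extfile` arguments from the signing step should make the final check fail, which mirrors the certificate the unmodified chart produced.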

pashtet04 commented 3 years ago

@kbrewk it didn't help me:

openssl req -nodes -new -x509 -keyout ca.key -out ca.crt \
  -subj "/CN=Admission Controller Webhook CA"
openssl genrsa -out ${tmpdir}/server-key.pem 2048
openssl req -new -key ${tmpdir}/server-key.pem \
  -subj "/CN=${service}.${namespace}.svc" \
  -config ${tmpdir}/csr.conf \
  | openssl x509 -extfile ${tmpdir}/csr.conf \
  -extensions v3_req \
  -days 3650 -req -CA ca.crt -CAkey