cert-manager / istio-csr

istio-csr is an agent that allows for Istio workload and control plane components to be secured using cert-manager.
https://cert-manager.io/docs/usage/istio-csr/
Apache License 2.0

[doc] confusion with `ca.pem` and Readiness probe failed on ingress and egress gateways #108

Open nicop311 opened 2 years ago

nicop311 commented 2 years ago

Hello all. In the README.md, I am confused by the line --from-file=ca.pem=ca.pem in the section "Load root CAs from file ca.pem (Preferred)".

I do not know which ca.pem file I should use, or whether this CA has anything to do with the cert-manager Issuer or ClusterIssuer that we have to create. It is not clear to me what ca.pem is or where it comes from. I would like to be able to choose the CA behind ca.pem (Let's Encrypt or a custom CA). Maybe this part could be explained in more detail.

Right now, after creating the cert-manager CA issuer, I generate a self-signed ca.pem file with openssl (it has nothing to do with the cert-manager CA issuer), then I follow the steps from "Load root CAs from file ca.pem (Preferred)".

But I get an error: my ingress and egress gateways both report a "Readiness probe failed" error and an "x509: certificate signed by unknown authority" error.

I suspect my problem comes from the line kubectl create secret generic istio-root-ca --from-file=ca.pem=ca.pem -n cert-manager and from cert-manager CA issuer.

Version

Istio

I use the IstioOperator install from Readme.

$ istioctl version

client version: 1.11.3
control plane version: 1.11.3
data plane version: none

Kubernetes

I use Kubernetes KIND.

$ kind version
kind v0.11.1 go1.17.1 linux/amd64
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}

Some logs

Pods not ready

kubectl get pods -n istio-system
NAME                                    READY   STATUS    RESTARTS   AGE
istio-egressgateway-6f9c77d6d9-s7t9j    0/1     Running   0          7h23m
istio-ingressgateway-5f7dc9c95f-n9n6r   0/1     Running   0          7h23m
istiod-76754d847-vqz29                  1/1     Running   0          7h23m

Readiness probe failed on istio-egressgateway and istio-ingressgateway

kubectl describe pods -n istio-system
[...TRUNCATED...]
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  4m1s (x36450 over 20h)  kubelet  Readiness probe failed: Get "http://10.2.1.10:15021/healthz/ready": dial tcp 10.2.1.10:15021: connect: connection refused

More precise logs from gateway pod and container: x509 unknown

kubectl logs -n istio-system istio-egressgateway-6f9c77d6d9-s7t9j -c istio-proxy
2021-10-22T08:58:58.439779Z info    FLAG: --concurrency="0"
2021-10-22T08:58:58.440209Z info    FLAG: --domain="istio-system.svc.cluster.local"
2021-10-22T08:58:58.440357Z info    FLAG: --help="false"
2021-10-22T08:58:58.440520Z info    FLAG: --log_as_json="false"
2021-10-22T08:58:58.440662Z info    FLAG: --log_caller=""
2021-10-22T08:58:58.440823Z info    FLAG: --log_output_level="default:info"
2021-10-22T08:58:58.440881Z info    FLAG: --log_rotate=""
2021-10-22T08:58:58.441040Z info    FLAG: --log_rotate_max_age="30"
2021-10-22T08:58:58.441181Z info    FLAG: --log_rotate_max_backups="1000"
2021-10-22T08:58:58.441317Z info    FLAG: --log_rotate_max_size="104857600"
2021-10-22T08:58:58.441461Z info    FLAG: --log_stacktrace_level="default:none"
2021-10-22T08:58:58.441634Z info    FLAG: --log_target="[stdout]"
2021-10-22T08:58:58.441687Z info    FLAG: --meshConfig="./etc/istio/config/mesh"
2021-10-22T08:58:58.441865Z info    FLAG: --outlierLogPath=""
2021-10-22T08:58:58.442010Z info    FLAG: --proxyComponentLogLevel="misc:error"
2021-10-22T08:58:58.442169Z info    FLAG: --proxyLogLevel="warning"
2021-10-22T08:58:58.442338Z info    FLAG: --serviceCluster="istio-proxy"
2021-10-22T08:58:58.442537Z info    FLAG: --stsPort="0"
2021-10-22T08:58:58.442715Z info    FLAG: --templateFile=""
2021-10-22T08:58:58.442814Z info    FLAG: --tokenManagerPlugin="GoogleTokenExchange"
2021-10-22T08:58:58.442922Z info    Version 1.11.3-6bda7c161d3925c48fbea3f297ffa52461893f3b-Clean
2021-10-22T08:58:58.443456Z info    Proxy role  ips=[10.2.1.16 fe80::aaaa] type=router id=istio-egressgateway-6f9c77d6d9-s7t9j.istio-system domain=istio-system.svc.cluster.local
2021-10-22T08:58:58.443736Z info    Apply mesh config from file accessLogFile: /dev/stdout
defaultConfig:
  discoveryAddress: istiod.istio-system.svc:15012
  proxyMetadata: {}
  tracing:
    zipkin:
      address: zipkin.istio-system:9411
enablePrometheusMerge: true
rootNamespace: istio-system
trustDomain: cluster.local
2021-10-22T08:58:58.448687Z info    Effective config: binaryPath: /usr/local/bin/envoy
configPath: ./etc/istio/proxy
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istiod.istio-system.svc:15012
drainDuration: 45s
parentShutdownDuration: 60s
proxyAdminPort: 15000
proxyMetadata: {}
serviceCluster: istio-proxy
statNameLength: 189
statusPort: 15020
terminationDrainDuration: 5s
tracing:
  zipkin:
    address: zipkin.istio-system:9411

2021-10-22T08:58:58.448867Z info    JWT policy is third-party-jwt
2021-10-22T08:58:58.466682Z info    Opening status port 15020
2021-10-22T08:58:58.466590Z info    CA Endpoint cert-manager-istio-csr.cert-manager.svc:443, provider Citadel
2021-10-22T08:58:58.471631Z info    Using CA cert-manager-istio-csr.cert-manager.svc:443 cert with certs: var/run/secrets/istio/root-cert.pem
2021-10-22T08:58:58.477719Z info    citadelclient   Citadel client using custom root cert: cert-manager-istio-csr.cert-manager.svc:443
2021-10-22T08:58:58.536343Z info    ads All caches have been synced up in 101.910366ms, marking server ready
2021-10-22T08:58:58.537002Z info    sds SDS server for workload certificates started, listening on "etc/istio/proxy/SDS"
2021-10-22T08:58:58.537042Z info    xdsproxy    Initializing with upstream address "istiod.istio-system.svc:15012" and cluster "Kubernetes"
2021-10-22T08:58:58.538024Z info    sds Starting SDS grpc server
2021-10-22T08:58:58.539498Z info    Pilot SAN: [istiod.istio-system.svc]
2021-10-22T08:58:58.539595Z info    starting Http service at 127.0.0.1:15004
2021-10-22T08:58:58.544922Z info    Starting proxy agent
2021-10-22T08:58:58.544995Z info    Epoch 0 starting
2021-10-22T08:58:58.545028Z info    Envoy command: [-c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --drain-strategy immediate --parent-shutdown-time-s 60 --local-address-ip-version v4 --bootstrap-version 3 --file-flush-interval-msec 1000 --disable-hot-restart --log-format %Y-%m-%dT%T.%fZ    %l  envoy %n    %v -l warning --component-log-level misc:error]
2021-10-22T08:58:58.740974Z warning envoy config    StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-10-22T08:58:58.851695Z warning envoy config    StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-10-22T08:58:59.028814Z warn    ca  ca request failed, starting attempt 1 in 102.093205ms
2021-10-22T08:58:59.131582Z warn    ca  ca request failed, starting attempt 2 in 217.620363ms
2021-10-22T08:58:59.350302Z warn    ca  ca request failed, starting attempt 3 in 413.164804ms
2021-10-22T08:58:59.763817Z warn    ca  ca request failed, starting attempt 4 in 790.034269ms
2021-10-22T08:58:59.764281Z warning envoy config    StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-10-22T08:59:00.555360Z warn    sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
2021-10-22T08:59:00.946959Z warning envoy config    StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"

JoshVanL commented 2 years ago

Hi @nicop311, the ca.pem file referenced should contain the root CAs that you would like your istio cluster to trust (including and likely only the CA of your issuer). If you are using the Issuer of type ca, then this would be the CA within the Secret as referenced in the Issuer config.

Propagating a different CA to that used by the Issuer will make istio workloads not trust the CA which signed their certificates.
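
One way to check for such a mismatch before creating the secret is to compare the local ca.pem with a local copy of the issuer's CA certificate. The following is only a sketch under stated assumptions: pem_body and same_cert are hypothetical helper names (not part of istio-csr or cert-manager), issuer-ca.pem is an assumed file name for a saved copy of the issuer's CA, and only the first certificate in each file is compared.

```shell
#!/bin/sh
# Hypothetical helpers, not part of istio-csr or cert-manager.

# Print the base64 payload of the first certificate in a PEM file on one line,
# so that differences in line wrapping do not affect the comparison.
pem_body() {
  sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$1" \
    | sed '/-----END CERTIFICATE-----/q' \
    | grep -v -- '-----' \
    | tr -d ' \r\n'
}

# Succeed only if both files hold the same, non-empty first certificate.
same_cert() {
  a=$(pem_body "$1")
  b=$(pem_body "$2")
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (assumed file names):
#   same_cert ca.pem issuer-ca.pem && echo "ca.pem matches the issuer CA"
```

If same_cert fails, the ca.pem about to be loaded is not the issuer's CA, which is the situation described above.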

nicop311 commented 2 years ago

Thank you @JoshVanL for your answer, your explanation is clear. But I am still confused by the documentation.

See my question below: "What is the relation between CA_FROM_cert-manager-CA_Issuer_istio-system from the istio-system namespace and the file ca.pem from the cert-manager namespace?"

If they are the same, I don't understand the steps and the procedure explained in the documentation.


Here are the steps I took while following the istio-csr documentation; these details could be added to it.

Step 0. Have a K8s cluster and istioctl

Step 1. Install cert-manager with OLM

  1. Go to: https://operatorhub.io/operator/cert-manager.
  2. Click Install.
  3. Install OLM (curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.19.1/install.sh | bash -s v0.19.1)
  4. Install cert-manager (kubectl create -f https://operatorhub.io/install/cert-manager.yaml)

Step 2. Create CA Issuer or ClusterIssuer

In the istio-csr documentation section "Issuer or ClusterIssuer", you are advised to create a cert-manager CA Issuer in the istio-system namespace.

  1. I use istioctl operator init which creates an istio-system namespace if it does not exist.
istioctl operator init
Operator controller is already installed in istio-operator namespace.
Upgrading operator controller in namespace: istio-operator using image: docker.io/istio/operator:1.11.3
Operator controller will watch namespaces: istio-system
✔ Istio operator installed
✔ Installation complete

Now the istio-system namespace exists. We could also create it by hand; it does not matter.

  2. I create a cert-manager CA Issuer using the suggested example:
kubectl apply -n istio-system -f https://raw.githubusercontent.com/cert-manager/istio-csr/v0.3.0/docs/example-issuer.yaml
issuer.cert-manager.io/selfsigned unchanged
certificate.cert-manager.io/istio-ca configured
issuer.cert-manager.io/istio-ca unchanged

This creates some secrets in the istio-system namespace:

kubectl get secrets -n istio-system
NAME                  TYPE                                  DATA   AGE
default-token-pwsx4   kubernetes.io/service-account-token   3      5d
istio-ca              kubernetes.io/tls                     3      5d
istiod-tls            kubernetes.io/tls                     3      5d

This is the result (I replaced the keys and certificates with placeholder names to save space):

thedetective@k8s-kind-monitoring-target-114:~$ kubectl get secrets -n istio-system istiod-tls -o yaml > istiod-tls.yaml
# istiod-tls.yaml
apiVersion: v1
data:
  ca.crt: CA_FROM_cert-manager-CA_Issuer_istio-system
  tls.crt: istiod-tls_tls.crt
  tls.key: istiod-tls_tls.key
kind: Secret
metadata:
  annotations:
    cert-manager.io/alt-names: istiod.istio-system.svc
    cert-manager.io/certificate-name: istiod
    cert-manager.io/common-name: istiod.istio-system.svc
    cert-manager.io/ip-sans: ""
    cert-manager.io/issuer-group: cert-manager.io
    cert-manager.io/issuer-kind: Issuer
    cert-manager.io/issuer-name: istio-ca
    cert-manager.io/uri-sans: spiffe://cluster.local/ns/istio-system/sa/istiod-service-account
  creationTimestamp: "2021-10-22T08:56:48Z"
  name: istiod-tls
  namespace: istio-system
  resourceVersion: "2387761"
  uid: 58ee187a-f233-4f39-b9e5-00ffbc28ea58
type: kubernetes.io/tls
thedetective@k8s-kind-monitoring-target-114:~$ kubectl get secrets -n istio-system istio-ca -o yaml > istio-ca.yaml
# istio-ca.yaml
apiVersion: v1
data:
  ca.crt: CA_FROM_cert-manager-CA_Issuer_istio-system
  tls.crt: CA_FROM_cert-manager-CA_Issuer_istio-system
  tls.key: istio-ca-tls.key
kind: Secret
metadata:
  annotations:
    cert-manager.io/alt-names: ""
    cert-manager.io/certificate-name: istio-ca
    cert-manager.io/common-name: istio-ca
    cert-manager.io/ip-sans: ""
    cert-manager.io/issuer-group: cert-manager.io
    cert-manager.io/issuer-kind: Issuer
    cert-manager.io/issuer-name: selfsigned
    cert-manager.io/uri-sans: ""
  creationTimestamp: "2021-10-22T08:54:53Z"
  name: istio-ca
  namespace: istio-system
  resourceVersion: "114380"
  uid: efaceac5-d398-4c84-bfaf-d13635d54d8e
type: kubernetes.io/tls

Step 3: Load root CAs from file ca.pem (Preferred)

I create a ca.pem file since I need one and there is no explanation of where this CA should come from.

openssl req -x509 -sha512 -nodes -extensions v3_ca -newkey rsa:4096 -keyout ca-cert-and-key.pem -days 7320 -out ca-cert-and-key.pem

I create the ca.pem from ca-cert-and-key.pem.
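
The exact extraction command is not given in this comment. A minimal sketch of one way to do it, assuming ca-cert-and-key.pem contains a private key block and a single certificate block; extract_cert is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: copy only the CERTIFICATE block(s) of a combined PEM
# file into a new file, leaving the private key behind.
extract_cert() {
  # $1 = combined PEM input, $2 = certificate-only output
  sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$1" > "$2"
}

# Example with the file names used in this comment:
#   extract_cert ca-cert-and-key.pem ca.pem
```

A ca.pem produced this way is unrelated to the cert-manager issuer, which is why it leads to the x509 error reported in this issue.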

Then I follow the documentation:

$ helm repo add jetstack https://charts.jetstack.io
$ helm repo update
$ kubectl create namespace istio-system
$ kubectl create secret generic istio-root-ca --from-file=ca.pem=ca.pem -n cert-manager
$ helm install -n cert-manager cert-manager-istio-csr jetstack/cert-manager-istio-csr \
  --set "app.tls.rootCAFile=/var/run/secrets/istio-csr/ca.pem" \
  --set "volumeMounts[0].name=root-ca" \
  --set "volumeMounts[0].mountPath=/var/run/secrets/istio-csr" \
  --set "volumes[0].name=root-ca" \
  --set "volumes[0].secret.secretName=istio-root-ca"

Step 4: Installing Istio

I follow the documentation Installing Istio.

I put spec.meshConfig.trustDomain: cluster.local.

Question

I assume the CA CA_FROM_cert-manager-CA_Issuer_istio-system is generated by the cert-manager CA Issuer or ClusterIssuer. This CA_FROM_cert-manager-CA_Issuer_istio-system is used in both the istio-ca and istiod-tls secrets in the istio-system namespace.

QUESTION: What is the relation between CA_FROM_cert-manager-CA_Issuer_istio-system and the file ca.pem used in kubectl create secret generic istio-root-ca --from-file=ca.pem=ca.pem -n cert-manager (notice this is the cert-manager namespace, not istio-system) from the section "Load root CAs from file ca.pem (Preferred)"?

If CA_FROM_cert-manager-CA_Issuer_istio-system and --from-file=ca.pem=ca.pem are supposed to be the same, then I don't understand what is happening in the doc procedure and I don't understand why one secret is in istio-system namespace and the other is in cert-manager.

It seems really weird to me to create the ca.pem file out of the CA_FROM_cert-manager-CA_Issuer_istio-system auto generated by cert-manager.

Maybe cert-manager/istio-csr makes more sense once certificate rotation comes into play? For the moment, if we have to create a secret by hand, I do not see the improvement over Istio's plug-in CA certificates and the cacerts secret.

Of course, if I use CA_FROM_cert-manager-CA_Issuer_istio-system to create ca.pem, my "x509: certificate signed by unknown authority" error disappears.

I think the README could mention this.
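
A minimal sketch of that fix, assuming the example issuer from this thread (the istio-ca Secret in the istio-system namespace, where ca.crt and tls.crt hold the same self-signed CA). export_issuer_ca is a hypothetical helper name, and the commented usage has to be run against a live cluster:

```shell
#!/bin/sh
# Hypothetical helper: write the CA certificate stored in a cert-manager
# managed Secret to a local PEM file.
export_issuer_ca() {
  # $1 = secret name, $2 = namespace, $3 = output file
  kubectl get secret "$1" -n "$2" -o jsonpath='{.data.ca\.crt}' \
    | base64 -d > "$3"
}

# Usage with the names from this issue (not executed here):
#   export_issuer_ca istio-ca istio-system ca.pem
#   kubectl create secret generic istio-root-ca --from-file=ca.pem=ca.pem -n cert-manager
```

With a different Issuer type or an intermediate CA, the key to export and the resulting trust chain may differ, so treat the secret name and key here as assumptions.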


Disclaimer: I know that in a real-life situation, cert-manager would be plugged into a PKI or a tool like HashiCorp Vault.