clastix / kamaji

Kamaji is the Hosted Control Plane Manager for Kubernetes.
https://kamaji.clastix.io
Apache License 2.0

Don't hard code cluster dns domain! #550

Open rossbachp opened 3 weeks ago

rossbachp commented 3 weeks ago

Please add clusterDomain: xxx.local to your values.yaml and use it to set up kamaji-etcd and the certificate definitions.

Change certmanager_certificate.yaml to this:

apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  labels:
    {{- $data := . | mustMergeOverwrite (dict "component" "issuer") -}}
    {{- include "kamaji.labels" $data | nindent 4 }}
  name: kamaji-selfsigned-issuer
  namespace: {{ .Release.Namespace }}
spec:
  selfSigned: {}
root@k8s-02:~/kamaji/kamaji/charts/kamaji/templates# more certmanager_certificate.yaml 
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  labels:
    {{- $data := . | mustMergeOverwrite (dict "component" "certificate") -}}
    {{- include "kamaji.labels" $data | nindent 4 }}
  name: {{ include "kamaji.certificateName" . }}
  namespace: {{ .Release.Namespace }}
spec:
  dnsNames:
    - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc
    - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc.{{ .Values.clusterDomain }}
  issuerRef:
    kind: Issuer
    name: kamaji-selfsigned-issuer
  secretName: {{ include "kamaji.webhookSecretName" . }}
prometherion commented 3 weeks ago

Thanks for the report @rossbachp, are you up to work on this?

rossbachp commented 3 weeks ago

Yes, but I ran into another strange effect: the Issuer wasn't deployed, and I had to apply it manually after generating it with helm template.

helm install kamaji . --namespace kamaji-system --create-namespace --values kamaji-values.yaml
Error: INSTALLATION FAILED: failed to create resource: Internal error occurred: failed calling webhook "vdatastore.kb.io": failed to call webhook: Post "https://kamaji-webhook-service.kamaji-system.svc:443/validate-kamaji-clastix-io-v1alpha1-datastore?timeout=10s": dial tcp 10.43.50.248:443: connect: operation not permitted

I also can't remove the default DataStore:

k delete DataStore default --force
Warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
Error from server (InternalError): Internal error occurred: failed calling webhook "vdatastore.kb.io": failed to call webhook: Post "https://kamaji-webhook-service.kamaji-system.svc:443/validate-kamaji-clastix-io-v1alpha1-datastore?timeout=10s": service "kamaji-webhook-service" not found

Have you seen this before?

rossbachp commented 3 weeks ago
git diff templates/certmanager_certificate.yaml
diff --git a/charts/kamaji/templates/certmanager_certificate.yaml b/charts/kamaji/templates/certmanager_certificate.yaml
index 2f310b1..9c3300c 100644
--- a/charts/kamaji/templates/certmanager_certificate.yaml
+++ b/charts/kamaji/templates/certmanager_certificate.yaml
@@ -9,8 +9,8 @@ metadata:
 spec:
   dnsNames:
     - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc
-    - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc.cluster.local
+    - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc.{{ .Values.clusterDomain }}
   issuerRef:
     kind: Issuer
     name: kamaji-selfsigned-issuer
-  secretName: {{ include "kamaji.webhookSecretName" . }}
\ No newline at end of file
+  secretName: {{ include "kamaji.webhookSecretName" . }}
rossbachp commented 3 weeks ago
git diff values.yaml
diff --git a/charts/kamaji/values.yaml b/charts/kamaji/values.yaml
index 21b529e..e0046e4 100644
--- a/charts/kamaji/values.yaml
+++ b/charts/kamaji/values.yaml
@@ -98,6 +98,7 @@ loggingDevel:
 # -- Specify the default DataStore name for the Kamaji instance.
 defaultDatastoreName: default

+clusterDomain: cluster.local
 kamaji-etcd:
   deploy: true
   fullnameOverride: kamaji-etcd
@@ -108,4 +109,4 @@ kamaji-etcd:
 # -- Disable the analytics traces collection
 telemetry:
   disabled: false
-  
\ No newline at end of file
+  
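
Assuming the two diffs above are applied, the effect of the proposed clusterDomain value can be checked by rendering only the Certificate template. This is a minimal sketch: render_certificate is a hypothetical helper name, example.internal is a placeholder domain, and clusterDomain exists only with the patch from this thread, not necessarily in the released chart.

```shell
# Hypothetical helper: render only the webhook Certificate with a custom
# cluster domain, so the resulting dnsNames can be inspected before installing.
# Nothing runs until the function is called.
render_certificate() {
  # $1 = path to the chart directory, $2 = cluster domain to try
  helm template kamaji "$1" --namespace kamaji-system \
    -s templates/certmanager_certificate.yaml \
    --set clusterDomain="$2"
}

# Example (the path and the domain are assumptions):
#   render_certificate ./charts/kamaji example.internal
```

If the patch works as intended, the second dnsNames entry in the rendered manifest should end in the domain passed as the second argument.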
rossbachp commented 3 weeks ago

I had to install the CRDs manually:

k apply -f crds
helm install kamaji . --namespace kamaji-system --create-namespace --values kamaji-values.yaml
helm template -s templates/certmanager_issuer.yaml kamaji . --namespace kamaji-system --create-namespace --values kamaji-values.yaml | kubectl -n kamaji-system apply -f -
rossbachp commented 3 weeks ago

I also had to create the DataStore manually:

helm template -s charts/kamaji-etcd/templates/etcd_datastore.yaml kamaji . --namespace kamaji-system --create-namespace --values kamaji-values.yaml | kubectl apply -f -
prometherion commented 3 weeks ago

@rossbachp please don't use issue comments as a memo or as DMs: we need to keep the noise as low as we can and report back only relevant information.

Upon Kamaji uninstallation some chicken/egg problems arise with finalizers, validating/mutating webhooks, etc.

You can manually remove the finalizers from the objects that cannot be deleted because Kamaji is no longer running, and delete the CRDs as well, since Helm installs them only on the install event (it's a well-known limitation).
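
The manual cleanup described here can be sketched with plain kubectl. This is a hedged sketch, not an official procedure: both function names are hypothetical helpers, the DataStore name default comes from the commands earlier in this thread, and the name of the webhook configuration is not stated in this thread, so it has to be looked up on the cluster first.

```shell
# Hypothetical helpers for cleaning up after a broken Kamaji uninstall.
# Nothing runs until a function is called explicitly.

# Delete a validating webhook configuration whose backing Service is gone.
# While it exists, writes to DataStore objects fail with
# "failed calling webhook".
# Look the name up with: kubectl get validatingwebhookconfigurations
drop_stale_webhook() {
  # $1 = name of the ValidatingWebhookConfiguration
  kubectl delete validatingwebhookconfiguration "$1"
}

# Clear all finalizers on a DataStore so the API server can delete it
# even though the Kamaji controller is no longer running.
strip_datastore_finalizers() {
  # $1 = name of the DataStore
  kubectl patch datastore "$1" --type=merge \
    -p '{"metadata":{"finalizers":null}}'
}

# Example order (names are assumptions, check them on your cluster):
#   drop_stale_webhook <name-from-kubectl-get>
#   strip_datastore_finalizers default
#   kubectl delete datastore default
```

The webhook configuration goes first because the patch that clears the finalizers is itself a write, and it would be rejected in the same way as the delete shown earlier while the stale webhook is still registered.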

Furthermore, I would suggest trying to remove the cluster.local suffix from the Certificate SAN entirely. I don't remember the internal quirks of Kubernetes Dynamic Admission Controllers, but resolution should start with the third-level domain name (${svc_name}.${svc_namespace}.svc) rather than the fully qualified one.

I just tested it with the following change:

diff --git a/charts/kamaji/templates/certmanager_certificate.yaml b/charts/kamaji/templates/certmanager_certificate.yaml
index 2f310b1..9f004cd 100644
--- a/charts/kamaji/templates/certmanager_certificate.yaml
+++ b/charts/kamaji/templates/certmanager_certificate.yaml
@@ -9,7 +9,6 @@ metadata:
 spec:
   dnsNames:
     - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc
-    - {{ include "kamaji.webhookServiceName" . }}.{{ .Release.Namespace }}.svc.cluster.local
   issuerRef:
     kind: Issuer
     name: kamaji-selfsigned-issuer

I've been able to install Kamaji and create TCPs with no problem: we can skip the cluster.local settings in the Helm templates without introducing a useless variable.
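
To make the two SAN forms in this discussion concrete, here is a small sketch that prints the names a webhook certificate could carry. The function name webhook_dns_names is hypothetical. The service and namespace in the example are taken from the error messages earlier in this thread, where the API server calls kamaji-webhook-service.kamaji-system.svc, that is, the short form without a cluster domain.

```shell
# Hypothetical helper: print the DNS names for a webhook Service.
# The short "<service>.<namespace>.svc" form is always printed; the
# fully qualified form is added only when a cluster domain is given.
webhook_dns_names() {
  # $1 = service name, $2 = namespace, $3 = optional cluster domain
  printf '%s.%s.svc\n' "$1" "$2"
  if [ -n "${3:-}" ]; then
    printf '%s.%s.svc.%s\n' "$1" "$2" "$3"
  fi
}

# Example:
#   webhook_dns_names kamaji-webhook-service kamaji-system
#   prints: kamaji-webhook-service.kamaji-system.svc
```

Calling it with a third argument such as cluster.local adds the fully qualified name that the change above removes from the template.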

bsctl commented 3 weeks ago

@rossbachp just for reference, when cleaning up a broken installation of Kamaji, see: https://kamaji.clastix.io/getting-started/#cleanup

rossbachp commented 3 weeks ago

I'm missing information on how to delete the DataStore manifest. Only the deletion of the CRDs is explained.

prometherion commented 3 days ago

@rossbachp any update on this contribution, besides the bug report?