replicatedhq / kots

KOTS provides the framework, tools and integrations that enable the delivery and management of 3rd-party Kubernetes applications, a.k.a. Kubernetes Off-The-Shelf (KOTS) Software.
https://kots.io
Apache License 2.0

kots CLI: "kubectl kots pull" creates invalid k8s objects #856

Open dexhorthy opened 4 years ago

dexhorthy commented 4 years ago

tl;dr kubectl kots pull seems pretty broken right now. This issue documents the problems and some manual workarounds for them.


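I tried a kubectl kots pull using the published sentry example license and was unable to apply the resulting yaml. I've tried this with a few apps and it should be pretty easy to reproduce. There are a few issues here, each covered in the workaround steps below: the kotsadm-bundle-0 ConfigMap is too large to apply, the admin-console resources hardcode the default namespace, and kotsadm-api crash-loops without a cluster token.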

There's also one thing that is maybe an enhancement opportunity rather than a bug: at the end of the deploy, kotsadm still wants you to upload a license, config, preflight checks, etc.

Repro steps

$ kubectl kots pull sentry-pro --license-file ~/go/src/github.com/replicatedhq/kots-sentry/KOTS-license-example-sentry-pro.yaml

Enter a new password to be used for the Admin Console: ••••••••
  • Pulling upstream ✓  
  • Creating base ✓  
  • Creating midstream ✓  

    Kubernetes application files created in /Users/dex/sentry-enterprise

    To deploy, run kubectl apply -k /Users/dex/sentry-enterprise/overlays/midstream

But running that fails with the error The ConfigMap "kotsadm-bundle-0" is invalid: metadata.annotations: Too long: must have at most 262144 characters

$  kubectl apply --namespace sentry-pro -k /Users/dex/sentry-enterprise/overlays/midstream
I0725 10:41:54.540984   73897 log.go:172] nil value at `volumes.configMap.name` ignored in mutation attempt
I0725 10:41:54.541068   73897 log.go:172] nil value at `volumes.projected.sources.configMap.name` ignored in mutation attempt
I0725 10:41:54.541090   73897 log.go:172] nil value at `volumes.secret.secretName` ignored in mutation attempt
I0725 10:41:54.541178   73897 log.go:172] nil value at `volumes.projected.sources.secret.name` ignored in mutation attempt
I0725 10:41:54.541213   73897 log.go:172] nil value at `volumes.persistentVolumeClaim.claimName` ignored in mutation attempt
serviceaccount/kotsadm-api configured
serviceaccount/kotsadm-operator configured
serviceaccount/kotsadm configured
role.rbac.authorization.k8s.io/kotsadm-api-role configured
role.rbac.authorization.k8s.io/kotsadm-operator-role configured
clusterrole.rbac.authorization.k8s.io/kotsadm-role configured
rolebinding.rbac.authorization.k8s.io/kotsadm-api-rolebinding configured
rolebinding.rbac.authorization.k8s.io/kotsadm-operator-rolebinding configured
clusterrolebinding.rbac.authorization.k8s.io/kotsadm-rolebinding configured
configmap/sentry unchanged
secret/kotsadm-cluster-token configured
secret/kotsadm-encryption configured
secret/kotsadm-minio configured
secret/kotsadm-password configured
secret/kotsadm-postgres configured
secret/kotsadm-session configured
secret/sentry-postgresql configured
secret/sentry-redis unchanged
secret/sentry configured
service/kotsadm-api-node configured
service/kotsadm-minio configured
service/kotsadm-postgres configured
service/kotsadm configured
service/sentry-postgresql unchanged
service/sentry-redis-master unchanged
service/sentry-redis-slave unchanged
service/sentry unchanged
deployment.apps/kotsadm-api configured
deployment.apps/kotsadm-operator configured
deployment.apps/kotsadm configured
deployment.apps/sentry-cron configured
deployment.apps/sentry-postgresql configured
deployment.apps/sentry-redis-slave unchanged
deployment.apps/sentry-web configured
deployment.apps/sentry-worker configured
statefulset.apps/kotsadm-minio configured
statefulset.apps/kotsadm-postgres configured
statefulset.apps/sentry-redis-master configured
job.batch/sentry-db-init configured
job.batch/sentry-user-create configured
ingress.extensions/sentry-ingress unchanged
persistentvolumeclaim/sentry-postgresql unchanged
persistentvolumeclaim/sentry unchanged
pod/kotsadm-migrations-1595691679 created
The ConfigMap "kotsadm-bundle-0" is invalid: metadata.annotations: Too long: must have at most 262144 characters
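
For context, client-side kubectl apply records the whole object in the kubectl.kubernetes.io/last-applied-configuration annotation, which is why a large ConfigMap trips the annotation limit rather than a ConfigMap size limit. A rough sketch for spotting oversized files before applying follows. The helper name manifest_too_large is made up for illustration, and file size is only a proxy for the annotation size, since the annotation holds a JSON rendering of the object:

```shell
# Annotation size limit quoted in the error message.
ANNOTATION_LIMIT=262144

# manifest_too_large FILE (hypothetical helper)
# Succeeds (exit 0) when FILE is larger than the annotation limit.
manifest_too_large() {
  size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips wc's padding
  [ "$size" -gt "$ANNOTATION_LIMIT" ]
}

# Example: flag oversized files under the generated admin-console directory.
# for f in ~/sentry-enterprise/base/admin-console/*.yaml; do
#   manifest_too_large "$f" && echo "too large for client-side apply: $f"
# done
```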

Workaround step 1: removing config map

It seems this can be worked around by commenting the config map out of base/kustomization.yaml, but I'm not sure whether this will break anything:

# ~/sentry-enterprise/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- admin-console/api-deployment.yaml
- admin-console/api-role.yaml
- admin-console/api-rolebinding.yaml
- admin-console/api-service.yaml
- admin-console/api-serviceaccount.yaml
# - admin-console/kotsadm-bundle-0.yaml
- admin-console/kotsadm-deployment.yaml
- admin-console/kotsadm-role.yaml
# ... etc etc
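
The same edit could be scripted, which might help if pull gets re-run and regenerates the file. This is only a sketch: comment_out_resource is a hypothetical helper, and it assumes the entry sits on its own line in exactly the "- path" form shown above.

```shell
# comment_out_resource FILE RESOURCE (hypothetical helper)
# Prefixes the matching "- RESOURCE" list entry in FILE with "# ".
comment_out_resource() {
  file=$1
  resource=$2
  # Write to a temp file and move it back; avoids non-portable `sed -i`.
  # Note: dots in RESOURCE are treated as regex wildcards by sed.
  sed "s|^- ${resource}\$|# &|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example:
# comment_out_resource ~/sentry-enterprise/base/kustomization.yaml admin-console/kotsadm-bundle-0.yaml
```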

Workaround step 2: overriding default namespace in kustomize

After removing the config map and doing another apply, we get a whole bunch of issues with hardcoded namespaces:

the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
$ kubectl create namespace sentry-pro
namespace/sentry-pro created
$ kubectl apply --namespace sentry-pro -k ~/sentry-enterprise/overlays/midstream
I0725 10:56:36.489386   80758 log.go:172] nil value at `volumes.configMap.name` ignored in mutation attempt
I0725 10:56:36.489470   80758 log.go:172] nil value at `volumes.projected.sources.configMap.name` ignored in mutation attempt
I0725 10:56:36.489503   80758 log.go:172] nil value at `volumes.secret.secretName` ignored in mutation attempt
I0725 10:56:36.489630   80758 log.go:172] nil value at `volumes.projected.sources.secret.name` ignored in mutation attempt
I0725 10:56:36.489669   80758 log.go:172] nil value at `volumes.persistentVolumeClaim.claimName` ignored in mutation attempt
clusterrole.rbac.authorization.k8s.io/kotsadm-role configured
clusterrolebinding.rbac.authorization.k8s.io/kotsadm-rolebinding configured
configmap/sentry created
secret/sentry-postgresql created
secret/sentry-redis created
secret/sentry created
service/sentry-postgresql created
service/sentry-redis-master created
service/sentry-redis-slave created
service/sentry created
deployment.apps/sentry-cron created
deployment.apps/sentry-postgresql created
deployment.apps/sentry-redis-slave created
deployment.apps/sentry-web created
deployment.apps/sentry-worker created
statefulset.apps/sentry-redis-master created
job.batch/sentry-db-init created
job.batch/sentry-user-create created
ingress.extensions/sentry-ingress created
persistentvolumeclaim/sentry-postgresql created
persistentvolumeclaim/sentry created
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
the namespace from the provided object "default" does not match the namespace "sentry-pro". You must pass '--namespace=default' to perform this operation.
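
To see which generated files carry the hardcoded namespace, something like the following could be used. It is a sketch only: list_hardcoded_namespace is a hypothetical helper, and it does a plain text match on "namespace: NAME" lines rather than parsing the YAML.

```shell
# list_hardcoded_namespace DIR NAMESPACE (hypothetical helper)
# Prints the files under DIR that contain a "namespace: NAMESPACE" line.
list_hardcoded_namespace() {
  grep -rlE "^[[:space:]]*namespace:[[:space:]]*${2}[[:space:]]*\$" "$1"
}

# Example:
# list_hardcoded_namespace ~/sentry-enterprise/base/admin-console default
```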

So let's update base/kustomization.yaml with our namespace to see if that fixes it:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: sentry-pro # added
resources:
- admin-console/api-deployment.yaml
- admin-console/api-role.yaml
- admin-console/api-rolebinding.yaml
- admin-console/api-service.yaml
- admin-console/api-serviceaccount.yaml
# - admin-console/kotsadm-bundle-0.yaml
- admin-console/kotsadm-deployment.yaml
- admin-console/kotsadm-role.yaml
 # etc etc etc

At the end of this, the apply works. I could probably have done this namespace tweak in a downstream instead, so you could argue this falls on the end user. Still, I'd say it's better for things to work out of the box, which I think we could do by following our own advice and omitting namespace on all the admin-console resources.

Unfortunately this still leaves kotsadm-api in a crash loop:

$ kubectl logs -f kotsadm-api-7d48cdcb6d-r42xx -n sentry-pro
[2020-07-25T15:59:57.357] [INFO ] [TSED] - Call hook $beforeInit
[2020-07-25T15:59:57.360] [INFO ] [TSED] - Call hook $onInit
[2020-07-25T15:59:57.361] [INFO ] [TSED] - Build providers
[2020-07-25T15:59:57.384] [INFO ] [TSED] - Call hook $afterInit
[2020-07-25T15:59:57.387] [INFO ] [TSED] - Call hook $onMountingMiddlewares
INFO  [ 2020-07-25T15:59:57.407Z] (kotsadm-api/1 on kotsadm-api-7d48cdcb6d-r42xx): ensuring a local cluster exists
ERROR [ 2020-07-25T15:59:57.409Z] (kotsadm-api/1 on kotsadm-api-7d48cdcb6d-r42xx): you must set AUTO_CREATE_CLUSTER_TOKEN

Workaround step 3: adding a cluster token via downstream

Let's make a downstream that patches in a cluster token:

mkdir ~/sentry-enterprise/overlays/us-east-1   
cat <<EOF > ~/sentry-enterprise/overlays/us-east-1/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: sentry-pro
bases:
- ../midstream
patches:
- ./patch-cluster-token.yaml
EOF
export RANDOM_STRING=not-actually-random-fixme
cat <<EOF > ~/sentry-enterprise/overlays/us-east-1/patch-cluster-token.yaml
apiVersion: v1
kind: Secret
metadata:
  name: kotsadm-cluster-token
stringData:
  kotsadm-cluster-token: "${RANDOM_STRING}"
EOF
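
The placeholder value above is obviously not random. One way to generate a real value, to be run in place of the export line before the patch file is written, is sketched below. It assumes any opaque string is acceptable as a cluster token, which this issue does not confirm.

```shell
# Generate 32 hex characters (16 random bytes) from the kernel's random source.
RANDOM_STRING=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
export RANDOM_STRING
```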

Now we should have something like this to apply:

overlays
├── midstream
│   └── kustomization.yaml
└── us-east-1
    ├── kustomization.yaml
    └── patch-cluster-token.yaml

We can verify this works with a kustomize build:

$ kubectl kustomize ~/sentry-enterprise/overlays/us-east-1
# ...
---
apiVersion: v1
kind: Secret
metadata:
  creationTimestamp: null
  labels:
    kots.io/kotsadm: "true"
    velero.io/exclude-from-backup: "true"
  name: kotsadm-cluster-token
  namespace: sentry-pro
stringData:
  kotsadm-cluster-token: not-actually-random-fixme
# ...

Let's delete the previous secret so we can overwrite the value (no replace -k yet):

$ kubectl delete -n sentry-pro secret kotsadm-cluster-token 
secret "kotsadm-cluster-token" deleted
$ kubectl apply -k ~/sentry-enterprise/overlays/us-east-1

Let's quickly verify that we have some data in there now:

$ kubectl get secret -n sentry-pro kotsadm-cluster-token -o yaml

apiVersion: v1
data:
  kotsadm-cluster-token: bm90LWFjdHVhbGx5LXJhbmRvbS1maXhtZQ==
kind: Secret
metadata:
  annotations:
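
Secret values under data are base64-encoded, so the value can be decoded to confirm it is the string from the patch. A small sketch using the value shown above:

```shell
# Value copied from the `data` field in the output above.
encoded='bm90LWFjdHVhbGx5LXJhbmRvbS1maXhtZQ=='
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"
```

This prints not-actually-random-fixme, matching the RANDOM_STRING placeholder set earlier.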

And it looks like now our kotsadm-api pod is running okay. Hopefully this will also fix the crashloop in kotsadm as it waits for the bucket to be created in minio:

[2020-07-25T16:18:40.666] [INFO ] [TSED] - Call hook $onReady
INFO  [ 2020-07-25T16:18:40.666Z] (kotsadm-api/1 on kotsadm-api-7d48cdcb6d-h5rt4): Ensuring bucket exists...
INFO  [ 2020-07-25T16:18:40.696Z] (kotsadm-api/1 on kotsadm-api-7d48cdcb6d-h5rt4): Server started...
[2020-07-25T16:18:40.696] [INFO ] [TSED] - Started in 359 ms

success!

It looks like now kotsadm is up and running, as well as our Sentry app pods:

$ kubectl get pod -n sentry-pro
NAME                                  READY   STATUS      RESTARTS   AGE
kotsadm-6f94cd7b77-8hf82              1/1     Running     8          24m
kotsadm-api-7d48cdcb6d-h5rt4          1/1     Running     5          7m24s
kotsadm-migrations-1595692343         0/1     Completed   0          24m
kotsadm-minio-0                       1/1     Running     0          24m
kotsadm-operator-59d477b795-xzmtz     1/1     Running     0          24m
kotsadm-postgres-0                    1/1     Running     0          24m
sentry-cron-576d54f477-v9g4s          1/1     Running     0          26m
sentry-postgresql-6d9fcf9f65-grh4w    1/1     Running     0          26m
# etc etc etc...

From here we can launch the admin console

kubectl kots admin-console -n sentry-pro
  • Press Ctrl+C to exit
  • Go to http://localhost:8800 to access the Admin Console
# (eventually)
  • Go to http://localhost:9000 to access the application

We still have to go through and upload the license, etc., but once we've gone through the UI setup things seem to be humming along nicely and we can launch the Sentry app on localhost:9000

dexhorthy commented 4 years ago

@marccampbell @markpundsack getting some more questions on this one -- can I get an ack that y'all are aware of it?

genebean commented 4 years ago

Yeah, the admin console not respecting the namespace flag is a real pain and super confusing. Per Slack, it also seems that an environment variable of POD_NAMESPACE is needed to get some of the namespaces correct and that too is confusing.

MikaelSmith commented 4 years ago

https://github.com/replicatedhq/kots/blob/master/pkg/upstream/admin-console.go#L125 seems bad.

MikaelSmith commented 4 years ago

POD_NAMESPACE at the very least governs what the Namespace template function returns.

marccampbell commented 4 years ago

Yeah, unfortunately I think pull needs a lot of attention and refactoring.

It was originally written to support a weird workflow which we no longer actually need. It attempts to "sideload" the application so that the admin console can install.

Since this was written, we've introduced automated installs, and airgap installs to existing clusters.

IMO, kots pull should be a way to get the Admin Console manifests (and application metadata, i.e. branding and the kots.io/v1beta1 Application needed for RBAC permissions) and generate the manifests for it. KOTS supports airgap installs and I think this will turn pull into a composable command instead of an overreaching command.

genebean commented 4 years ago

@marccampbell fwiw, I was directed to kots pull as the means of doing the initial deploy of a kots application via GitOps (Argo CD to be specific). For me, simply being able to convert to GitOps post-install isn’t quite enough.

marccampbell commented 4 years ago

@genebean understood. There are definitely some bugs in this workflow we need to address before this is viable.

genebean commented 4 years ago

I think this is likely the same issue @dexhorthy mentioned, but when I try to apply the manifests generated by this I get the following error:

The ConfigMap "kotsadm-bundle-0" is invalid: metadata.annotations: Too long: must have at most 262144 characters

gabegorelick commented 3 years ago

> fwiw, I was directed to kots pull as the means of doing the initial deploy of a kots application via GitOps (Argo CD to be specific). For me, simply being able to convert to GitOps post-install isn’t quite enough.

I've worked around this limitation by kots install-ing an empty app, configuring GitOps, then switching to the real channel.

gabegorelick commented 3 years ago

The large ConfigMap appears to be fixed. kotsadm-bundle-0.yaml currently clocks in at 123 KB by my count, which is well under the 1 MB Kubernetes limit for ConfigMaps. I'm able to kubectl apply it without issue.