headlamp-k8s / headlamp

A Kubernetes web UI that is fully-featured, user-friendly and extensible
https://headlamp.dev
Apache License 2.0

Headlamp on OKD - home and headlamp dir not writable #2342

Open kingdonb opened 1 month ago

kingdonb commented 1 month ago

Hello, I am testing Headlamp plugins on an OKD4 cluster and I'm having some difficulty.

The first thing I had to solve: since this is a multi-tenant environment and I am but a tenant (project), I had to update my values.yaml so that the installation would not attempt to provision a ClusterRoleBinding.

Then I noticed that the pod was failing to schedule because of the fixed uid/gid 100/101.

I updated the securityContext so that the fixed uid/gid would not be used, but now on startup, I see this error in the headlamp container:

% k logs -f headlamp-7989c66f4b-4lc8v
Defaulted container "headlamp" out of: headlamp, headlamp-plugins (init)
{"level":"error","source":"/headlamp/backend/pkg/config/config.go","line":199,"error":"mkdir /.config: permission denied","time":"2024-09-18T15:43:09Z","message":"creating plugins directory"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":311,"time":"2024-09-18T15:43:09Z","message":"Creating Headlamp handler"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":312,"time":"2024-09-18T15:43:09Z","message":"Kubeconfig path: "}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":313,"time":"2024-09-18T15:43:09Z","message":"Static plugin dir: /headlamp/static-plugins"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":314,"time":"2024-09-18T15:43:09Z","message":"Plugins dir: /build/plugins"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":315,"time":"2024-09-18T15:43:09Z","message":"Dynamic clusters support: false"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":316,"time":"2024-09-18T15:43:09Z","message":"Helm support: false"}
{"level":"info","source":"/headlamp/backend/cmd/headlamp.go","line":317,"time":"2024-09-18T15:43:09Z","message":"Proxy URLs: []"}
{"level":"info","pluginPath":"/build/plugins/lost+found/main.js","source":"/headlamp/backend/pkg/plugins/plugins.go","line":131,"error":"stat /build/plugins/lost+found/main.js: no such file or directory","time":"2024-09-18T15:43:09Z","message":"Not including plugin path, main.js not found"}
{"level":"info","context":"main","clusterURL":"https://172.30.0.1:443","source":"/headlamp/backend/pkg/kubeconfig/kubeconfig.go","line":201,"time":"2024-09-18T15:43:09Z","message":"Proxy setup"}
{"level":"error","source":"/headlamp/backend/cmd/headlamp.go","line":147,"error":"open /headlamp/frontend/index.baseUrl.html: permission denied","time":"2024-09-18T15:43:09Z","message":"writing file"}

Is there anyone who has attempted this before who can maybe give a clue what configuration change is needed?

The config image I am using is: ghcr.io/headlamp-k8s/headlamp:v0.25.1 with Flux configs from here:

There is a plugins image I have built at docker.io/kingdonb/plugins:canary which is included in the helmrelease as well.

Running the container locally, it looks like it's missing the HOME=/home/headlamp/ that would normally be set; I guess it's not set on OpenShift. But even if I set that variable, I still get a permission denied error.
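The `mkdir /.config` error in the log would be consistent with the config directory being derived from HOME when HOME resolves to `/`. A minimal sketch of that lookup, assuming the backend follows the common XDG convention (XDG_CONFIG_HOME if set, otherwise HOME/.config); this thread does not confirm that Headlamp resolves the path this way, and `config_dir` is a hypothetical helper, not a Headlamp function:

```shell
#!/bin/sh
# Hypothetical helper: print the config directory an XDG-style lookup would use.
# $1: value of XDG_CONFIG_HOME (may be empty), $2: value of HOME
config_dir() {
    if [ -n "$1" ]; then
        # An explicit XDG_CONFIG_HOME wins over HOME.
        printf '%s\n' "$1"
    else
        # Strip one trailing slash so HOME=/ yields /.config, not //.config.
        printf '%s/.config\n' "${2%/}"
    fi
}

config_dir "" "/"
config_dir "" "/home/headlamp/"
config_dir "/tmp/headlamp-config" "/"
```

If that assumption holds, pointing HOME (or XDG_CONFIG_HOME) at a directory the arbitrary UID can write to would move the plugins directory somewhere usable; it would not help with the separate write to /headlamp/frontend.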

This might not be the most useful deployment of Headlamp but I'd like to understand if a fix is possible, if anyone understands the problem better than I do!

From a debug container running locally, I can see the permissions are restrictive for the headlamp user:

/ $ ls -ld /home/headlamp
drwxr-sr-x    1 headlamp headlamp        24 Sep 18 16:34 /home/headlamp

I am sure I could build a custom image with a specific uid/gid in the allowed range and adopt that uid/gid, or set less restrictive permissions, but I have no idea whether either of those is the right solution; I have very little experience with OpenShift. Is setting a different fixed uid the right idea?

kingdonb commented 1 month ago

It looks like (from a quick reading of some relevant OKD4 docs) the solution is actually to make the directory group-owned by root and group-writable, since the randomized user is always in the root group. I made this little Dockerfile to extend the existing Headlamp image:

FROM ghcr.io/headlamp-k8s/headlamp:v0.25.1

USER root
RUN chown -R headlamp:root /home/headlamp
RUN chown -R headlamp:root /headlamp
RUN chmod -R g+w /home/headlamp /headlamp
USER headlamp

Together with manually creating a role binding that has the appropriate permissions:

% k create rolebinding headlamp --clusterrole=admin --serviceaccount=test-yebyen:headlamp
rolebinding.rbac.authorization.k8s.io/headlamp created

and it seems to have done the trick 👍 it's starting now, bypasses all of the errors that caused CrashLoopBackOff before.
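The ownership change made in the Dockerfile above can be tried outside a container. A minimal sketch, run against a throwaway directory rather than the real /home/headlamp; `make_group_writable` and the demo paths are illustrative names, not part of Headlamp. It uses `chgrp 0` plus `chmod g=u`, the form commonly shown in the OpenShift image guidelines, in place of the `chown headlamp:root` plus `chmod g+w` used in the Dockerfile:

```shell
#!/bin/sh
# Hypothetical helper: make a directory tree usable by any UID that is a
# member of the root group (GID 0), as OpenShift's arbitrary UIDs are.
make_group_writable() {
    dir=$1
    # Changing group ownership to GID 0 needs root; skip it otherwise so the
    # sketch can also be run as an ordinary user.
    if [ "$(id -u)" -eq 0 ]; then
        chgrp -R 0 "$dir"
    fi
    # Give the group the same permissions the owner already has.
    chmod -R g=u "$dir"
}

# Demo on a throwaway directory that starts out without group write access.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/home"
touch "$demo_dir/home/settings.json"
chmod 755 "$demo_dir/home"
chmod 644 "$demo_dir/home/settings.json"

make_group_writable "$demo_dir/home"
ls -ld "$demo_dir/home" "$demo_dir/home/settings.json"
```

Compared with `g+w`, `g=u` also copies read and execute bits from the owner, so directories stay traversable for the group without listing each bit by hand.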

It seems that it isn't usable, though, because the namespaced rolebinding to the clusterrole admin does not have the permissions required. I don't see any values that would enable a single-namespace installation, so I might be out of luck in terms of actually using this for any meaningful purpose. I'd like to be able to tell Headlamp to only attempt to list pods and resources that are within the namespaced (non-cluster) scope.

Is this a use case that headlamp might support in the future?

kingdonb commented 1 month ago

Cross-refs:

joaquimrocha commented 1 month ago

> It seems that it isn't usable, though, because the namespaced rolebinding to the clusterrole admin does not have the permissions required. I don't see any values that would enable a single-namespace installation, so I might be out of luck in terms of actually using this for any meaningful purpose. I'd like to be able to tell Headlamp to only attempt to list pods and resources that are within the namespaced (non-cluster) scope.
>
> Is this a use case that headlamp might support in the future?

Hi @kingdonb. Sorry for the delay in replying. Assuming I understood your comment correctly, I think you want to set up a specific namespace and have Headlamp use only that namespace when, e.g., listing pods, instead of trying to list pods from all namespaces. This is supported by what we call "allowed namespaces", and the user has to set that up for the cluster, under the cluster settings. Maybe we should make this feature configurable in a different way somehow. Of course, a plugin can set that up. See storeClusterSettings, though I don't think we have this function exposed to plugins (you can still set the local storage directly ATM).

Do you have a suggestion on setting up allowed namespaces in a more streamlined way?

kingdonb commented 1 month ago

Ah, I generally look for stuff like this in helm values.yaml - I looked for allowed namespaces and cluster settings in the docs and I didn't find any references to them. Now that you pointed this out, I found the gear icon in the upper right-hand corner:

[Image: Screenshot 2024-09-25 at 9 35 41 AM]

This takes you to Cluster Settings, where you can set the options you mentioned, and then things are springing to life in the other tabs. 🎉

I see from there you can follow the breadcrumb to General Settings (or you can access it from the Profile icon) and then there is a place to go for Plugin Settings:

[Image: Screenshot 2024-09-25 at 9 40 38 AM]

... where the Flux plugin that I'm working on testing has room for its own settings page:

[Image: Screenshot 2024-09-25 at 9 41 25 AM]

So this plugin, which is still in development, does not respond to the allowed namespaces, but that's up to the plugin (in headlamp-k8s/plugins#75) to resolve. It probably doesn't need any setting? Maybe one checkbox for "assume Flux is installed". If there were a detection step that read the Flux resources (controllers, CRDs) for the label that tells which version is installed, then, in case the plugin supports multiple versions of Flux with different code paths, you could select a version...

(The challenge is that a namespaced admin user cannot list namespaces or CRDs at all, so the Flux plugin can't detect the CRDs and cannot tell whether Flux is installed. I guess there is some version detection logic, but there might not be; the last time I was building a Flux UI plugin, it only really promised to support the latest version of Flux at any given time!)

[Image: Screenshot 2024-09-25 at 9 43 18 AM]

But none of that appears to be an issue with Headlamp itself! Looks like it works great; it's only a documentation issue.

I can see all of the things I have access to see, anything I would have expected to work, works, inside of my service account's one authorized namespace:

[Image: Screenshot 2024-09-25 at 9 45 56 AM]

Thanks! My one suggestion here, then, is to expose the setting in values.yaml, if possible. Looking for settings in the docs, I now see that the Settings page does get a mention, for Plugin Settings, in the developer docs, and another mention in the API docs, but the gear icon isn't documented anywhere; "allowed namespaces" and "default namespace" aren't searchable.

So that was easy to overlook, even with the best of intentions. But now that I see it I can't un-see it 🙈

I'm still familiarizing myself with the project; maybe I can help with a PR to the docs (or by reviewing to make sure it's clear)

joaquimrocha commented 1 month ago

@ashu8912, the allowed-namespaces setting should work transparently for the useGet/useList functions. So we should look into why the Flux plugin is not yielding results when there are no permissions to list namespaces.

> Thanks! My one suggestion here, then, is to expose the setting in values.yaml, if possible. Looking for settings in the docs, I now see that the Settings page does get a mention, for Plugin Settings, in the developer docs, and another mention in the API docs, but the gear icon isn't documented anywhere; "allowed namespaces" and "default namespace" aren't searchable.

Definitely. We need to improve this and other settings. I think the allowed-namespaces setting does deserve to be added and highlighted in the settings.

> So that was easy to overlook, even with the best of intentions. But now that I see it I can't un-see it 🙈 I'm still familiarizing myself with the project; maybe I can help with a PR to the docs (or by reviewing to make sure it's clear)

That's totally fine. I am glad you didn't just give up on Headlamp assuming we didn't support allowed-namespaces. If you find more situations like that, keep filing those issues. And it'd be great to get some help with the docs as you suggest.

kingdonb commented 1 month ago

Going back to the top of the issue, maybe there are some enhancements possible that would make all of this work without building a custom Headlamp image. I have no idea how to implement either one, but a values.yaml setting to configure the deployment for OpenShift, and another one to configure it for single-namespace mode, would have helped make this much more streamlined.

I don't know if there's a mechanism to inject global settings via command-line parameters that would enable the helm chart to do something about it. Ideally we don't have one Headlamp installation per namespace (but again, I don't know how to solve that). I'm not sure if the settings are truly global for the cluster, or if the installation maintains one set of global settings in its PV, or if there are actually settings per service account / per user.

In most of the tenant environments I've worked on, dev users aren't really restricted to a single tenant, they get to manage all of their tenants, and they wouldn't use a service account from within the tenant as I have done, they would more likely use a service account from the dev team, or use an OIDC identity with role bindings to access all of their tenants.

The structure of allowed namespaces should support that well; I'll keep trying 👍. Now that I know tenants are supported, I can start building my next OIDC environment for multi-tenancy on a vcluster again, and try Headlamp in there, in the less difficult vcluster environment (which hopefully won't have all of these particular OKD issues and might be a better fit for this!)

joaquimrocha commented 1 month ago

cc/ @illume just food for thought.