Closed urbaman closed 1 month ago
Hi. Is this a new installation, or an existing installation? Are you using the tempo chart or the tempo-distributed chart? Based on the mkdir failure, I'm guessing this is a new installation.
Hi, this is a new installation, using the chart in the microk8s addon:
helm upgrade --install tempo tempo --repo https://grafana.github.io/helm-charts
Thanks
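For reference, persistence on the single-binary tempo chart is enabled through its values; a minimal sketch (the storageClassName and size shown here are placeholders, not values from this thread):

```yaml
# values.yaml fragment (sketch) for the grafana/helm-charts tempo chart.
# storageClassName is a placeholder; substitute your cluster's class.
persistence:
  enabled: true
  storageClassName: ceph-rbd   # placeholder
  size: 10Gi
```

Applied with something like `helm upgrade --install tempo tempo --repo https://grafana.github.io/helm-charts -f values.yaml -n tempo`.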
I don't have a microk8s environment, but I was able to run the tempo chart without permission issues by default. That is because no volume is mounted at that location, so /var/tempo is created inside the container with the correct ownership.
In order to help, I'd like to know more about the Ceph RBD pool. When mounted, what permissions does it have? Do you have a way to configure the permissions of the volume at mount time? May I see the k8s resources associated with this mount, and what changes have been made to the statefulset to mount it?
Hi,
The Ceph pool is a standard pool backed by Proxmox VE, just using the standard admin permissions (I simply followed the Rook installation for an external cluster without changing any Ceph permissions). I've never had problems on that pool, neither for VM/CT block storage nor for any previous K8s PV/PVC for any project backed by it.
Tried a new project: kubeadm 1.30.3 (not microk8s), Ubuntu 22.04.4, and Tempo with the latest helm chart, just setting persistence = true and storageClass =
Same error:
ubuntu@k8cp1:~$ kubectl logs -n tempo tempo-0
level=info ts=2024-08-07T07:20:01.404122887Z caller=main.go:225 msg="initialising OpenTracing tracer"
level=info ts=2024-08-07T07:20:01.425947758Z caller=main.go:118 msg="Starting Tempo" version="(version=2.5.0, branch=HEAD, revision=46dad3411)"
level=info ts=2024-08-07T07:20:01.428727249Z caller=server.go:240 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095
level=error ts=2024-08-07T07:20:01.429387657Z caller=main.go:121 msg="error running Tempo" err="failed to init module services: error initialising module: store: failed to create store: mkdir /var/tempo/traces: permission denied"
Also tried on the same cluster (kubeadm 1.30.3, Ubuntu 22.04.4) with an NFS-backed PV (changing the storageClass in the values.yaml file): same problem.
All of the pod's settings and startup steps seem to be OK until it gets to creating that directory.
kubectl describe pod -n tempo tempo-0
Name: tempo-0
Namespace: tempo
Priority: 0
Service Account: tempo
Node: k8w2/10.0.50.82
Start Time: Wed, 07 Aug 2024 09:30:14 +0200
Labels: app.kubernetes.io/instance=tempo
app.kubernetes.io/name=tempo
apps.kubernetes.io/pod-index=0
controller-revision-hash=tempo-57b55bf696
statefulset.kubernetes.io/pod-name=tempo-0
Annotations: checksum/config: 683a5438d93c3be8c53ffd1135e29472ea33c39eb4b38ca537d49a8ac109643c
cni.projectcalico.org/containerID: 5ffc8cbe0a0ba3f4874d2d5c8a77aa2b1d73a22a9d95954163610f242231aa6f
cni.projectcalico.org/podIP: 10.10.155.16/32
cni.projectcalico.org/podIPs: 10.10.155.16/32
Status: Running
IP: 10.10.155.16
IPs:
IP: 10.10.155.16
Controlled By: StatefulSet/tempo
Containers:
tempo:
Container ID: containerd://aab7d84d5802af35ef7754ee9a4f52aafc858a38bf26927979bd66ff0b90021c
Image: grafana/tempo:2.5.0
Image ID: docker.io/grafana/tempo@sha256:f0200a9bff6d14eb3a4332194f7b77c37ee1a3535e7e41db024d95aab6f1b4e8
Ports: 3100/TCP, 6831/UDP, 6832/UDP, 14268/TCP, 14250/TCP, 9411/TCP, 55680/TCP, 4317/TCP, 55681/TCP, 4318/TCP, 55678/TCP
Host Ports: 0/TCP, 0/UDP, 0/UDP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
-config.file=/conf/tempo.yaml
-mem-ballast-size-mbs=1024
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 07 Aug 2024 09:33:07 +0200
Finished: Wed, 07 Aug 2024 09:33:07 +0200
Ready: False
Restart Count: 5
Environment: <none>
Mounts:
/conf from tempo-conf (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cd8j7 (ro)
/var/tempo from storage (rw)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: storage-tempo-0
ReadOnly: false
tempo-conf:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: tempo
Optional: false
kube-api-access-cd8j7:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m21s default-scheduler Successfully assigned tempo/tempo-0 to k8w2
Normal SuccessfulAttachVolume 4m20s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-24f4a357-47ea-4a76-b215-33592769fd05"
Normal Pulled 2m49s (x5 over 4m13s) kubelet Container image "grafana/tempo:2.5.0" already present on machine
Normal Created 2m49s (x5 over 4m13s) kubelet Created container tempo
Normal Started 2m49s (x5 over 4m13s) kubelet Started container tempo
Warning BackOff 2m23s (x10 over 4m11s) kubelet Back-off restarting failed container tempo in pod tempo-0_tempo(fcd11466-2d7a-4e14-b0b5-ec1e111353e2)
Are all of my storage classes problematic?
It obviously works seamlessly without persistence enabled.
I'm not inclined to think it's a storageclass issue at this point. I think we need to figure out whether, when you create the PVC, you can set the POSIX permissions for it. The chart supports a tempo.securityContext value that can be used to adjust the statefulset.
I would start with this.
securityContext:
  fsGroup: 10001
If that doesn't work, then we can be more aggressive with something like this.
securityContext:
  fsGroup: 10001
  runAsGroup: 10001
  runAsNonRoot: true
  runAsUser: 10001
This should be enough to modify the POSIX permissions of the /var/tempo mount that you have added.
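In chart terms, this lands under the tempo key of the values file; a minimal sketch, assuming the tempo.securityContext value mentioned above:

```yaml
# values.yaml fragment (sketch): set fsGroup via the chart's
# tempo.securityContext value, then redeploy with helm upgrade.
tempo:
  securityContext:
    fsGroup: 10001
```

Redeploying with `helm upgrade --install tempo tempo --repo https://grafana.github.io/helm-charts -f values.yaml` should recreate the pod with group ownership 10001 on the mounted volume.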
Hi @zalegrala,
The
securityContext:
  fsGroup: 10001
fix already worked.
Could it be helpful to add a note to the persistence section of the chart? Is this the right place to ask/suggest it?
Thank you very much, I think this could be closed?
Thanks for the confirmation.
Describe the bug
The tempo-0 pod does not start, with the following error:
error initialising module: store: failed to create store: mkdir /var/tempo/traces: permission denied
To Reproduce
Steps to reproduce the behavior:
Expected behavior
The pod should start, with /var/tempo/traces and /wal working properly.
Environment: