grafana / tempo-operator

Grafana Tempo Kubernetes operator
https://grafana.com/docs/tempo/latest/setup/operator/
GNU Affero General Public License v3.0

Update tempo to 2.5.0 #958

Closed pavolloffay closed 2 months ago

pavolloffay commented 2 months ago

The upstream Tempo 2.5.0 docker image has a breaking change: https://github.com/grafana/tempo/releases/tag/v2.5.0

The container user changed from root to tempo (10001:10001), which reads /var/tempo.

Notable changes:

Upgrade test

kubectl apply -f https://github.com/grafana/tempo-operator/releases/download/v0.10.0/tempo-operator.yaml

kubectl apply -f - <<EOF
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest4
spec:
  managementState: Managed
  storage:
    secret:
      name: minio-test
      type: s3
  storageSize: 1Gi
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
    ingester:
      replicas: 2
EOF

kubectl apply -f - <<EOF
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoMonolithic
metadata:
  name: sample
spec:
  storage:
    traces:
      backend: pv
      size: 2Gi
EOF

REPORT data

IMG_PREFIX=docker.io/pavolloffay OPERATOR_VERSION=0.11.0 TEMPO_VERSION=2.5.0 make docker-build docker-push deploy
codecov-commenter commented 2 months ago

Codecov Report

Attention: Patch coverage is 14.91228% with 97 lines in your changes missing coverage. Please review.

Project coverage is 73.35%. Comparing base (fc459a8) to head (4554d6a). Report is 3 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| internal/upgrade/v0_11_0.go | 14.91% | 93 Missing and 4 partials :warning: |

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main     #958      +/-   ##
==========================================
- Coverage   74.37%   73.35%   -1.02%
==========================================
  Files         104      105       +1
  Lines        6372     6486     +114
==========================================
+ Hits         4739     4758      +19
- Misses       1346     1438      +92
- Partials      287      290       +3
```

| [Flag](https://app.codecov.io/gh/grafana/tempo-operator/pull/958/flags) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/grafana/tempo-operator/pull/958/flags) | `73.35% <14.91%> (-1.02%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. See the [carryforward flags documentation](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.

andreasgerstmayr commented 2 months ago

The ownership of /var/tempo changed, see the breaking changes of the 2.5.0 release: https://github.com/grafana/tempo/releases/tag/v2.5.0

Ownership of /var/tempo is changing. Historically this has been owned by root:root, and with this change it will now be owned by tempo:tempo with the UID/GID of 10001. The ingester and metrics-generator statefulsets may need to be chown'd in order to start properly. A jsonnet example of an init container is included with the PR. This impacts all users of the grafana/tempo Docker image.

I think we'll need an init container to adjust existing installations.
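
A minimal sketch of what such an init container could look like on the ingester StatefulSet (the busybox image, the volume name data, and the mount path are illustrative assumptions, not the operator's generated spec):

  initContainers:
    - name: fix-var-tempo-ownership
      # hypothetical utility image; the chown needs to run as root
      image: docker.io/library/busybox:1.36
      securityContext:
        runAsUser: 0
      # hand the data directory over to the new tempo user (10001:10001)
      command: ["chown", "-R", "10001:10001", "/var/tempo"]
      volumeMounts:
        - name: data            # assumed PVC template name
          mountPath: /var/tempo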

rubenvp8510 commented 2 months ago

The ownership of /var/tempo changed, see the breaking changes of the 2.5.0 release: https://github.com/grafana/tempo/releases/tag/v2.5.0

Ownership of /var/tempo is changing. Historically this has been owned by root:root, and with this change it will now be owned by tempo:tempo with the UID/GID of 10001. The ingester and metrics-generator statefulsets may need to be chown'd in order to start properly. A jsonnet example of an init container is included with the PR. This impacts all users of the grafana/tempo Docker image.

I think we'll need an init container to adjust existing installations.

Is this not something solved by the securityContext?
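
For context, a pod-level securityContext with an fsGroup along these lines (a sketch, not what the operator currently sets; it only helps if the storage driver honors fsGroup) would let the kubelet re-own the volume on mount instead of an init container:

  securityContext:
    runAsUser: 10001
    runAsGroup: 10001
    fsGroup: 10001
    # OnRootMismatch skips the recursive chown when ownership already matches
    fsGroupChangePolicy: OnRootMismatch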

pavolloffay commented 2 months ago

Good that we have the upgrade tests.

An instance created with 2.4.2 and upgraded to 2.5.0 throws:

level=error ts=2024-06-28T14:45:59.767604772Z caller=rate_limited_logger.go:27 msg="pusher failed to consume trace data" err="rpc error: code = Unknown desc = mkdir /var/tempo/wal/d3ff7b03-3e2e-4380-aa94-d8930e2f97c2+single-tenant+vParquet3: permission denied"
level=error ts=2024-06-28T14:46:00.761519343Z caller=rate_limited_logger.go:27 msg="pusher failed to consume trace data" err="rpc error: code = Unknown desc = mkdir /var/tempo/wal/82e00424-ff58-4a66-b83b-ecd8f4cdb8c1+single-tenant+vParquet3: permission denied"
level=error ts=2024-06-28T14:46:01.763387853Z caller=rate_limited_logger.go:27 msg="pusher failed to consume trace data" err="rpc error: code = Unknown desc = mkdir /var/tempo/wal/2bf4ee9e-9701-43b7-858e-f469229ec921+single-tenant+vParquet3: permission denied"
level=error ts=2024-06-28T14:46:02.76407992Z caller=rate_limited_logger.go:27 msg="pusher failed to consume trace data" err="rpc error: code = Unknown desc = mkdir /var/tempo/wal/a9b65782-2bc0-482c-a185-84eb36cd7301+single-tenant+vParquet3: permission denied"
pavolloffay commented 2 months ago

PR in the tempo helm chart https://github.com/grafana/helm-charts/pull/3161#issuecomment-2197202507

pavolloffay commented 2 months ago

The issue does not seem to occur on OCP. OCP seems to set this securityContext by default:

  securityContext:
    fsGroup: 1001020000
    seLinuxOptions:
      level: s0:c32,c14
    seccompProfile:
      type: RuntimeDefault

Tested with

kubectl apply -f - <<EOF
apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: simplest
spec:
  managementState: Managed
  images:
    tempo: docker.io/grafana/tempo:2.4.1
  storage:
    secret:
      name: minio-test
      type: s3
  storageSize: 1Gi
  resources:
    total:
      limits:
        memory: 2Gi
        cpu: 2000m
  template:
    queryFrontend:
      jaegerQuery:
        enabled: true
EOF

Reported some data and then changed the image to docker.io/grafana/tempo:2.5.0. The /var/tempo directory in the pod afterwards:

~ $ ls -al /var/tempo/
total 28
drwxrwsr-x    4 root     10010200      4096 Jul  1 16:57 .
drwxr-xr-x    1 root     root            19 May 31 15:10 ..
drwxrws---    2 root     10010200     16384 Jul  1 16:54 lost+found
-rw-r--r--    1 10010200 10010200      1388 Jul  1 16:57 tokens.json
drwxrwsr-x    4 10010200 10010200      4096 Jul  1 17:07 wal
pavolloffay commented 2 months ago

I think adding an initContainer would solve the timing issues. Init containers always start and finish before the main container, and afaics (correct me if I'm wrong) while the init container of the new statefulset runs, the old statefulset is already terminated (otherwise two pods would access the same PV, which I think is prevented).

There is an issue with the init-container approach: the init container will run only for ingesters (and PVs) that are scheduled to run. The ingesters might be scaled down (e.g. from 5 instances to 2), so there will be PVs that won't be upgraded.

andreasgerstmayr commented 2 months ago

I think adding an initContainer would solve the timing issues. Init containers always start and finish before the main container, and afaics (correct me if I'm wrong) while the init container of the new statefulset runs, the old statefulset is already terminated (otherwise two pods would access the same PV, which I think is prevented).

There is an issue with the init-container approach: the init container will run only for ingesters (and PVs) that are scheduled to run. The ingesters might be scaled down (e.g. from 5 instances to 2), so there will be PVs that won't be upgraded.

We'll have to keep the initContainer indefinitely, so when the ingesters are scaled down it doesn't matter if the PV is in an outdated state, but when they get scaled up again they will be upgraded.

pavolloffay commented 2 months ago

We'll have to keep the initContainer indefinitely

This is certainly doable, but it would clutter packages unrelated to upgrades. I would prefer to keep the upgrade code in an isolated, standalone package.

I have changed the upgrade procedure to scale the ingesters down to zero before upgrading and then wait until a single job that mounts all volumes finishes.
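
A rough sketch of what such a one-off job could look like (the job and PVC names are assumptions following the usual StatefulSet claim-template pattern; the actual implementation lives in internal/upgrade/v0_11_0.go):

apiVersion: batch/v1
kind: Job
metadata:
  name: tempo-simplest-upgrade-ownership     # hypothetical name
spec:
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: chown
          # hypothetical utility image; the chown needs root
          image: docker.io/library/busybox:1.36
          securityContext:
            runAsUser: 0
          # fix ownership on every mounted ingester volume in one run
          command: ["sh", "-c", "chown -R 10001:10001 /var/tempo-*"]
          volumeMounts:
            - name: data-0
              mountPath: /var/tempo-0
            - name: data-1
              mountPath: /var/tempo-1
      volumes:
        - name: data-0
          persistentVolumeClaim:
            claimName: data-tempo-simplest-ingester-0   # assumed PVC name
        - name: data-1
          persistentVolumeClaim:
            claimName: data-tempo-simplest-ingester-1   # assumed PVC name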