grafana / tempo

Grafana Tempo is a high volume, minimal dependency distributed tracing backend.
https://grafana.com/oss/tempo/
GNU Affero General Public License v3.0

error writing object to s3 backend #4361

Open benmathews opened 5 days ago

benmathews commented 5 days ago

Describe the bug
On October 24, my tempo-ingester pods started throwing the errors below, and ingester and compactor latency increased significantly (from a couple hundred ms to multiple seconds).

level=error caller=flush.go:233 org_id=single-tenant msg="error performing op in flushQueue" op=1 block=77c398c8-cc47-4764-a995-fe0de5760e7d attempts=1 err="error copying block from local to remote backend: error writing object to s3 backend, object tempo/single-tenant/77c398c8-cc47-4764-a995-fe0de5760e7d/data.parquet: context deadline exceeded"

This does not correspond to any software, config, or network change that I can identify. We are still writing to S3, but slowly. I can't tell whether the deadline-exceeded blocks get retried or dropped.
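One knob that may be worth checking is the ingester flush timeout, which bounds how long a completed block is allowed to take to upload to the backend before its context is cancelled. Below is a minimal values.yaml sketch, not a verified fix: it assumes the tempo-distributed chart merges tempo.structuredConfig into the generated Tempo config (the same mechanism the overrides in the values below already use) and assumes this Tempo version exposes a flush_op_timeout setting under the ingester block; confirm the option name and default in the configuration docs for your version before applying.

tempo:
  structuredConfig:
    ingester:
      # assumed setting: upper bound on a single flush operation to object storage;
      # raising it gives slow S3 uploads more headroom before "context deadline exceeded"
      flush_op_timeout: 10m

Raising the timeout would only mask the symptom; it does not explain why S3 write latency increased in the first place.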

To Reproduce
Steps to reproduce the behavior: normal operation reproduces the behavior.

Environment:

➜ helm history tempo
REVISION    UPDATED                     STATUS      CHART                       APP VERSION DESCRIPTION     
140         Mon Oct 14 16:26:31 2024    superseded  tempo-distributed-1.18.4    2.6.0       Upgrade complete
141         Wed Nov 20 14:32:00 2024    superseded  tempo-distributed-1.22.1    2.6.0       Upgrade complete
142         Wed Nov 20 14:57:16 2024    deployed    tempo-distributed-1.22.1    2.6.0       Upgrade complete

Additional Context
values.yaml overrides

USER-SUPPLIED VALUES:
compactor:
  config:
    compaction:
      max_time_per_tenant: 15m
  replicas: 12
  resources:
    requests:
      cpu: 600m
      memory: 2Gi
distributor:
  replicas: 6
  resources:
    requests:
      cpu: 2
      memory: 1500Mi
ingester:
  persistence:
    enabled: true
    inMemory: false
    size: 30Gi
    storageClass: null
  replicas: 30
  resources:
    requests:
      cpu: 1
      memory: 5Gi
memcached:
  replicas: 3
  resources:
    requests:
      cpu: 100m
      memory: 100Mi
memcachedExporter:
  enabled: true
metaMonitoring:
  serviceMonitor:
    enabled: true
metricsGenerator:
  enabled: false
prometheusRule:
  enabled: true
querier:
  config:
    max_concurrent_queries: 40
    search:
      query_timeout: 1m
    trace_by_id:
      query_timeout: 1m
  replicas: 40
  resources:
    requests:
      cpu: 50m
      memory: 2Gi
query_frontend:
  max_outstanding_per_tenant: 4000
queryFrontend:
  config:
    search:
      concurrent_jobs: 5000
  replicas: 2
  resources:
    requests:
      cpu: 10m
      memory: 150Mi
reportingEnabled: false
server:
  http_server_read_timeout: 4m
  http_server_write_timeout: 4m
storage:
  trace:
    backend: s3
    pool:
      queue_depth: 50000
    s3:
      access_key: *******************
      bucket: *****************
      endpoint: s3.us-west-2.amazonaws.com
      prefix: tempo
      secret_key: *********************
tempo:
  structuredConfig:
    overrides:
      defaults:
        ingestion:
          burst_size_bytes: 800000000
          max_traces_per_user: 3000000
          rate_limit_bytes: 600000000
traces:
  otlp:
    grpc:
      enabled: true
    http:
      enabled: true
joe-elliott commented 4 days ago

If compactors and ingesters were simultaneously having issues speaking with object storage, this suggests a networking or object storage issue.

I can't tell whether the deadline-exceeded blocks get retried or dropped.

They are retried.