jaegertracing / jaeger-operator

Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
https://www.jaegertracing.io/docs/latest/operator/
Apache License 2.0

Spark dependencies job failing (and constantly being re-created) [cassandra] #1163

Open · rrichardson opened this issue 4 years ago

rrichardson commented 4 years ago

I have the Jaeger Operator running (quite well, I might add) on a k8s cluster with a small, 3-node Cassandra cluster.
The initial setup works just fine. It is handling the load of 100% sampling in our cluster.

However, for some reason, the operator seems to think it needs to keep re-running the spark-dependencies job, and it keeps failing. I don't know if it has ever succeeded, but I regularly delete the job just to silence the alarms it generates.
The error log file is attached.
jaeger.log

Here is my Jaeger manifest.


apiVersion: v1
items:
- apiVersion: jaegertracing.io/v1
  kind: Jaeger
  metadata:
    annotations:
    creationTimestamp: "2020-08-08T15:40:54Z"
    generation: 3
    labels:
      jaegertracing.io/operated-by: jaeger.jaeger-operator
    managedFields:
    - manager: kubectl
      operation: Update
      time: "2020-08-08T15:40:54Z"
    - apiVersion: jaegertracing.io/v1
      fieldsType: FieldsV1
      manager: jaeger-operator
      operation: Update
      time: "2020-08-08T15:41:03Z"
    name: core
    namespace: jaeger
    resourceVersion: "24044973"
    selfLink: /apis/jaegertracing.io/v1/namespaces/jaeger/jaegers/core
    uid: c93dbbef-e917-4713-becc-362acb1227b1
  spec:
    agent:
      config: {}
      options: {}
      resources: {}
    allInOne:
      config: {}
      options: {}
      resources: {}
    collector:
      config: {}
      options: {}
      resources: {}
    ingester:
      config: {}
      options: {}
      resources: {}
    ingress:
      enabled: false
      openshift: {}
      options: {}
      resources: {}
      security: none
    query:
      options: {}
      resources: {}
    resources: {}
    sampling:
      options:
        default_strategy:
          param: 100
          type: probabilistic
    storage:
      cassandraCreateSchema:
        datacenter: jaeger-us-east-1
        enabled: true
        mode: test
      dependencies:
        enabled: true
        resources: {}
        schedule: 55 23 * * *
      elasticsearch:
        nodeCount: 3
        redundancyPolicy: SingleRedundancy
        resources:
          limits:
            memory: 16Gi
          requests:
            cpu: "1"
            memory: 16Gi
        storage: {}
      esIndexCleaner:
        numberOfDays: 7
        resources: {}
        schedule: 55 23 * * *
      esRollover:
        resources: {}
        schedule: 0 0 * * *
      options:
        cassandra:
          servers: cassandra
      type: cassandra
    strategy: allinone
    ui:
      options:
        menu:
        - items:
          - label: Documentation
            url: https://www.jaegertracing.io/docs/1.18
          label: About
  status:
    phase: Running
    version: 1.18.1
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
jpkrohling commented 4 years ago

@rubenvp8510 are you able to look into this?

rubenvp8510 commented 4 years ago

I wasn't able to reproduce this issue.

The Spark job error indicates that, for some reason, Cassandra is closing the connection. It could be that the Cassandra cluster is in a bad state (which I doubt, given that Jaeger is working properly), or it may just be a misconfiguration. It would be interesting to know the reason. Could you attach your Cassandra logs? That could help figure out what is happening.
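
For reference, something like this should collect those logs, assuming the three Cassandra nodes run as pods named cassandra-0 through cassandra-2 in a cassandra namespace (hypothetical names, adjust to your deployment):

```sh
# Hypothetical pod and namespace names; adjust to match your Cassandra deployment.
kubectl logs cassandra-0 -n cassandra > cass-0.log
kubectl logs cassandra-1 -n cassandra > cass-1.log
kubectl logs cassandra-2 -n cassandra > cass-2.log
cat cass-*.log > cass.log
```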

jpkrohling commented 4 years ago

> However, for some reason, the operator seems to think it needs to keep re-running the spark-dependencies job

@rubenvp8510: do we have any logic in the code that would trigger this? What happens during the reconciliation of the Job object once the initial setup has been made?

@rrichardson, logs would really be helpful to determine why the jobs are failing.

rubenvp8510 commented 4 years ago

I don't think the operator has any logic for running the spark-dependencies job; it only creates the CronJob, which is executed according to its cron schedule.
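
To see what is actually there, the CronJob and the Jobs it spawns can be inspected directly; a quick sketch (the exact resource names depend on the Jaeger instance name, so check the listing first):

```sh
# List the CronJob the operator created, with its schedule and last run time
kubectl get cronjobs -n jaeger

# List the Jobs spawned from it, then inspect the CronJob found above
kubectl get jobs -n jaeger
kubectl describe cronjob <spark-dependencies-cronjob-name> -n jaeger
```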

jpkrohling commented 4 years ago

Not for running, but for (re)creating.

rubenvp8510 commented 4 years ago

Well, if you delete the CronJob definition, it may be re-created on the next reconciliation. The only way to disable it completely is to update the Jaeger CR, setting storage.dependencies.enabled to false.
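
Against the manifest posted above, that would look roughly like this (only the relevant fields shown):

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: core
  namespace: jaeger
spec:
  storage:
    type: cassandra
    dependencies:
      enabled: false   # stop the operator from (re)creating the spark-dependencies CronJob
```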

jpkrohling commented 4 years ago

Sorry, I should have been clearer earlier. Basically, I consider it an error in the Jaeger Operator if it leaves the cluster in a state where a job keeps failing over and over. It might be how we provision things, or it might be an underlying error condition that we are not accounting for.

rrichardson commented 4 years ago

Here are the concatenated logs of the three Cassandra nodes.
I can't find anything indicating that the job even attempted to connect and failed.

cass.log

jpkrohling commented 4 years ago

@rrichardson, how long does the job run before it fails? Could it be that the connection is being closed due to some timeout? I suppose you don't see the dependency graph in the Jaeger UI, as the job is failing, but could you please double-check that this is indeed the case?
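
Something along these lines should show the run duration and the failure output (the pod name is a placeholder; look it up first):

```sh
# Completions and duration of the dependency jobs
kubectl get jobs -n jaeger -o wide

# Find the failing pod and read its logs
kubectl get pods -n jaeger | grep spark-dependencies
kubectl logs <spark-dependencies-pod-name> -n jaeger
```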

rubenvp8510 commented 4 years ago

From the Cassandra logs I see some GC pauses, so it would not surprise me if this is caused by a timeout. Is there a way to increase the timeout in the job? Maybe we can try that and see whether it fixes the issue.
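
If the root cause really is GC pauses pushing requests past the server-side limits, one thing to try is raising the request timeouts in cassandra.yaml; a sketch with Cassandra 3.x option names and purely illustrative values (note this does not change any client-side timeout inside the Spark job itself):

```yaml
# cassandra.yaml (Cassandra 3.x option names; values are illustrative)
read_request_timeout_in_ms: 20000    # default 5000
range_request_timeout_in_ms: 30000   # default 10000
write_request_timeout_in_ms: 10000   # default 2000
```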

kholisrag commented 2 months ago

I encountered this when using Elasticsearch: the operator created the jaeger-spark-dependencies job. Is there a way to disable it completely?
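
As noted earlier in the thread, setting storage.dependencies.enabled to false in the Jaeger CR should stop the operator from managing the CronJob; a hedged sketch using the instance and namespace names from this issue (adjust to yours, and verify the CronJob name before deleting it):

```sh
# Tell the operator not to (re)create the spark-dependencies CronJob
kubectl patch jaeger core -n jaeger --type=merge \
  -p '{"spec":{"storage":{"dependencies":{"enabled":false}}}}'

# If the existing CronJob is not cleaned up on the next reconciliation, remove it manually
kubectl get cronjobs -n jaeger
kubectl delete cronjob <spark-dependencies-cronjob-name> -n jaeger
```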