Open jacobsalway opened 6 months ago
This issue has been automatically marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days. Thank you for your contributions.
I think this issue can still occur, so commenting to mark it as still active.
Description
We use the operator to manage the lifecycle of both batch and Spark streaming applications. Streaming apps in particular are long-lived and, if using dynamic allocation, may scale up and down over time, resulting in the creation of new executor IDs (see the Spark link below for how executor IDs are incremented).
The operator tracks the state of each individual executor pod in the
.Status.ExecutorState
field, but these entries are never removed. For long-lived streaming applications this map eventually grows so large that the CR can no longer be written back to etcd because it exceeds the request size limit.
https://github.com/kubeflow/spark-operator/blob/master/pkg/controller/sparkapplication/controller.go#L367-L436
https://github.com/apache/spark/blob/d82458f15539eef8df320345a7c2382ca4d5be8a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala#L460
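To make the accumulation concrete, here is a minimal, self-contained Go sketch of the pattern, assuming a map keyed by executor pod name; the type and function names are illustrative, not the operator's actual code:

```go
// Illustrative sketch (not the operator's actual types) of how per-executor
// state accumulates in the SparkApplication status. Entries are added when
// an executor pod is observed but, per this issue, never removed.
package main

import "fmt"

type ExecutorState string

type SparkApplicationStatus struct {
	// Keyed by executor pod name.
	ExecutorState map[string]ExecutorState
}

func recordExecutor(status *SparkApplicationStatus, podName string, state ExecutorState) {
	if status.ExecutorState == nil {
		status.ExecutorState = map[string]ExecutorState{}
	}
	status.ExecutorState[podName] = state
}

func main() {
	var status SparkApplicationStatus
	// With dynamic allocation, Spark keeps incrementing executor IDs, so a
	// long-lived streaming app produces an ever-growing set of keys.
	for id := 1; id <= 100000; id++ {
		recordExecutor(&status, fmt.Sprintf("my-app-exec-%d", id), "COMPLETED")
	}
	fmt.Println("tracked executors:", len(status.ExecutorState))
}
```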
Reproduction Code [Required]
Create enough unique executors over the lifetime of a Spark application that the CR eventually becomes larger than the maximum etcd request size (the sketch below gives a rough estimate of how many executor entries that takes).
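As a rough, hedged estimate, assuming short pod names and state strings (both assumptions, not measured values), the serialized map alone crosses ~1.5 MB after a few tens of thousands of entries:

```go
// Back-of-envelope estimate of how many executor entries it takes for the
// serialized executor-state map alone to exceed the ~1.5 MB etcd request
// limit mentioned in this issue. Pod names and states are assumed values.
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	entries := map[string]string{}
	for id := 1; ; id++ {
		entries[fmt.Sprintf("my-streaming-app-exec-%d", id)] = "COMPLETED"
		if id%5000 == 0 {
			b, _ := json.Marshal(entries)
			fmt.Printf("%d entries -> %d bytes\n", id, len(b))
			if len(b) > 1_500_000 {
				break
			}
		}
	}
}
```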
Expected behavior
The operator should not fail to write the CR back to etcd.
Actual behavior
If enough executor IDs accumulate within a single application, the operator may eventually fail to write the CR back to etcd.
Terminal Output Screenshot(s)
I can't find any internal screenshots showing how large the executor state map was, but I did find the etcd write failure log.
Environment & Versions
Additional context
Internally we found that no one was using this field, so we effectively disabled the tracking. We run on EKS, so I cannot change the maximum etcd request size, but the EKS docs say it is 1.5 megabytes.
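For reference, a minimal sketch of the kind of guard we applied, assuming a controller-level switch; the flag and function names here are hypothetical, not actual operator configuration:

```go
// Hypothetical guard around the per-executor state update: when tracking is
// disabled, the controller skips populating .Status.ExecutorState entirely,
// so the map never grows for long-lived streaming applications.
package main

import "fmt"

type SparkApplicationStatus struct {
	ExecutorState map[string]string
}

// disableExecutorStateTracking is a hypothetical switch; in practice this
// could be a controller flag or an annotation on the SparkApplication.
var disableExecutorStateTracking = true

func recordExecutor(status *SparkApplicationStatus, podName, state string) {
	if disableExecutorStateTracking {
		return // executor state is never stored
	}
	if status.ExecutorState == nil {
		status.ExecutorState = map[string]string{}
	}
	status.ExecutorState[podName] = state
}

func main() {
	var status SparkApplicationStatus
	recordExecutor(&status, "my-app-exec-1", "RUNNING")
	fmt.Println("tracked executors:", len(status.ExecutorState)) // 0 when disabled
}
```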