Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
Yes, we use the shuffle pod's IP to identify it, and we set spark.shuffle.service.host to that IP. So it seems shuffle pods need a sticky network identity.
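To illustrate why an IP captured once goes stale, here is a minimal simulation. It is only a sketch and does not touch Spark or Kubernetes: the `Executor` class, the `resolve` and `can_fetch` helpers, the name `shuffle-node-a`, and the IP addresses are all hypothetical, and the `dns` dict stands in for whatever stable naming mechanism the cluster might provide.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Executor:
    # Host value captured once at executor start, in the way that
    # spark.shuffle.service.host is set to the shuffle pod IP (hypothetical model).
    shuffle_service_host: str


def resolve(host: str, dns: Dict[str, str]) -> str:
    # Resolve a stable name to its current IP; a raw IP is returned unchanged.
    return dns.get(host, host)


def can_fetch(executor: Executor, live_ips: Set[str], dns: Dict[str, str]) -> bool:
    # A block fetch succeeds only if the resolved address belongs to a live shuffle pod.
    return resolve(executor.shuffle_service_host, dns) in live_ips


# Initial state: one shuffle pod, reachable by IP and by a stable name (both made up).
dns = {"shuffle-node-a": "10.0.0.5"}
live_ips = {"10.0.0.5"}

by_ip = Executor(shuffle_service_host="10.0.0.5")
by_name = Executor(shuffle_service_host="shuffle-node-a")

before = (can_fetch(by_ip, live_ips, dns), can_fetch(by_name, live_ips, dns))

# The shuffle pod is recreated and comes back with a new IP.
live_ips = {"10.0.0.9"}
dns["shuffle-node-a"] = "10.0.0.9"

after = (can_fetch(by_ip, live_ips, dns), can_fetch(by_name, live_ips, dns))

print("before restart:", before)
print("after restart:", after)
```

In this model the executor that stored the raw IP keeps targeting the old address after the restart, while the one that stored a stable name follows the pod to its new IP. That is the sense in which a sticky network identity would help.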
How to reproduce
Check the driver and executor logs: they show that the pod always tries to fetch blocks using the old shuffle pod IP.