Performance analysis on job startup time

ash211 commented 7 years ago

Startup time is very important for @justinuang on an internal application. Our target time: 10sec between spark-submit and job running in k8s.

Current analysis:

scenario:
- SparkPi startup
- examples jar added via addJar process (local to submitter)
- docker images already cached on all k8s nodes
cluster:
- Amazon AWS
- 5 m4.xlarge instances
- all in us-east-1b
- layout:
  - submitter on one instance
  - apiserver on one instance
  - the remaining three instances with kubelets only
23 seconds total between spark-submit start and executors fully running with Spark job starting:
- 2 sec from spark-submit start to driver pod requested from k8s (JVM startup time)
- 2 sec from driver pod requested from k8s to driver pod running (k8s pod startup time)
- 2 sec from driver pod running to first log line out of rest server JVM (JVM startup time)
- **8 sec from rest server JVM running to submitter submitting app jar (polling/watch time)**
- 2 sec from app jar submitting to app jar running in driver pod (subprocess JVM startup time)
- <1 sec from app jar started (first log line out) to requesting executors from k8s (Spark code)
- 7 sec from executor pods requested to all executor JVMs ready and pinged back to driver (k8s pod startup time + JVM startup time)

To get better numbers I'd want to turn on millisecond level logging (default is just at the second level).
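
To illustrate why second-level timestamps are too coarse for phases this short, here is a minimal Python sketch. The timestamps are made up for illustration and are not taken from a real run. It compares the gap between two events at millisecond resolution with the same gap after both timestamps are truncated to whole seconds:

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # hour:minute:second.fraction

def gap_seconds(start: str, end: str, truncate: bool = False) -> float:
    """Return the elapsed seconds between two timestamps.

    With truncate=True, both timestamps are first cut to whole seconds,
    mimicking a log format that prints no fractional part.
    """
    t0 = datetime.strptime(start, FMT)
    t1 = datetime.strptime(end, FMT)
    if truncate:
        t0 = t0.replace(microsecond=0)
        t1 = t1.replace(microsecond=0)
    return (t1 - t0).total_seconds()

# Hypothetical timestamps, chosen only to show the rounding effect.
start, end = "12:00:00.950", "12:00:02.050"
precise = gap_seconds(start, end)                # millisecond resolution
coarse = gap_seconds(start, end, truncate=True)  # second resolution
print(precise, coarse)
```

Each truncated timestamp can be off by almost a full second, so the error is a large fraction of a ~2 sec phase.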

We think the bolded line (time between rest server JVM ready and submitter submitting app jar) is the place for most improvement. Fully eliminating that would get us to 15sec job startup time.
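
The arithmetic behind these figures can be checked with a short Python sketch. The phase labels are abbreviated from the breakdown above, and counting the "<1 sec" phase as 0 is an assumption made so that the phases sum to the reported total:

```python
# Phase durations in seconds, copied from the breakdown above.
# Assumption: the "<1 sec" phase is counted as 0, which makes the
# phases sum to the reported 23-second total.
phases = {
    "spark-submit start -> driver pod requested": 2,
    "driver pod requested -> driver pod running": 2,
    "driver pod running -> rest server first log line": 2,
    "rest server running -> app jar submitted": 8,
    "app jar submitted -> app jar running": 2,
    "app jar started -> executors requested": 0,
    "executors requested -> executors ready": 7,
}

total = sum(phases.values())
without_polling = total - phases["rest server running -> app jar submitted"]
target = 10  # target startup time in seconds

print(total)                     # 23
print(without_polling)           # 15
print(without_polling - target)  # 5 seconds still above the target
```

Even with the polling/watch wait fully removed, roughly 5 seconds would remain to be found elsewhere to reach the 10sec target.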

After that, the next improvement to pursue might be merging rest server JVM and driver JVM on the driver pod into the same JVM (this would remove one of the ~2sec JVM startups).

lins05 commented 7 years ago

> merging rest server JVM and driver JVM on the driver pod into the same JVM

Actually that's what YARN cluster mode does, so +1 for it.

foxish commented 7 years ago

That looks awesome. Thanks @ash211 for running those tests. It also verifies that it runs on AWS without issues here, which is great.

mccheah commented 7 years ago

Does the 8 seconds include both bypassing all of the futures and also getting past the initial ping of the remote server? It would be good to distinguish between the time spent on:

- Starting the Kubernetes components at all (watches + futures), and
- Being able to use said Kubernetes components in practice (ping + required retries)
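
One way to separate these two measurements is to time each phase on its own. The following is a minimal Python sketch, not the submission client's actual code: `wait_for_resources` and `ping_until_ready` are hypothetical stand-ins for the two phases, and the sleeps are placeholders.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.monotonic()
    try:
        yield
    finally:
        timings[label] = time.monotonic() - start

# Hypothetical stand-ins for the two phases; not real client functions.
def wait_for_resources():
    time.sleep(0.02)  # placeholder for watches firing and futures completing

def ping_until_ready():
    time.sleep(0.01)  # placeholder for the ping plus any retries

with timed("components started (watches + futures)"):
    wait_for_resources()
with timed("components usable (ping + retries)"):
    ping_until_ready()

for label, seconds in timings.items():
    print(f"{label}: {seconds * 1000:.0f} ms")
```

A monotonic clock is used so the measured durations are not affected by system clock adjustments during the run.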

ash211 commented 7 years ago

re AWS: this was using plain EC2 with a kubeadm-created cluster, not their container service ECS. But it is a good indication that it at least works somewhat in AWS.

For the 8 seconds, that was for both the watches to trigger the futures, as well as the ping. I'm not sure of the breakdown between them, since I was running a slightly-behind version of our branch that didn't have the logging on k8s resource readiness + ping verification. Will re-run with latest and post new stats.

apache-spark-on-k8s / spark

Performance analysis on job startup time #113