Open ash211 opened 7 years ago
> merging rest server JVM and driver JVM on the driver pod into the same JVM
Actually that's what YARN cluster mode does, so +1 for it.
That looks awesome. Thanks @ash211 for running those tests. It also verifies that it runs on AWS without issues here, which is great.
Does the 8 seconds include both bypassing all of the futures and also getting past the initial ping of the remote server? It would be good to distinguish between the time spent on:
re AWS: this was using plain EC2 with a kubeadm-created cluster, not their container service ECS. But it is a good indication that it at least works somewhat in AWS.
For the 8 seconds, that covered both the watches triggering the futures and the ping. I'm not sure of the breakdown between them, since I was running a slightly-behind version of our branch that didn't have the logging on k8s resource readiness + ping verification. Will re-run with latest and post new stats.
Startup time is very important for @justinuang on an internal application. Our target time: 10sec between spark-submit and job running in k8s.
Current analysis:
To get better numbers I'd want to turn on millisecond level logging (default is just at the second level).
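As a rough illustration of the kind of millisecond-level breakdown I have in mind, here is a minimal Python sketch of a per-phase timer. It is not code from our branch: the `PhaseTimer` class and the phase names (`wait_for_k8s_resources`, `ping_rest_server`) are hypothetical, chosen only to mirror the two phases discussed above, and the `pass` bodies stand in for the real work.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase, in milliseconds."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the timer can be exercised with fake time.
        self._clock = clock
        self.durations_ms = {}

    @contextmanager
    def phase(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the phase raised an exception.
            elapsed_ms = (self._clock() - start) * 1000.0
            self.durations_ms[name] = self.durations_ms.get(name, 0.0) + elapsed_ms

    def report(self):
        # One line per phase, in the order the phases were first recorded.
        return [f"{name}: {ms:.0f} ms" for name, ms in self.durations_ms.items()]


# Hypothetical usage: the phase names are illustrative placeholders.
timer = PhaseTimer()
with timer.phase("wait_for_k8s_resources"):
    pass  # e.g. block until the watches complete the futures
with timer.phase("ping_rest_server"):
    pass  # e.g. retry the initial ping until the server answers
```

Calling `timer.report()` afterwards gives one line per phase, which would be enough to separate the time spent on the futures from the time spent on the ping.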
We think the bolded line (time between rest server JVM ready and submitter submitting app jar) is the place for most improvement. Fully eliminating that would get us to 15sec job startup time.
The next place to pursue further improvements after that might be in merging rest server JVM and driver JVM on the driver pod into the same JVM (reduces the ~2sec JVM startup time).