Closed sgtang closed 5 years ago
Which version of spark-nomad are you using?
We're on spark-2.4.0-bin-nomad-0.8.6-20181220.
Thank you. I'll look into this. I added support for spark.driver.extraJavaOptions in this last release; it may be that extra options for the executors are still missing.
Any updates on this?
hi @georgehdd , thanks for asking. no update as of yet, but i'm in the middle of a new release of nomad-spark with some other fixes/features, and my plan is to roll this in as well.
Hi @cgbaker,
Is there any other way to pass extraJavaOptions to executors?
I'm trying to change GC settings on executors, and adding spark.executor.extraJavaOptions to spark-defaults.conf doesn't seem to have any effect either.
It may be possible by adding a JAVA_OPTS variable to the executor environment. This can be done by adding the following to the spark-submit command. For example:
--conf spark.executorEnv.JAVA_OPTS="-XX:-UseParallelGC" \
or, alternatively, adding the environment variable directly to the executor task in the job template (if you are using/willing to use a job template).
@cgbaker Tried both (conf parameter and job template); JAVA_OPTS doesn't seem to have any effect on the executors.
Do you have an ETA on the new release which will include the spark.executor.extraJavaOptions fix?
I'll try to release this week. Would 2.4.1 be sufficient or do you need 2.4.0?
2.4.1 would be sufficient. Thanks!
Hi @cgbaker, any updates regarding the release?
I added it yesterday, doing testing today in preparation for a release. Btw, I targeted 2.4.3 instead of 2.4.1, because it's the latest.
2.4.3 is even better. Thanks.
How long does testing usually take?
Okay, I modified the Nomad scheduler to honor this config.
E.g., for the following config:
--conf spark.executor.extraJavaOptions="-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
you can see the executors running like so:
java ... -XX:+PrintGCDetails -XX:+PrintGCTimeStamps ...
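For context, a complete submit command with this option might be assembled as in the dry-run sketch below. It only prints the arguments, one per line, so the quoting of the multi-flag option string can be checked before anything is submitted. The application class and jar names are hypothetical placeholders, and a real nomad-spark submission will likely need further settings (deploy mode, Spark distribution location) that are not shown here.

```shell
#!/bin/sh
# Dry-run sketch: print the spark-submit arguments instead of running them.
# APP_CLASS and APP_JAR are hypothetical placeholders, not taken from this thread.
APP_CLASS="com.example.MyApp"
APP_JAR="my-app.jar"

# GC logging flags from the example above.
EXECUTOR_OPTS="-XX:+PrintGCDetails -XX:+PrintGCTimeStamps"

# Print each argument on its own line. The whole key=value pair is one
# argument, so the space inside the option string must stay quoted.
print_submit_args() {
  printf '%s\n' \
    spark-submit \
    --master nomad \
    --conf "spark.executor.extraJavaOptions=$1" \
    --class "$2" \
    "$3"
}

print_submit_args "$EXECUTOR_OPTS" "$APP_CLASS" "$APP_JAR"
```

Once the printed arguments look right, the same argument list can be passed to the real spark-submit binary.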
The release is out. I've changed the versioning semantics, feel free to give feedback. The new release is 2.4.3.0: https://github.com/hashicorp/nomad-spark/releases/tag/v2.4.3.0
We've been attempting to collect executor metrics by running a .jar using the following configuration in our Spark submit:
It seems as if this configuration is being completely ignored by Nomad Spark and not being passed to the executor. We have to manually edit the job plan's SPARK_EXECUTOR_OPTS environment variable to include the javaagent statement to get the metrics to work.
Is there actually a way for us to pass this .jar to the executors from the spark submit without resorting to editing the job .json and doing an update? Moreover, if the Spark job gets killed, the environment variable is overwritten again, so doing it this way isn't very sustainable for us.
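With the 2.4.3.0 release mentioned in this thread, which honors spark.executor.extraJavaOptions, a Java agent could presumably be passed the same way as any other JVM flag. The sketch below only builds and prints the --conf argument. The agent path is a hypothetical placeholder (the original configuration is not shown in this thread), and it assumes the agent jar already exists at that path on every host that runs an executor.

```shell
#!/bin/sh
# Sketch only: AGENT_JAR is a hypothetical path, not taken from this thread.
# Assumption: the jar is present at this path wherever executors are launched.
AGENT_JAR="/opt/agents/metrics-agent.jar"

# -javaagent is a standard JVM flag; here it is carried to the executors
# through spark.executor.extraJavaOptions.
EXECUTOR_OPTS="-javaagent:${AGENT_JAR}"

# Print the argument pair that would be added to the spark-submit command.
printf '%s\n' --conf "spark.executor.extraJavaOptions=${EXECUTOR_OPTS}"
```

Because the option travels with the submit command, it should not need to be re-applied to the job's SPARK_EXECUTOR_OPTS by hand after a restart, though that behavior is worth confirming against the release in use.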