radanalyticsio / oshinko-s2i

S2I images and utilities for Spark application builders on OpenShift
Apache License 2.0

Prometheus Metrics for Spark Driver with Single Env var #280

Open zak-hassan opened 5 years ago

zak-hassan commented 5 years ago

Background

We currently have an environment variable that, when set to --metrics='prometheus', instruments the Spark master and driver. However, we may be missing out on some useful metrics coming from the running driver application itself.

Proposal

Why don't we include agent-bond.jar and agent-config.yaml in the s2i images, and when the same environment variable is set (--metrics='prometheus'), have s2i automatically set up the Java agent to instrument the application?
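A minimal sketch of what that wiring could look like inside the s2i launch script. The AGENT_BOND_JAR and AGENT_CONFIG paths, the build_metrics_args helper, and the flag parsing are all hypothetical, not the actual oshinko-s2i implementation:

```shell
#!/bin/sh
# Assumed locations of the agent jar and its config inside the image;
# these paths are illustrative, not taken from the real image layout.
AGENT_BOND_JAR=${AGENT_BOND_JAR:-/opt/metrics/agent-bond.jar}
AGENT_CONFIG=${AGENT_CONFIG:-/opt/metrics/agent-config.yaml}

# Build the extra spark-submit options for the requested metrics mode.
# Emits a -javaagent driver option only when prometheus is selected.
build_metrics_args() {
    if [ "$1" = "prometheus" ]; then
        printf -- "--conf spark.driver.extraJavaOptions=-javaagent:%s=%s" \
            "$AGENT_BOND_JAR" "$AGENT_CONFIG"
    fi
}

METRICS_ARGS=$(build_metrics_args "${METRICS:-none}")
echo "spark-submit ${METRICS_ARGS} ..."
```

The point of the helper is that the user-facing surface stays a single environment variable, while the javaagent plumbing is hidden in the image.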

Details:

The driver would need to be passed the following Spark options:

SPARK_MASTER_URL= # spark://10.230.8.242:7077

spark-submit \
  --driver-java-options "-javaagent:/opt/spark-2.4.0-bin-hadoop2.7/metrics/agent-bond.jar=/opt/spark-2.4.0-bin-hadoop2.7/conf/agent-d.properties" \
  --conf spark.driver.extraJavaOptions=-javaagent:/opt/spark-2.4.0-bin-hadoop2.7/metrics/agent-bond.jar=/opt/spark-2.4.0-bin-hadoop2.7/conf/agent-d.properties \
  --master $SPARK_MASTER_URL \
  examples/src/main/python/pi.py
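Once the driver is running with the agent attached, the exporter can be checked over HTTP. A small sketch of that check; port 9779 is the usual agent-bond Prometheus default in the fabric8 images, but the actual port depends on the agent config shipped in the image, so treat both the host and port here as assumptions:

```shell
#!/bin/sh
# Hypothetical driver host and agent-bond Prometheus port; adjust to
# whatever the agent config in the image actually exposes.
DRIVER_HOST=${DRIVER_HOST:-localhost}
METRICS_PORT=${METRICS_PORT:-9779}
SCRAPE_URL="http://${DRIVER_HOST}:${METRICS_PORT}/metrics"

# Print the scrape command instead of running it, so the sketch
# works without a live driver.
echo "curl -s ${SCRAPE_URL}"
```

If the agent is wired up correctly, scraping that URL should return Prometheus-format metric lines from the driver JVM.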
zak-hassan commented 5 years ago

@elmiko @tmckayus Let me know what you think.

elmiko commented 5 years ago

my first thought is that this seems like a reasonable idea.

second thought, since this image is based on radanalyticsio/openshift-spark it should already have the jar file in it. this might be a simple matter of locating it and adding the necessary option to start the metrics on the command line like you suggest.

zak-hassan commented 5 years ago

That's really good. I think that PR with the metrics config just got merged. Did you get a chance to cut a new image for that? I'd like to test drive this.

elmiko commented 5 years ago

the metrics config pr you posted did get merged into the master, but we have not cut a new release from that.

there is an autobuild that gets generated at quay.io/radanalyticsio/openshift-spark:master and quay.io/radanalyticsio/openshift-spark-py36:master for that repo, but it looks like the transitive dependencies (i.e. this repo) have not been rebuilt.

if you want to play around with metrics and see what you can do with the s2i you will need to generate a new s2i image locally.
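A rough sketch of what that local rebuild might look like. The image tag and build invocation are assumptions (the repo's actual build targets may differ), and the commands are only echoed so the sketch runs offline:

```shell
#!/bin/sh
# Autobuilt base image mentioned above, which already contains the
# merged metrics config.
BASE_IMAGE=quay.io/radanalyticsio/openshift-spark:master
# Hypothetical local tag for the rebuilt s2i builder image.
LOCAL_TAG=oshinko-s2i-local

# Pull the fresh base, then rebuild the s2i builder image on top of it.
# Echoed rather than executed; drop the echos to actually run them.
echo "docker pull ${BASE_IMAGE}"
echo "docker build -t ${LOCAL_TAG} ."
```

After that, pointing an s2i build at the locally tagged builder image would let you exercise the new metrics config before an official release is cut.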

hope that helps!

zak-hassan commented 5 years ago

It's in there. Perfect. Thanks @elmiko