spektom / spark-flamegraph

Easy CPU Profiling for Apache Spark applications
Apache License 2.0

spark-flamegraph

The script spark-submit-flamegraph is a wrapper around the standard spark-submit command that generates a Flame Graph.

Supported Systems

Prerequisites

The script is adapted to work on Amazon EMR. On other systems, the following utilities must be present:

Running

wget -O /usr/local/bin/spark-submit-flamegraph \
  https://raw.githubusercontent.com/spektom/spark-flamegraph/master/spark-submit-flamegraph

chmod +x /usr/local/bin/spark-submit-flamegraph

Use spark-submit-flamegraph as a replacement for the spark-submit command.
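
Because the wrapper takes the place of spark-submit, one way to switch profiling on and off is to route submissions through a small shell function. The following is only a sketch: the LAUNCHER variable, the submit function, and the class and jar names in the comment are hypothetical and not part of this project.

```shell
# Hypothetical helper, not part of spark-submit-flamegraph.
# LAUNCHER names the command that receives the arguments; it defaults to the
# profiling wrapper and can be set to plain spark-submit to disable profiling.
LAUNCHER="${LAUNCHER:-spark-submit-flamegraph}"

submit() {
  # Forward all arguments unchanged, exactly as they would be passed to spark-submit.
  "$LAUNCHER" "$@"
}

# Example call (placeholder class and jar):
#   submit --class com.example.MyApp /path/to/my-app.jar
```

With this in place, the arguments of an existing spark-submit invocation stay the same and only the launcher changes.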

Configuration

To configure the script, use the following environment variables:

| Environment Variable | Description | Default value |
|---|---|---|
| SPARK_CMD | Spark command to run | spark-submit |
| PYTHON | Path to the Python executable | python2.7 |
| PIP | Path to the pip utility | pip |

For example, to profile a Spark shell session, set the SPARK_CMD environment variable:

SPARK_CMD=spark-shell /usr/local/bin/spark-submit-flamegraph
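
The three variables behave like ordinary shell settings with fallbacks. As a minimal sketch, assuming the defaults are resolved with `${VAR:-default}` parameter expansion (the resolve_config function is hypothetical and is not taken from the script's source), the lookup could be written as:

```shell
# Hypothetical helper illustrating the documented defaults;
# the real script may resolve its configuration differently.
resolve_config() {
  # Keep a value supplied by the caller, otherwise fall back to the documented default.
  SPARK_CMD="${SPARK_CMD:-spark-submit}"
  PYTHON="${PYTHON:-python2.7}"
  PIP="${PIP:-pip}"
}

resolve_config
echo "SPARK_CMD=$SPARK_CMD PYTHON=$PYTHON PIP=$PIP"
```

Any variable set in the environment before the call is preserved, which is why prefixing the command with SPARK_CMD=spark-shell is enough to change the command being profiled.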

Details

The script performs the following operations to make profiling Spark applications as easy as possible: