Open YathishK opened 5 years ago
It looks like you're using one of the example config files to submit a job using spark-submit. The examples assume you're running Spark locally, so the key checkpointPath is set to /tmp/spark_checkpoint/. If you're running Spark in cluster mode, you should instead set checkpointPath to a location on HDFS. For example, hdfs:///my-project-name/checkpoints/.
You should also ensure that the output (MCMC samples, saved state, etc.) is saved to HDFS when running in cluster mode. To do this, you'll need to change the outputPath setting to an HDFS URI.
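As a rough illustration of the distinction above, here is a minimal Python sketch that flags path settings which would still point at the local filesystem when the Spark master is not local. The function names, the settings dictionary, and the example outputPath value are hypothetical, and the check only approximates the one behind Spark's warning; it is not the project's or Spark's actual code.

```python
from urllib.parse import urlparse

# Schemes treated as "local" in this sketch (an assumption, not Spark's exact rule).
LOCAL_SCHEMES = {"", "file", "local"}


def is_local_path(path: str) -> bool:
    """Return True if the path has no URI scheme or an explicitly local one."""
    return urlparse(path).scheme.lower() in LOCAL_SCHEMES


def local_settings_on_cluster(master: str, settings: dict) -> list:
    """List the setting keys whose paths are local although the master is not."""
    runs_locally = master == "local" or master.startswith("local[")
    if runs_locally:
        return []
    return [key for key, path in settings.items() if is_local_path(path)]


# Hypothetical settings: checkpointPath still uses the example's local value.
settings = {
    "checkpointPath": "/tmp/spark_checkpoint/",
    "outputPath": "hdfs:///my-project-name/output/",  # illustrative path only
}
problems = local_settings_on_cluster("yarn", settings)
```

With the settings above and a yarn master, only checkpointPath would be reported, which matches the situation described in this thread.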
Incidentally, we should probably make checkpointPath an optional setting so that it falls back to the default if not specified.
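The fallback idea could look something like the following Python sketch. The thread does not say what the default value would be, so the constant below simply reuses the path from the example config files as an assumption, and resolve_checkpoint_path is a hypothetical helper rather than anything in the project.

```python
# Assumed default; the thread does not state what the fallback value would be.
DEFAULT_CHECKPOINT_PATH = "/tmp/spark_checkpoint/"


def resolve_checkpoint_path(config: dict) -> str:
    """Return the configured checkpointPath, or the default if absent or empty."""
    value = config.get("checkpointPath")
    return value if value else DEFAULT_CHECKPOINT_PATH


# A config that omits the key falls back to the default.
fallback = resolve_checkpoint_path({})
# A config that sets the key keeps the user's value.
explicit = resolve_checkpoint_path(
    {"checkpointPath": "hdfs:///my-project-name/checkpoints/"}
)
```

A local default like this one would still trigger the warning in cluster mode, so users running on a cluster would need to set the key explicitly.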
When running in YARN mode, the following warning message is shown:
WARN SparkContext: Spark is not running in local mode, therefore the checkpoint directory must not be on the local filesystem. Directory '/tmp/spark_checkpoint/' appears to be on the local filesystem.