Hi there,
I'm new to Dataflow Runner and I'm having a lot of trouble getting started. I'm using the macOS build of dataflow-runner to start an EMR cluster in AWS, with the simple cluster.json included at the bottom of this post.
After running the command `./dataflow-runner run-transient --emr-config=cluster.json --emr-playbook=playbook.json`, I try to submit a Spark job using `spark-submit` in the playbook, but it keeps failing with the error:
"Cannot run program "spark-submit" (in directory "."): error=2, No such file or directory"
I would normally ask this somewhere like Stack Overflow, but looking in CloudTrail I can see that Dataflow Runner is sending the command below.
I can verify in EMR that none of my configuration from cluster.json is being sent, and that the Spark application is not installed. The cluster configuration appears to be valid, but none of it reaches EMR. Did I perhaps set this up improperly, or is this a bug?

Thanks in advance
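For reference, my cluster.json follows roughly the shape sketched below (this is a placeholder sketch, not my real file — field names are based on my understanding of the Dataflow Runner ClusterConfig schema, and I'm assuming the `applications` list is how Spark gets installed on the cluster):

```json
{
  "schema": "iglu:com.snowplowanalytics.dataflowrunner/ClusterConfig/avro/1-1-0",
  "data": {
    "name": "spark-cluster",
    "region": "us-east-1",
    "logUri": "s3://my-bucket/logs/",
    "roles": {
      "jobflow": "EMR_EC2_DefaultRole",
      "service": "EMR_DefaultRole"
    },
    "ec2": {
      "amiVersion": "5.9.0",
      "instances": {
        "master": { "type": "m4.large" },
        "core": { "type": "m4.large", "count": 1 },
        "task": { "type": "m4.large", "count": 0, "bid": "0.015" }
      }
    },
    "applications": [ "Hadoop", "Spark" ]
  }
}
```

My expectation was that with Spark listed here, `spark-submit` would be on the PATH for steps run by the playbook.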
Hi @danrods. Sorry we left this unattended, but we use GitHub for bug reports and feature requests only. Please consider posting this on our support forums.