awslabs / amazon-emr-cli

A command-line interface for packaging, deploying, and running your EMR Serverless Spark jobs
Apache License 2.0
35 stars 12 forks

Can't find an option to configure external jars #34

Closed: mayurchoubey closed this issue 10 months ago

mayurchoubey commented 10 months ago

How do I add external jars to a Spark application? I need to add the Delta Lake jars, which with spark-submit I would pass like this:

--conf spark.jars=s3://DOC-EXAMPLE-BUCKET/jars/delta-core_2.12-1.1.0.jar

dacort commented 10 months ago

There is a --spark-submit-opts option that allows you to provide spark-submit options.

For your example, you could do:

emr run .. --spark-submit-opts "--jars s3://DOC-EXAMPLE-BUCKET/jars/delta-core_2.12-1.1.0.jar"

Or you could use --spark-submit-opts "--conf spark.jars=s3://DOC-EXAMPLE-BUCKET/jars/delta-core_2.12-1.1.0.jar"
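
If you need more than one jar, here is a rough sketch of how the option string could be built. It relies on spark-submit accepting a comma-separated list for --jars. The second jar path and the join_jars helper are made up for illustration and are not part of the CLI. The last line only prints the command instead of running it, since the remaining emr run arguments depend on your project.

```shell
#!/bin/sh
# join_jars is a hypothetical helper (not part of amazon-emr-cli):
# it joins its arguments with commas, the format spark-submit's --jars expects.
join_jars() (
  IFS=,
  printf '%s\n' "$*"
)

# The first path comes from the question above; the second is a placeholder.
JARS=$(join_jars \
  "s3://DOC-EXAMPLE-BUCKET/jars/delta-core_2.12-1.1.0.jar" \
  "s3://DOC-EXAMPLE-BUCKET/jars/another-dependency.jar")

# The whole string is passed as a single quoted value to --spark-submit-opts.
SPARK_SUBMIT_OPTS="--jars ${JARS}"

# Print the command for review; ".." stands for your other emr run arguments.
printf 'emr run .. --spark-submit-opts "%s"\n' "$SPARK_SUBMIT_OPTS"
```

Keeping the value in one quoted string matters, because the shell would otherwise split --jars and its argument into separate words before the CLI sees them.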

I also have an example of this in the emr-serverless-samples repo, but it hasn't been merged yet: https://github.com/aws-samples/emr-serverless-samples/pull/50/files

Feel free to re-open if that doesn't work!