Closed kailuowang closed 7 years ago
Released in 0.6.0. I've changed some key names and aligned the JSON format with the one described here: http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html
I noticed that using an S3 JSON file was replaced in this commit: https://github.com/pishen/sbt-emr-spark/commit/d4947b3afb86e99390e8e896bec2add1fa99a514
I like the EmrConfig object, but in our case the JSON file is generated elsewhere and used by multiple clusters. If we were to use EmrConfig, we'd be stuck always having to keep our config in sync with the S3 file anyway. Are you open to reintroducing this config to go along with the EmrConfig class?
An alternative is to use my fork, which kept that functionality and is released under a different org: https://github.com/kailuowang/sbt-emr-spark
I saw your fork, thanks for the recommendation. I'll probably use that in the near term, but just figured this is still a useful feature to bring back upstream.
Will try to get this back if possible.
@log0ymxm The feature is reintroduced in 0.11.0. EmrConfig can now parse a JSON array, or read the JSON config directly from S3:
```scala
import sbtemrspark.EmrConfig

sparkEmrConfigs := Some(
  EmrConfig
    .parseJsonFromS3("s3://your-bucket/your-config.json")(sparkS3ClientBuilder.value)
    .right
    .get
)
```
ref: https://github.com/pishen/sbt-emr-spark#use-emrconfig-to-configure-the-applications
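Calling `.right.get` on the result throws a bare `NoSuchElementException` if the file cannot be read or parsed. A minimal sketch of a more explicit variant is below. It assumes only what the snippet above implies, namely that `parseJsonFromS3` returns an `Either`; the type of the `Left` value is not stated in this thread, so it is only interpolated into the error message.

```scala
import sbtemrspark.EmrConfig

sparkEmrConfigs := Some(
  EmrConfig
    .parseJsonFromS3("s3://your-bucket/your-config.json")(sparkS3ClientBuilder.value) match {
      // Parsing succeeded: use the configs read from S3.
      case Right(configs) => configs
      // Assumption: the Left side carries some description of the failure.
      // Fail the build with a readable message instead of NoSuchElementException.
      case Left(err) => sys.error(s"Could not load EMR config from S3: $err")
    }
)
```

The behavior on success is the same as the original snippet; only the failure case is reported differently.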
Thanks for reintroducing this. The interface looks like a great choice.
Similar to what you can do when creating a cluster through the Web UI.
Note: I am working on this one.