Open jayantshekhar opened 5 years ago
Unfortunately, we don't have any plans to upgrade the current Spark version, but we are always re-evaluating our roadmap based on customer feedback!
Thanks for that Lauren!
Trying to understand this. Is Spark-SageMaker on the roadmap, and would you recommend that users continue building solutions on it?
Is there something else you would like us to use when integrating with SageMaker, especially when running jobs on EMR?
Please refer to our documentation for Spark support: https://docs.aws.amazon.com/sagemaker/latest/dg/apache-spark.html. We are evaluating our roadmap and will add support for the latest version in the future.
Thanks a lot Nadia! Will keep an eye on it and look forward to support for Spark 2.3 and Spark 2.4.
I had issues running sagemaker_pyspark on EMR 5.22, per this closed issue. I was able to get it working without problems and confirmed this with AWS tech support. The changes I had to apply are listed in my comments in the closed issue linked above. Figured I'd also post here in case it can benefit anyone else.
One question, though. It appears that the sagemaker_pyspark SDK is not updated as often as the sagemaker Python SDK. Should we not be concerned, because sagemaker_pyspark is a wrapper for the sagemaker Python SDK? Or is it indeed a lower priority on your roadmap and therefore receives less support?
Describe the problem
EMR clusters that use Spark 2.3 and later ship with newer versions of the SageMaker Spark JARs.
However, these versions are not available on Maven Central: https://mvnrepository.com/artifact/com.amazonaws/sagemaker-spark
When do you plan to release them to Maven Central for Spark 2.3 and later? Or are there any recommendations for running on clusters with later EMR versions?