aws / sagemaker-spark-container

The SageMaker Spark Container is a Docker image used to run data processing workloads with the Spark framework on Amazon SageMaker.
Apache License 2.0

Add SageMaker Feature Store Spark connector to containers #102

Open tonykchen opened 1 year ago

tonykchen commented 1 year ago

The SageMaker Feature Store Spark connector enables scalable data ingestion into SageMaker Feature Store. To use the connector in a SageMaker Processing Job today, one must extend a pre-built image and install the connector dependency, as shown here.
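
For illustration, here is a rough sketch of what that extension step amounts to. It is not taken from the linked example: `render_dockerfile` is a hypothetical helper, the base image URI is a placeholder to be filled in, and the default pip package name `sagemaker-feature-store-pyspark-3.1` is an assumption that should be checked against the connector version matching the image's Spark version.

```python
def render_dockerfile(
    base_image: str,
    connector_package: str = "sagemaker-feature-store-pyspark-3.1",  # assumed package name
) -> str:
    """Render a minimal Dockerfile that layers the connector on a pre-built image.

    base_image is the URI of the pre-built SageMaker Spark image to extend
    (a placeholder here; it is not specified in this issue).
    """
    if not base_image:
        raise ValueError("base_image must be a non-empty image URI")
    lines = [
        # Start from the pre-built Spark processing image.
        f"FROM {base_image}",
        # Install the Feature Store Spark connector on top of it.
        f"RUN pip3 install {connector_package}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical placeholder URI; substitute the real pre-built image URI.
    print(render_dockerfile("<account>.dkr.ecr.<region>.amazonaws.com/sagemaker-spark-processing:<tag>"))
```

Every user who wants the connector has to build, push, and maintain an image like this, which is the overhead this request would remove.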

Including the connector in the pre-built images would make it easier to get started with Spark-based ingestion.