This Docker image lets you run AWS Glue ETL jobs in your local environment. You can develop ETL jobs locally without incurring the additional cost of running Glue development endpoints or Glue jobs.
Having the Glue libraries available locally makes it easier to update the code and test it before committing it to a job.
To build the image on your system, follow these steps:
git clone https://github.com/jnshubham/aws-glue-local-etl-docker.git
cd aws-glue-local-etl-docker
systemctl start docker
docker build -t jnshubham/glue_etl_local .
Alternatively, pull the prebuilt image from Docker Hub:
docker pull jnshubham/glue_etl_local:latest
Verify the downloaded image by running:
docker images
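The check and the pull can also be combined in a small script. The following is a minimal sketch, assuming a POSIX-style shell; `ensure_image` is a hypothetical helper name and is not part of the image or the repository. It relies on `docker image inspect`, which exits with a non-zero status when the image is not available locally, and pulls only in that case.

```shell
# Hypothetical helper (not part of this repository): pull the image only
# when it is not already present locally.
ensure_image() {
  local image="${1:-jnshubham/glue_etl_local:latest}"
  # `docker image inspect` exits non-zero if the image is missing locally.
  if docker image inspect "$image" >/dev/null 2>&1; then
    echo "found $image"
  else
    echo "pulling $image"
    docker pull "$image"
  fi
}
```

Calling `ensure_image` with no argument checks for `jnshubham/glue_etl_local:latest`; a different image name can be passed as the first argument.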
To run the container and go directly into the PySpark shell (the `-it` flags keep the shell interactive):
docker run -it jnshubham/glue_etl_local "gluepyspark"
To open a terminal in the container and submit a job, run:
docker run -it jnshubham/glue_etl_local
gluesparksubmit script_name parameters
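Job scripts written on the host have to be visible inside the container before `gluesparksubmit` can run them. One way to do that is a bind mount. The sketch below assumes a POSIX-style shell; the helper name `glue_terminal` and the mount point `/jobs` are hypothetical choices for this example, not something the image defines. `-v` is Docker's standard bind-mount flag.

```shell
# Hypothetical helper: open the container terminal with a host directory
# of job scripts mounted at /jobs (an arbitrary mount point chosen here).
glue_terminal() {
  local script_dir="${1:-.}"
  if [ ! -d "$script_dir" ]; then
    echo "not a directory: $script_dir" >&2
    return 1
  fi
  # Bind mounts need an absolute host path, so resolve it first.
  script_dir="$(cd "$script_dir" && pwd)"
  docker run -it -v "$script_dir":/jobs jnshubham/glue_etl_local
}
```

With this in place, `glue_terminal ./my_jobs` should open the container terminal, where a mounted script could be submitted as `gluesparksubmit /jobs/script_name parameters`. Whether the container's user can read the mounted files depends on your host permissions, so treat this as a starting point.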
To check out the image, visit its page on Docker Hub.
Thanks!