aws-samples / aws-glue-samples

AWS Glue code samples
MIT No Attribution

Creating an AWS Glue pipeline using CloudFormation #111

Closed metadimensions closed 2 years ago

metadimensions commented 2 years ago

Hello,

I have manually created a pipeline using AWS Glue to import files from S3 into an Aurora DB.

Is there a way to do the same via CloudFormation?

Thanks in advance.

fwanghe commented 2 years ago

I think you can. What you need to do is organize the Glue jobs, triggers, and crawlers into a workflow in the template, and attach the resources to them there, such as the S3 location for the crawler and the DB.
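As a rough illustration of that layout, here is a minimal sketch that builds such a template as a Python dict and prints it as JSON. The resource types and property names follow the CloudFormation `AWS::Glue::*` resources, but every name, path, and ARN below (`s3-to-aurora-workflow`, `import_catalog_db`, the example bucket, role, and connection) is a hypothetical placeholder, and the wiring (an on-demand trigger starts the crawler, a conditional trigger runs the job once the crawl succeeds) is only one possible arrangement.

```python
import json


def build_glue_workflow_template(bucket_path, script_location, role_arn, connection_name):
    """Return a CloudFormation template (as a dict) with a crawler and a job in one Glue workflow.

    All resource names are illustrative placeholders, not values from this thread.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # The workflow that groups the triggers, crawler, and job.
            "ImportWorkflow": {
                "Type": "AWS::Glue::Workflow",
                "Properties": {"Name": "s3-to-aurora-workflow"},
            },
            # Crawler pointed at the S3 location holding the source files.
            "SourceCrawler": {
                "Type": "AWS::Glue::Crawler",
                "Properties": {
                    "Name": "s3-source-crawler",
                    "Role": role_arn,
                    "DatabaseName": "import_catalog_db",  # hypothetical catalog database
                    "Targets": {"S3Targets": [{"Path": bucket_path}]},
                },
            },
            # ETL job; the Glue connection is what gives it access to the database.
            "LoadJob": {
                "Type": "AWS::Glue::Job",
                "Properties": {
                    "Name": "s3-to-aurora-job",
                    "Role": role_arn,
                    "GlueVersion": "3.0",
                    "Command": {
                        "Name": "glueetl",
                        "ScriptLocation": script_location,
                        "PythonVersion": "3",
                    },
                    "Connections": {"Connections": [connection_name]},
                },
            },
            # Entry point of the workflow: starts the crawler on demand.
            "StartTrigger": {
                "Type": "AWS::Glue::Trigger",
                "Properties": {
                    "Name": "start-crawler",
                    "Type": "ON_DEMAND",
                    "WorkflowName": {"Ref": "ImportWorkflow"},
                    "Actions": [{"CrawlerName": {"Ref": "SourceCrawler"}}],
                },
            },
            # Runs the job only after the crawler has finished successfully.
            "JobTrigger": {
                "Type": "AWS::Glue::Trigger",
                "Properties": {
                    "Name": "run-job-after-crawl",
                    "Type": "CONDITIONAL",
                    "StartOnCreation": True,
                    "WorkflowName": {"Ref": "ImportWorkflow"},
                    "Predicate": {
                        "Conditions": [
                            {
                                "LogicalOperator": "EQUALS",
                                "CrawlerName": {"Ref": "SourceCrawler"},
                                "CrawlState": "SUCCEEDED",
                            }
                        ]
                    },
                    "Actions": [{"JobName": {"Ref": "LoadJob"}}],
                },
            },
        },
    }


if __name__ == "__main__":
    template = build_glue_workflow_template(
        bucket_path="s3://example-bucket/incoming/",                # placeholder
        script_location="s3://example-bucket/scripts/load_job.py",  # placeholder
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",  # placeholder
        connection_name="example-aurora-connection",                # placeholder
    )
    print(json.dumps(template, indent=2))
```

The printed JSON could then be saved to a file and deployed as a stack. The IAM role, the Glue connection to the database, and the job script are assumed to exist already; they could equally be declared in the same template.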

moomindani commented 2 years ago

It is possible to provision a Glue workflow (I assume that is what you meant by "Glue pipeline") in several different ways.

For example, this blog post shows how to provision a Glue workflow via CloudFormation: https://aws.amazon.com/blogs/big-data/build-a-serverless-event-driven-workflow-with-aws-glue-and-amazon-eventbridge/

Alternatively, this blog post shows how to provision a Glue workflow via a Glue custom blueprint: https://aws.amazon.com/blogs/big-data/simplify-data-integration-pipeline-development-using-aws-glue-custom-blueprints/

BTW, this repo is mainly for AWS Glue samples. Please ask general questions in the AWS Forum or through AWS Premium Support.

metadimensions commented 2 years ago

Hey @moomindani

Thanks for the reply.

I tried building the pipeline with CloudFormation, but now I have a question about how to update rows in an RDS DB (Postgres) using the same AWS Glue job.

For example: if some rows in the CSV files in the S3 bucket have been updated, how can we apply the same updates to the RDS DB? I tried running the job, but it fails with a duplication error.
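To make the question concrete: assuming the duplication error is a unique-key violation caused by plain inserts, one common PostgreSQL pattern (not something confirmed in this thread) is to load the changed rows into a staging table and then merge them into the target with `INSERT ... ON CONFLICT ... DO UPDATE`. The helper below only builds that SQL string. The table and column names are hypothetical, and the target table is assumed to have a unique constraint or primary key on the key columns.

```python
def build_upsert_sql(target, staging, key_columns, columns):
    """Build a PostgreSQL statement that merges rows from a staging table into a target table.

    Identifiers are interpolated as-is, so they must come from trusted configuration,
    never from user input. The staging table should hold at most one row per key,
    since PostgreSQL refuses to update the same target row twice in one statement.
    """
    non_keys = [c for c in columns if c not in key_columns]
    col_list = ", ".join(columns)
    key_list = ", ".join(key_columns)
    if non_keys:
        # EXCLUDED refers to the row that was proposed for insertion.
        updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in non_keys)
        action = f"DO UPDATE SET {updates}"
    else:
        # Nothing to update when every column is part of the key.
        action = "DO NOTHING"
    return (
        f"INSERT INTO {target} ({col_list}) "
        f"SELECT {col_list} FROM {staging} "
        f"ON CONFLICT ({key_list}) {action}"
    )


if __name__ == "__main__":
    # Hypothetical tables: the job writes the CSV rows to customers_staging first.
    print(build_upsert_sql("customers", "customers_staging", ["id"], ["id", "name", "email"]))
```

The resulting statement would be run against the database after the job has written the CSV rows to the staging table, for example through a PostgreSQL driver, and the staging table would be truncated afterwards.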