awslabs / athena-glue-service-logs

Glue scripts for converting AWS Service Logs for use in Athena
Apache License 2.0

Bring current with changes in Glue #23

Open dacort opened 3 years ago

dacort commented 3 years ago

I left and returned to AWS. Since that time, Glue has:

I want to research what impact these changes have and if they should be incorporated. Initial thoughts are that:

In addition, I need to go through the backlog. :)

CalvinLeather commented 3 years ago

As a band-aid, I opened this PR while waiting on the Python 3 upgrade. Happy to help contribute to or test the Python 3 upgrade; our team is actively using the tool.

dacort commented 3 years ago

Thanks @CalvinLeather ! Much appreciated. I hope to be able to devote some time to this in the coming weeks.

dacort commented 3 years ago

Adding a note here about blueprints - they could be useful for building more comprehensive Glue deployments for this project, specifically workflows which could schedule the jobs.

dacort commented 3 years ago

Looked into Blueprints a little bit yesterday. Looks like they could be used to bootstrap Classifiers, Crawlers, and table definitions. A schedule/trigger can also be set up, so they could be a good end-to-end direction to go.

We can still make use of this library – all of the table definitions and classifiers are still necessary, as are some of the source-specific transforms.

We could essentially parameterize the whole thing:

[two images in the original comment, not preserved]
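As a rough illustration of what "parameterize the whole thing" might look like, here is a minimal sketch in plain Python. It is not the blueprint layout from the comment above and not part of this library: `ServiceLogParams`, `crawler_request`, `trigger_request`, and all resource names and ARNs are hypothetical. The functions only build dictionaries shaped like the keyword arguments of boto3's Glue `create_crawler` and `create_trigger` calls; nothing is sent to AWS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLogParams:
    """Hypothetical parameter set; none of these names come from the library."""
    service: str          # short label for the log source, e.g. "alb"
    source_s3_path: str   # where the raw service logs land
    database: str         # Glue Data Catalog database for the tables
    role_arn: str         # IAM role the crawler runs as
    schedule: str         # Glue cron expression for the trigger


def crawler_request(p: ServiceLogParams) -> dict:
    """Build kwargs shaped for boto3's glue.create_crawler (not sent anywhere)."""
    return {
        "Name": f"{p.service}-raw-crawler",
        "Role": p.role_arn,
        "DatabaseName": p.database,
        "Targets": {"S3Targets": [{"Path": p.source_s3_path}]},
    }


def trigger_request(p: ServiceLogParams, job_name: str) -> dict:
    """Build kwargs shaped for boto3's glue.create_trigger (not sent anywhere)."""
    return {
        "Name": f"{p.service}-convert-schedule",
        "Type": "SCHEDULED",
        "Schedule": p.schedule,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


# Example values are placeholders, not real resources.
example = ServiceLogParams(
    service="alb",
    source_s3_path="s3://example-bucket/alb-logs/",
    database="aws_service_logs",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    schedule="cron(0 3 * * ? *)",
)

if __name__ == "__main__":
    print(crawler_request(example))
    print(trigger_request(example, job_name="alb-convert-job"))
```

With boto3 installed and credentials configured, the dictionaries could presumably be passed along as `glue.create_crawler(**crawler_request(example))`, but that step is left out here so the sketch stays free of side effects.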

brandon-fryslie commented 2 years ago

Is there any documentation on how to achieve what this library achieves with AWS Glue directly?