Open dacort opened 3 years ago
As a band-aid, I opened this PR while waiting on the Python 3 upgrade. Happy to help contribute to or test the Python 3 upgrade; our team is actively using the tool.
Thanks @CalvinLeather! Much appreciated. I hope to be able to devote some time to this in the coming weeks.
Adding a note here about Blueprints: they could be useful for building more comprehensive Glue deployments for this project, specifically workflows that could schedule the jobs.
I looked into Blueprints a little yesterday. It looks like they could be used to bootstrap Classifiers, Crawlers, and table definitions. A schedule/trigger can also be set up, so they could be a good end-to-end direction to go.
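To make the workflow/crawler/trigger idea a bit more concrete, here is a rough sketch. It does not use the Blueprints layout format itself; it only builds keyword arguments in the shape the boto3 Glue client accepts for `create_workflow`, `create_crawler`, and `create_trigger`. `LogSourceParams`, `build_requests`, and the naming scheme are hypothetical and not part of this library, and the job named in `job_name` is assumed to exist already.

```python
from dataclasses import dataclass


@dataclass
class LogSourceParams:
    """Hypothetical per-source parameters; not part of this library."""
    name: str        # short label for the log source
    s3_path: str     # S3 location of the raw logs
    database: str    # Glue Data Catalog database for the tables
    role_arn: str    # IAM role the crawler assumes
    job_name: str    # existing Glue job that processes the logs (assumed)
    schedule: str = "cron(0 1 * * ? *)"  # Glue cron syntax: daily at 01:00 UTC


def build_requests(p: LogSourceParams) -> dict:
    """Return boto3 Glue keyword arguments, keyed by client method name."""
    workflow_name = f"{p.name}-workflow"
    return {
        "create_workflow": {"Name": workflow_name},
        "create_crawler": {
            "Name": f"{p.name}-raw-crawler",
            "Role": p.role_arn,
            "DatabaseName": p.database,
            "Targets": {"S3Targets": [{"Path": p.s3_path}]},
        },
        "create_trigger": {
            "Name": f"{p.name}-schedule",
            "WorkflowName": workflow_name,  # attach the trigger to the workflow
            "Type": "SCHEDULED",
            "Schedule": p.schedule,
            "Actions": [{"JobName": p.job_name}],
            "StartOnCreation": True,
        },
    }
```

With a real client, each entry could be applied along the lines of `getattr(boto3.client("glue"), method)(**kwargs)`. Keeping the parameters in one small object is only one possible design; a Blueprint would express the same inputs in its own config.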
We can still make use of this library – all of the table definitions and classifiers are still necessary, as are some of the source-specific transforms.
We could essentially parameterize the whole thing:
Is there any documentation on how to achieve what this library achieves with AWS Glue directly?
I left and returned to AWS. Since that time, Glue has:
I want to research what impact these changes have and if they should be incorporated. Initial thoughts are that:
In addition, I need to go through the backlog. :)