Open ndrluis opened 2 years ago
As another user of wrangler, I strongly agree. Much of the functionality is already implemented in wrangler. I think this codebase could be a thin wrapper around wrangler that makes it compliant with the Singer protocol.
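To make the "thin wrapper" idea concrete, here is a rough sketch of the Singer side only. It assumes the usual Singer message shape (one JSON object per line, with `type` set to `SCHEMA`, `RECORD`, or `STATE`). The names `process_singer_lines` and `write_batch` are hypothetical and not part of this codebase; `write_batch` stands in for whatever wrangler call ends up doing the upload.

```python
import json
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def process_singer_lines(
    lines: Iterable[str],
    write_batch: Callable[[str, List[dict]], None],
) -> Dict[str, dict]:
    """Group Singer RECORD messages by stream and hand each group to write_batch.

    Hypothetical helper: returns the last schema seen for each stream.
    """
    schemas: Dict[str, dict] = {}
    batches: Dict[str, List[dict]] = defaultdict(list)

    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        kind = message.get("type")
        if kind == "SCHEMA":
            schemas[message["stream"]] = message["schema"]
        elif kind == "RECORD":
            batches[message["stream"]].append(message["record"])
        # STATE messages are ignored in this sketch; a real target
        # would emit them again once the records are safely written.

    # Flush one batch per stream at the end of the input.
    for stream, records in batches.items():
        write_batch(stream, records)
    return schemas
```

In a real target, `lines` would be `sys.stdin` and the batches would probably be flushed periodically rather than only at the end of the input.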
Definitely worth consideration, especially as there is some discussion of rewriting the entire target at some point.
I've also come across https://github.com/akamai/pallas if anyone is familiar and can compare/contrast.
I created a target-s3-parquet using AWS Data Wrangler to solve the problems we had with target-athena: https://github.com/gupy-io/target-s3-parquet
The codebase has some hardcoded configuration, but we intend to keep improving it.
Hello everyone, I'm starting to use this target and I'm missing some features, which I'm already working on so that I can contribute them here. I also think we can make this codebase simpler by using AWS Data Wrangler instead of pyathena.
I don't know if anyone here has worked with this library before, but AWS Data Wrangler abstracts all the AWS calls, the catalog/database manipulation, and the data upload to S3, which would make it easier to implement the parquet writer (#9), for example.
Can we discuss this?
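As a rough illustration of what I mean, the sketch below writes a batch of records as a parquet dataset and registers it in the Glue catalog with a single `wr.s3.to_parquet` call. It assumes the Glue database already exists and that AWS credentials are configured. The helper names (`build_to_parquet_kwargs`, `write_records`) and the path layout are hypothetical, not something this target does today.

```python
from typing import Any, Dict, List


def build_to_parquet_kwargs(
    bucket: str, prefix: str, database: str, table: str
) -> Dict[str, Any]:
    """Build keyword arguments for awswrangler's s3.to_parquet (hypothetical layout)."""
    return {
        "path": f"s3://{bucket}/{prefix.strip('/')}/{table}/",
        "dataset": True,       # write as a dataset so the catalog can be updated
        "mode": "append",      # add new files instead of overwriting existing ones
        "database": database,  # Glue database, assumed to exist already
        "table": table,        # Glue table to create or update
    }


def write_records(
    records: List[dict], bucket: str, prefix: str, database: str, table: str
) -> None:
    """Upload records to S3 as parquet and register the table in Glue."""
    # Imported lazily so the helper above can be used without AWS dependencies.
    import awswrangler as wr
    import pandas as pd

    df = pd.DataFrame.from_records(records)
    wr.s3.to_parquet(df=df, **build_to_parquet_kwargs(bucket, prefix, database, table))
```

Once the table is registered in the catalog, it can be queried from Athena without the target issuing any DDL itself.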
References:
https://aws-data-wrangler.readthedocs.io/en/stable/tutorials/006%20-%20Amazon%20Athena.html
https://aws-data-wrangler.readthedocs.io/en/stable/tutorials/005%20-%20Glue%20Catalog.html
https://aws-data-wrangler.readthedocs.io/en/stable/tutorials/003%20-%20Amazon%20S3.html
https://aws-data-wrangler.readthedocs.io/en/stable/tutorials/012%20-%20CSV%20Crawler.html
https://aws-data-wrangler.readthedocs.io/en/stable/tutorials/017%20-%20Partition%20Projection.html