Open rustyconover opened 5 months ago
Hey @rustyconover! Thanks a lot for the PRs!
To review this I will need to set up an AWS Glue table myself to test it out. I will try to find some time tomorrow to do this.
One small comment I do have already is that I'm not sure the json string is the neatest way of passing the configuration to the Iceberg scan function. Maybe we can instead just add all of them as named_parameters to the iceberg table function. I think many of these will be shared among catalog_types anyway and that way the parser will help give meaningful error messages and syntax highlighting of the SQL strings works better.
Hi @samansmink,
I'll look at changing to named parameters and post a revised PR.
Rusty
Hi @samansmink,
I've changed things around to use named parameters and added the support so that the iceberg_metadata() function can also use the same configuration.
Rusty
You can now run queries that look like this:
select * from iceberg_scan('users', catalog_type="glue", region="us-east-1", database_name="test_iceberg");
select * from iceberg_metadata('users', catalog_type="glue", region="us-east-1", database_name="test_iceberg");
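For anyone scripting tests against this branch, here is a minimal sketch of assembling these calls from a dictionary of named parameters. The helper name `build_iceberg_query` is hypothetical and not part of the PR. It only builds the SQL text, copying the quoting style of the two queries above, and does not connect to DuckDB or AWS.

```python
def build_iceberg_query(function, table, **named_params):
    """Build a query string like the examples in this thread.

    Hypothetical helper for illustration only; it mirrors the quoting
    used in the thread's example queries and does not execute anything.
    """
    # Only the two table functions mentioned in this thread are accepted.
    if function not in ("iceberg_scan", "iceberg_metadata"):
        raise ValueError(f"unsupported function: {function}")

    # The table name is a single-quoted string; double any embedded quote.
    args = ["'" + table.replace("'", "''") + "'"]

    # Named parameters are appended as name="value", as in the thread.
    for name, value in named_params.items():
        if not name.isidentifier():
            raise ValueError(f"invalid parameter name: {name}")
        escaped = str(value).replace('"', '""')
        args.append(f'{name}="{escaped}"')

    return f"select * from {function}({', '.join(args)});"


if __name__ == "__main__":
    print(build_iceberg_query(
        "iceberg_scan",
        "users",
        catalog_type="glue",
        region="us-east-1",
        database_name="test_iceberg",
    ))
```

Passing the same keyword arguments with `"iceberg_metadata"` yields the second query, which matches the point that both functions share one configuration.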
@rustyconover - Thank you for this PR and #50. I have access to Iceberg tables on AWS Glue and can help test this feature. Is it possible to provide a binary or Docker image for this PR? I'm having issues building DuckDB locally. If the binary contains #50, I can test that one as well.
Hi @harel-e,
Thank you for your kind words.
Unfortunately I can't help you build the extension or package it as a Docker container. You might want to try asking on the DuckDB discord for help building DuckDB.
I'm building it on Mac OS X. I had to make some changes to vcpkg to work around the fallout from the xz package being unavailable with boost.
Rusty
vcpkg should be restored again from the xz debacle afaik! Check out https://github.com/duckdb/extension-template for some instructions on setting up vcpkg for extension builds.
I tested this branch on AWS with several Iceberg tables.
This query pattern works fine: select * from iceberg_scan('users', catalog_type="glue", region="us-east-1", database_name="test_iceberg");
Hoping to see it in the upcoming 0.10.3
Thank you @rustyconover for this wonderful addition. DuckDB is now one step closer to working seamlessly on AWS.
Sorry for the absence here, I've been really busy
There are still some problems remaining with CI here on Windows and Linux amd64. Those would need to be fixed for this to get merged before 0.10.3.
I'll take a look at the Linux build failures, but I can't look into the Windows ones because I don't have access to that platform.
@rustyconover : Does this support Nessie catalog for iceberg?
Any chance of getting this PR merged?
@rustyconover are you still working on this? Would it make sense for someone to pick this up?
I'm not actively working on this PR, feel free to finish it up.
Add support for accessing tables stored at AWS Glue.
Example SQL call:
Added the framework for additional external Iceberg catalogs:
This JSON object should be of this format: