Closed samansmink closed 1 year ago
Thanks for the review @carlopi! There is no requirement on httpfs; the Azure SDK ships its own HTTP stack.
The account name is a "storage account" name, which is part of the URL for the objects: https://<account_name>.blob.core.windows.net/<container_name>/file.ext
So that is required afaik!
will check out the rest tomorrow!
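The URL layout described above can be sketched in a few lines of Python. This is only an illustration: the helper name `blob_url` and its parameters are hypothetical and not part of the extension; the only details taken from this thread are the URL pattern and the `blob.core.windows.net` host.

```python
def blob_url(account_name: str, container_name: str, path: str,
             endpoint: str = "blob.core.windows.net") -> str:
    """Build the HTTPS URL of a blob (hypothetical helper, for illustration)."""
    if not account_name:
        # The storage account name is part of the host, so it cannot be omitted.
        raise ValueError("account_name is required")
    # Drop a leading slash so the path joins cleanly after the container name.
    return f"https://{account_name}.{endpoint}/{container_name}/{path.lstrip('/')}"


print(blob_url("myaccount", "mycontainer", "file.ext"))
# prints: https://myaccount.blob.core.windows.net/mycontainer/file.ext
```

Because the account name ends up in the host part of the URL, there is no sensible fallback when it is missing, which is why the sketch raises instead of guessing.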
Another question: would it not be nice to offer a load_azure_credentials('my_name', 'cli') option?
@carlopi That's a good point. Indeed we can pass a profile name to the cli credential provider. Currently it just picks the default.
For this PR I would argue this is good enough. We can consider improving this in a follow-up.
In the grand scheme of things, this comes down to the major question of a proper API for table scans with credentials. What we would like to support is basically the ability to create and switch between multiple sets of credentials.
This PR adds some improvements to how to authenticate in the Azure extension.
This is the first step towards implementing the feature request in https://github.com/duckdblabs/duckdb_azure/issues/6.
Interface changes
The complete set of parameters for the extension is now:
azure_connection_string
azure_account_name
azure_credential_chain
azure_endpoint
To authenticate the requests, the extension does, in this order:
azure_credential_chain sets which credential providers to check, in order. It is a separated string of one or more of the following: cli, env, managed_identity, default, or none. Check out the Azure docs on how to use them (https://github.com/Azure/azure-sdk-for-cpp/blob/main/sdk/identity/azure-identity/README.md#environment-variables). For example, to use the full DefaultAzureCredential chain, set it to 'default'.
The endpoint defaults to blob.core.windows.net but can be overridden through azure_endpoint.
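To make the shape of the azure_credential_chain value concrete, here is a minimal Python sketch that splits such a string into an ordered list of providers and rejects unknown names. It is not the extension's code: the function name is hypothetical, and the `;` separator is an assumption, since the description above does not say which separator is used.

```python
# Provider names taken from the list in the PR description.
ALLOWED_PROVIDERS = {"cli", "env", "managed_identity", "default", "none"}


def parse_credential_chain(chain: str, sep: str = ";") -> list[str]:
    """Split a credential chain string into an ordered list of providers.

    Hypothetical helper; the default separator ';' is an assumption.
    """
    # Keep the order as written, since providers are checked in order.
    providers = [p.strip() for p in chain.split(sep) if p.strip()]
    unknown = [p for p in providers if p not in ALLOWED_PROVIDERS]
    if unknown:
        raise ValueError(f"unknown credential provider(s): {unknown}")
    return providers


print(parse_credential_chain("cli;env"))
# prints: ['cli', 'env']
```

Validating the names up front gives an early, readable error for a typo such as 'managed-identity' instead of a failed request later on.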
Examples
These changes simplify some things:
Public containers:
For fully public containers, reading and listing are now much cleaner, with no need to set :
CLI
Authentication using the CLI can now work as follows:
First, log in to your Azure account using the Azure CLI:
This will forward you to a browser and ask you to log in.
Now, from DuckDB, run:
TODO
Much more testing. The CLI authentication is tested, but only locally, not in CI atm. The few other credential providers included in this PR (env, managed_identity) are fully untested.
However, I feel like this is quite finicky to test properly anyway, due to the different types of environments that you need to set up. Since it's only a few lines of code on the DuckDB side, and the CLI log-in works very well, I think pushing this into the nightly repo and allowing people to try it out on 0.9.1 is a nice way forward. It is also a chance to try out the whole workflow of deploying nightly releases for extensions, allowing people to use a stable DuckDB with an unstable/nightly extension.
Of course the next step is to add more testing infrastructure: