duckdb / duckdb_azure

Azure extension for DuckDB
MIT License
50 stars 17 forks source link

Support for Device Code Flow Authentication #52

Closed devbrunorm closed 8 months ago

devbrunorm commented 8 months ago

As suggested at this link, I'm creating an issue for the possible implementation of the authentication feature via Device Code Flow. I noticed that this flow is not exclusive to Azure, but it comes from OAuth 2.0. In it, we have two tokens as a response of the process: a Bearer token and a refresh token (to reauthenticate when the Bearer token expires). However, I think it doesn't make sense to implement reauthentication within DuckDB, I believe it makes more sense being a user's responsability. For more information on how this flow works in Azure, please refer to this link:

https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-device-code

quentingodeau commented 8 months ago

Hello,

I have take a deeper look into this at the moment I have two issues:

  1. The azure SDK en C++ is not at the same level than the Java one so there is no DeviceCodeCredential
  2. We have to find a way to get the code to the end user

I will try to take a look. Regards, Quentin

devbrunorm commented 8 months ago

@quentingodeau I have been thinking about the cli authentication. How to do it on DuckDB? It uses some kind of access token? If so, maybe this will solve our problem.

quentingodeau commented 8 months ago

Yes the CLI should work if you are using DuckDB in local. Just perform

  1. az login
  2. Start duck
    duckdb
  3. Define the flowing secret
    CREATE SECRET az_cli(
     TYPE azure, 
     PROVIDER CREDENTIAL_CHAIN, 
     CHAIN 'cli'
    );
  4. And finally perform your queries :)
    SELECT *
    FROM 'abfss://adlsducktests.dfs.core.windows.net/qgo/tmp.parquet';
devbrunorm commented 8 months ago

I tried to do this, but it keeps giving auth errors. I dumped my duckdb_secrets table, but it appears that the secret is empty. Is that correct? Or I should give more parameters to the CREATE SECRET command?

secrets.csv

The command I tried to execute was this:

SELECT * FROM 'abfss://<blob_container>@<storage_account>.dfs.core.windows.net/<path>/sap.csv';

And I got this error:

IO Error: AzureBlobStorageFileSystem could not open file: 'abfss://<blob_container>@<storage_account>.dfs.core.windows.net/<path>/sap.csv', unknown error occurred, this could mean the credentials used were wrong. Original error message: 'Fail to get a new connection for: https://<blob_container>@<storage_account>.blob.core.windows.net. Problem with the SSL CA cert (path? access rights?)'

I have also tried this:

SELECT count(*) FROM 'abfss://⟨my_storage_account⟩.dfs.core.windows.net/⟨my_filesystem⟩/⟨path⟩/⟨my_file⟩.⟨parquet_or_csv⟩';

And I got this error:

Invalid Error: Fail to get a new connection for: https://<storage_account>.dfs.core.windows.net. Problem with the SSL CA cert (path? access rights?)

quentingodeau commented 8 months ago

You are under Linux isn't it ?

Regards, Quentin

devbrunorm commented 8 months ago

Finally, it works! Thanks for all the help!