duckdb / duckdb_azure

Azure extension for DuckDB
MIT License
50 stars 17 forks source link

Add devicecode authentification #53

Closed quentingodeau closed 2 months ago

quentingodeau commented 7 months ago

Hello @samansmink,

If you have some time I would like your opinion on this PR.

This add the capability to use a generate a delegation token for a user I think it handle the issue #20 & #53. It also allow user to connect without SPN to Azure for example if you use DuckDB Wasm.

but I have two question:

  1. Is that really a good idea to implement it?
  2. How to write unit tests for this... Regards, Quentin
samansmink commented 7 months ago

Hey @quentingodeau,

I'm slightly unsure here. I appreciate a lot that you got it working, but indeed I think testing this will be a pain in the ass, if not completely impossible. That then leaves us with three options:

I think not testing it ever is a bad idea: then it will inevitably just break. We could include some detailed instructions on how to test this, but tbh im not sure it is worth the hassle. I can imagine this solves somebodies usecase somewhere, but I'm not yet convinced this common enough to justify the effort of maintaining this. Therefore I would opt to go for closing this PR for now and potentially revisiting it if it turns out to be a common usecase for a lot of people.

Happy to discuss though, if you feel im missing something here

robert-porter commented 7 months ago

At a minimum there should to be a way use a bearer token.

Even if it's something that doesn't do any token exchange in the code, and doesn't handle refresh tokens. This could be the user's responsibility. While this wouldn't really support any authentication flows, in a way this would support them all.

I want to use Entra for authentication but I want to do it with a browser redirect, not a device code. This would never work as an part of an extension, there's basically no correct way to do redirects from inside of duckdb(an I want to run on a server, so even more impossible).

Also while some kind of continuous integration testing for a bearer token may be tricky, the scope and quantity of code would be practically nothing, all you're doing is passing through a token.

quentingodeau commented 7 months ago

Hi @samansmink,

I understand your point of view and in reality I do not really care about this PR I personally do not need it at all (I have done it mostly to learn things 🙂). Also maybe I misunderstood but it seen that it was something needed by some cf the previous mentioned issues. The only use case that I'm thinking of is the case when you are running duck in a remote jupyter notebook or something like that.

Nevertheless I think that the capability to have user impersonalization can make sense in some organization. Depending on the sector you are working on, sometimes for security concerns the storage logs are activate and then monitors to check the different access (in case of data leakage).

But yes you are right on the tests part and the fact it will break. I have no counterargument...

quentingodeau commented 7 months ago

Hello @robert-porter,

May I ask why the device code flow is not ok for you? From my point of view I do not see what limitations it hold and that the browser redirection will handles.

robert-porter commented 7 months ago

@quentingodeau

I use duckdb embedded in a cloud application to transform data in other people's storage accounts. Basically the interactive flow is easier for the end user and me, it's just more appropriate in general for my case. The front end is a browser.

Currently there is no possible way to use user impersonation. I just want user impersonation to actually be possible. I'm advocating that if this is getting rejected due to the difficulty of testing, is there like a compromise solution of just getting like a bearer token credential type. This would make every single user impersonation flow type at least possible.

samansmink commented 2 months ago

Hey @quentingodeau! I'm going to close this PR: I think we can conclude that this is really hard to test code that is not worth the hassle to maintain right now