voltrondata / sqlflite

An example Flight SQL Server implementation - with DuckDB and SQLite back-ends.
Apache License 2.0
191 stars 21 forks source link

What is the status of the authentication? #12

Closed pkit closed 9 months ago

pkit commented 9 months ago

As I can see from the code, the auth implementation uses a mix of "basic" and "bearer" auth with jwt tokens. Essentially starting with basic and then sending a jwt on the first success and falling back to jwt subsequently. Is there any spec on that? As far as I can see, the existing JDBC drivers connect fine, which means they do implement the same protocol. Do they? Probably they can get away with always using basic? Issuing 10 hour jwt tokens looks fishy too but that's probably just an example. btw, is TLS required?

pkit commented 9 months ago

Hmm, another interesting question is why AuthHandler is not used and is set to NoOp? It looks like AuthHandler.incoming_headers() should get you the functionality of HeaderAuthServerMiddlewareFactory for free. But maybe it doesn't?

prmoore77 commented 9 months ago

As I can see from the code, the auth implementation uses a mix of "basic" and "bearer" auth with jwt tokens. Essentially starting with basic and then sending a jwt on the first success and falling back to jwt subsequently. Is there any spec on that? As far as I can see, the existing JDBC drivers connect fine, which means they do implement the same protocol. Do they? Probably they can get away with always using basic? Issuing 10 hour jwt tokens looks fishy too but that's probably just an example. btw, is TLS required?

Hi @pkit - the current solution does indeed use a mix of basic and bearer auth. I found/stole/augmented the solution from the Arrow test codebase - here

The solution has been plagued by a bug I couldn't fix until recently that caused auth to permanently fail if the JWT was invalid for any reason (expired, wrong signature, etc.) - so that's why I had the long 10-hour lifetime for the token.

I believe a recent commit, however fixes this issue, so we could now shorten the JWT lifetime (maybe 1 hour? - what do you suggest?). That commit causes the solution to fall-back to basic auth if the bearer auth token is invalid for any reason. After basic auth works, the sever should issue a fresh, new, and valid JWT...

The JDBC drivers seem to work well with the basic then bearer auth solution - I've debugged the server while connecting from DBeaver - and all seems to work well. If you find any issues that I've missed, however - please feel free to open them here.

Regarding TLS - it isn't required by Flight SQL - but I made it mandatory here b/c of the sensitivity of the auth headers being sent by the JDBC/ADBC drivers. I just wanted this solution to be secure. Feel free to fork, however - and disable it if needed. I would like to force it here - as I don't want folks sending plain-text auth headers with this solution...

Thanks for your feedback! I appreciate you trying this out.

prmoore77 commented 9 months ago

@pkit

Hmm, another interesting question is why AuthHandler is not used and is set to NoOp? It looks like AuthHandler.incoming_headers() should get you the functionality of HeaderAuthServerMiddlewareFactory for free. But maybe it doesn't?

Hey @pkit - I'm not sure - this would be something that we should try. While the current solution "works" - I'm definitely open to improving and simplifying it.

pkit commented 9 months ago

So, there's no spec? :)

In my experience databases usually have their own credential management. Thus all that "bolt on" auth is rarely useful.

Anyway, my goal is to implement FlightSQL interfaces for ClickHouse.

  1. Expose ClickHouse as a FlightSQL server.
  2. Add a table function/storage engine to ClickHouse that can talk to other Flight/FlightSQL servers.

So far for ClickHouse, as a real database, none of the "helper" functions in FlightSqlServerBase are really useful. And it looks like I will need to go one level deeper, into the actual grpc server/service/transport layer, to implement all the needed hooks.

prmoore77 commented 9 months ago

So, there's no spec? :)

In my experience databases usually have their own credential management. Thus all that "bolt on" auth is rarely useful.

Anyway, my goal is to implement FlightSQL interfaces for ClickHouse.

  1. Expose ClickHouse as a FlightSQL server.
  2. Add a table function/storage engine to ClickHouse that can talk to other Flight/FlightSQL servers.

So far for ClickHouse, as a real database, none of the "helper" functions in FlightSqlServerBase are really useful. And it looks like I will need to go one level deeper, into the actual grpc server/service/transport layer, to implement all the needed hooks.

I'm not aware of a standard spec for security for Flight SQL. As you said, I think it depends on the underlying Database engines. In the case of DuckDB and SQLite (the engines used in this solution) - there is no auth at all - so this Flight SQL solution actually adds security.

If you are writing a Flight SQL Server solution for a database engine (such as ClickHouse) which has authentication built-in - I think you'll have to research how best to implement security for Flight SQL in such a case. I haven't yet attempted that...

prmoore77 commented 9 months ago

Part of my motivation for adding an auth layer for this example was this issue I opened in the DuckDB repo: https://github.com/duckdb/duckdb/issues/2519

prmoore77 commented 9 months ago

Hmm, another interesting question is why AuthHandler is not used and is set to NoOp? It looks like AuthHandler.incoming_headers() should get you the functionality of HeaderAuthServerMiddlewareFactory for free. But maybe it doesn't?

Good point! I had missed this. I will attempt to build a branch using the built-in auth-handler to see if it functions the same or better. Thanks for pointing it out...