Always check access control before metadata

huw0 commented 4 months ago

I've noticed that Trino often checks database metadata before access control.

This means that when a query is made against an object that the user does not have permission for:

If a user does not have access to the underlying database (e.g when the connector uses impersonation)...
- then a JDBC exception will be returned rather than a Trino access denied message.
- underlying databases will throw misleading access denied audit events. No connection should have taken place.
Where a user does have access to the underlying database, but shouldn't have access in Trino then the error returned is that the table does not exist, rather than access is denied. This means a user can determine database objects that exist but are not visible to them.
Queries that a user does not have permission to run take longer to fail than they should.

Configure a catalog that not visible to a user. The following queries result in database calls showing the issues listed above:

USE hiddencatalog.schemaThatExists;
USE hiddencatalog.schemaThatDoesNotExist;
SELECT * FROM hiddencatalog.schemaThatExists.table;
SELECT * FROM hiddencatalog.schemaThatDoesNotExist.table;

This is not an exhaustive list.

Suggested fixes...

[ ] #22805
[ ] Modify access control ordering in StatementAnalyzer.
[ ] Confirm this does not occur elsewhere? Can tests be written to check for occurrences where access is checked after metadata access?

huw0 commented 4 months ago

I've raised a PR for the simplest of these changes in UseTask. The changes for StatementAnalyzer are quite extensive, so I'd prefer that this is looked at by someone who is more familiar with the Trino internals.

hashhar commented 4 months ago

cc: @dain @kokosing

Where a user does have access to the underlying database, but shouldn't have access in Trino then the error returned is that the table does not exist, rather than access is denied. This means a user can determine database objects that exist but are not visible to them.

I did not understand this. Doesn't throwing "table does not exist" actually prevent someone from enumerating which objects exist which he simply doesn't have access to? If the error was "access is denied" wouldn't they know the existence of the object?

I don't know how @dain considers "leaking existence or absence of object" from security standpoint though.

trinodb / trino

Always check access control before metadata #22804