trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.49k stars 3.02k forks source link

Always check access control before metadata #22804

Open huw0 opened 4 months ago

huw0 commented 4 months ago

I've noticed that Trino often checks database metadata before access control.

This means that when a query is made against an object that the user does not have permission for:

  1. If a user does not have access to the underlying database (e.g when the connector uses impersonation)...
    • then a JDBC exception will be returned rather than a Trino access denied message.
    • underlying databases will throw misleading access denied audit events. No connection should have taken place.
  2. Where a user does have access to the underlying database, but shouldn't have access in Trino then the error returned is that the table does not exist, rather than access is denied. This means a user can determine database objects that exist but are not visible to them.
  3. Queries that a user does not have permission to run take longer to fail than they should.

Configure a catalog that not visible to a user. The following queries result in database calls showing the issues listed above:

USE hiddencatalog.schemaThatExists;
USE hiddencatalog.schemaThatDoesNotExist;
SELECT * FROM hiddencatalog.schemaThatExists.table;
SELECT * FROM hiddencatalog.schemaThatDoesNotExist.table;

This is not an exhaustive list.


Suggested fixes...

huw0 commented 4 months ago

I've raised a PR for the simplest of these changes in UseTask. The changes for StatementAnalyzer are quite extensive, so I'd prefer that this is looked at by someone who is more familiar with the Trino internals.

hashhar commented 4 months ago

cc: @dain @kokosing

Where a user does have access to the underlying database, but shouldn't have access in Trino then the error returned is that the table does not exist, rather than access is denied. This means a user can determine database objects that exist but are not visible to them.

I did not understand this. Doesn't throwing "table does not exist" actually prevent someone from enumerating which objects exist which he simply doesn't have access to? If the error was "access is denied" wouldn't they know the existence of the object?

I don't know how @dain considers "leaking existence or absence of object" from security standpoint though.