trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
https://trino.io
Apache License 2.0
10.49k stars 3.02k forks source link

system.jdbc Tables not following visibility rules #20864

Open brendanstennett opened 8 months ago

brendanstennett commented 8 months ago

We're noticing that once file-based access control rules are being set, the JDBC driver starts failing for Trino citing errors accessing jdbc system tables. I understand this is happening because as soon as file-based access control rules are being set, Trino goes from Allow All to Deny All mode by default and only gives access to tables that have rules associated with them.

As soon as you create a rule like the below JDBC starts to work again:

{
  "catalog": "system",
  "schema": "jdbc",
  "table": "(attributes|procedure_columns|procedures|pseudo_columns|super_tables|super_types|table_types|types|udts|catalogs|schemas|tables|columns)",
  "privileges": ["SELECT"]
}

However, this produces an information leak when creating a very permissive grant all for these tables because any user can now query these system tables to get information on which catalogs, schemas, tables, and columns exist in the entire connected ecosystem, regardless of the file-based access control rules being set.

The only way to get around this information leak would be to modify the above rule like so:

{
  "catalog": "system",
  "schema": "jdbc",
  "table": "(attributes|procedure_columns|procedures|pseudo_columns|super_tables|super_types|table_types|types|udts)",
  "privileges": [
    "SELECT"
  ]
},
{
  "catalog": "system",
  "schema": "jdbc",
  "table": "catalogs",
  "privileges": [
    "SELECT"
  ],
  "group": "some-group",
  "filter": "table_cat IN ('mycatalog')"
},
{
  "catalog": "system",
  "schema": "jdbc",
  "table": "schemas",
  "privileges": [
    "SELECT"
  ],
  "group": "some-group",
  "filter": "(table_catalog='mycatalog' AND table_schem IN ('public', ...))"
},
{
  "catalog": "system",
  "schema": "jdbc",
  "table": "tables",
  "privileges": [
    "SELECT"
  ],
  "group": "some-group",
  "filter": "(table_cat='mycatalog' AND table_schem='public' AND table_name IN ('mytable', ...))"
},
{
  "catalog": "system",
  "schema": "jdbc",
  "table": "columns",
  "privileges": [
    "SELECT"
  ],
  "group": "some-group",
  "filter": "(table_cat='mycatalog' AND table_schem='public' AND table_name='mytable' AND column_name IN ('col1','col2', ...))"
}

This is somewhat unexpected behaviour as it requires duplicate permission sets to be generated just to support JDBC. It's also problematic because the filter expressions for tables and columns could get unruly pretty quickly as the total number of tables and columns in the system grows.

My question is:

  1. Is this intended behaviour?
  2. Am I missing anything or have something incorrectly configured that requires the above to function?

Similar tickets: @ilsaloving Looks to be reporting a similar issues #17689 but the suggested fix was to add an empty schema stanza which looks to solve the issue reported there for general visibility but does not solve this jdbc issue.

ilsaloving commented 8 months ago

Actually, it doesn't solve the problem, as you've demonstrated. At this point I'm not sure I can even continue with trino, because I specifically wanted to use it to have a single method of permissions management instead of dealing with each individual DB.