Open vburenin opened 3 days ago
After digging deeper, I found the parameter metastoreSupportsTableMeta, which is hardcoded to true:
protected ThriftMetastoreClient create(TransportSupplier transportSupplier, String hostname)
        throws TTransportException
{
    return new ThriftHiveMetastoreClient(
            transportSupplier,
            hostname,
            catalogName,
            metastoreSupportsDateStatistics,
            true,
            chosenGetTableAlternative,
            chosenAlterTransactionalTableAlternative,
            chosenAlterPartitionsAlternative);
}
Later on it is used in ThriftHiveMetastoreClient:
@Override
public List<TableMeta> getTableMeta(String databaseName)
        throws TException
{
    // TODO: remove this once Unity adds support for getTableMeta
    if (!metastoreSupportsTableMeta) {
        String catalogDatabaseName = prependCatalogToDbName(catalogName, databaseName);
        Map<String, TableMeta> tables = new HashMap<>();
        client.getTables(catalogDatabaseName, ".*").forEach(name -> tables.put(name, new TableMeta(databaseName, name, RelationType.TABLE.toString())));
        client.getTablesByType(catalogDatabaseName, ".*", VIRTUAL_VIEW.name()).forEach(name -> {
            TableMeta tableMeta = new TableMeta(databaseName, name, VIRTUAL_VIEW.name());
            // This makes all views look like a Trino view, so that they are not filtered out during SHOW VIEWS
            tableMeta.setComments(PRESTO_VIEW_COMMENT);
            tables.put(name, tableMeta);
        });
        return ImmutableList.copyOf(tables.values());
    }
    if (databaseName.indexOf('*') >= 0 || databaseName.indexOf('|') >= 0) {
        // in this case we replace any pipes with a glob and then filter the output
        return client.getTableMeta(prependCatalogToDbName(catalogName, databaseName.replace('|', '*')), "*", ImmutableList.of()).stream()
                .filter(tableMeta -> tableMeta.getDbName().equals(databaseName))
                .collect(toImmutableList());
    }
    return client.getTableMeta(prependCatalogToDbName(catalogName, databaseName), "*", ImmutableList.of());
}
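For anyone skimming, the fallback branch above boils down to two name-only listings merged into one map. Here is a minimal standalone sketch of that merge, with no Thrift client involved. The class name, method name, and the use of plain strings for the relation type are my own placeholders, not anything from the Trino code base:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TableMetaFallbackSketch
{
    // Hypothetical stand-in for the fallback branch: allNames plays the role of
    // the getTables(...) result, viewNames the role of getTablesByType(..., VIRTUAL_VIEW).
    static Map<String, String> mergeTablesAndViews(List<String> allNames, List<String> viewNames)
    {
        Map<String, String> typesByName = new LinkedHashMap<>();
        // First pass: record every name as a plain table
        for (String name : allNames) {
            typesByName.put(name, "TABLE");
        }
        // Second pass: names also returned by the view listing overwrite the first entry
        for (String name : viewNames) {
            typesByName.put(name, "VIRTUAL_VIEW");
        }
        return typesByName;
    }

    public static void main(String[] args)
    {
        Map<String, String> result = mergeTablesAndViews(
                List.of("orders", "customers", "orders_v"),
                List.of("orders_v"));
        System.out.println(result);
    }
}
```

The point is that this path only deals with names and a type tag, keyed by table name so that a view replaces the table entry of the same name.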
Once I changed that value to false, everything went back to normal.
I think the TODO is beside the point: this flag should be exposed as a configuration option, otherwise large schemas become unusable.
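To make the suggestion concrete, here is a rough sketch of reading such a flag from configuration instead of hardcoding it. The property name and class are hypothetical (I am not claiming Trino has this key), and it uses plain java.util.Properties rather than Trino's own config classes:

```java
import java.util.Properties;

public class MetastoreTableMetaConfigSketch
{
    // Hypothetical property name, not an existing Trino configuration key
    static final String SUPPORTS_TABLE_META = "example.metastore.supports-table-meta";

    // Defaults to true when the property is absent, matching the currently hardcoded value
    static boolean supportsTableMeta(Properties properties)
    {
        return Boolean.parseBoolean(properties.getProperty(SUPPORTS_TABLE_META, "true"));
    }

    public static void main(String[] args)
    {
        Properties properties = new Properties();
        properties.setProperty(SUPPORTS_TABLE_META, "false");
        // With the override in place, the caller would take the getTables-based path
        System.out.println(supportsTableMeta(properties));
    }
}
```

Defaulting to true keeps current behavior for everyone else, while deployments with very large schemas could opt out.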
I am finalizing a migration from Trino 419 to Trino 464 and running into problems listing a large number of tables, close to 200k in a single schema. My timeouts are set to 300s. Trino 419 returns the result within a couple of seconds.
The problem appears to be a change in ThriftHiveMetastore in how table metadata is retrieved:
In Trino 419 the method is called differently and also invokes a different method.