trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0
9.94k stars 2.87k forks

Optimize SHOW TABLES #631

Open findepi opened 5 years ago

findepi commented 5 years ago

When using SHOW TABLES against the Hive connector, the Metastore is queried twice. This is because

Since SHOW TABLES does not differentiate between tables and views, we could do this faster.

kokosing commented 5 years ago

Does it actually do a call to HMS? Are you sure that results are not taken from cache?

Praveen2112 commented 5 years ago

But if the cache is disabled, it would fetch from the HMS, right?

kokosing commented 5 years ago

Transactional cache is always enabled. We do not query HMS twice for the same thing within the same transaction.
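
To illustrate what "not querying twice for the same thing" could mean, here is a minimal sketch of a transaction-scoped memoizing cache. It is not the Hive connector's actual implementation: `TransactionScopedCache`, the string keys, and the loader function are all hypothetical stand-ins for a real metastore client.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: one instance is assumed to live for exactly one transaction.
class TransactionScopedCache {
    // Stand-in for a remote metastore call, keyed by a request description.
    private final Function<String, List<String>> loader;
    private final Map<String, List<String>> cache = new HashMap<>();

    TransactionScopedCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    // The loader runs only on the first request for a given key.
    List<String> get(String requestKey) {
        return cache.computeIfAbsent(requestKey, loader);
    }

    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        TransactionScopedCache tx = new TransactionScopedCache(key -> {
            remoteCalls.incrementAndGet();
            return List.of("orders", "customers");
        });
        tx.get("tables:default"); // goes to the (fake) metastore
        tx.get("tables:default"); // served from the cache
        tx.get("views:default");  // a different request, so another remote call
        System.out.println("remote calls: " + remoteCalls.get());
    }
}
```

Note that a cache of this shape only deduplicates identical requests. Two different requests, such as listing tables and listing views, still result in two remote calls.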

findepi commented 5 years ago

@kokosing there are 2 different calls to HMS. The one that we don't need is this one: https://github.com/prestosql/presto/blob/8003a11caaea40ce349ef33746c8537c35619122/presto-hive/src/main/java/io/prestosql/plugin/hive/metastore/thrift/ThriftHiveMetastore.java#L726-L727

findepi commented 5 years ago

Probably the best fix would be to make the information schema aware of the selected columns. When table_type from information_schema.tables is not needed, InformationSchemaPageSourceProvider#buildTables wouldn't need to list views.
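
A rough sketch of that idea, under stated assumptions: `FakeMetastore`, its `listTables`/`listViews` methods, and this `buildTables` signature are hypothetical and are not the real `InformationSchemaPageSourceProvider` API. The sketch also assumes that the table listing already includes views, so the separate view listing is only needed to fill in `table_type`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these types are the engine's real classes.
class ColumnAwareTableListing {
    // Stand-in for a metastore client that counts remote calls.
    static class FakeMetastore {
        int calls = 0;
        private final List<String> relations; // assumed to include views
        private final Set<String> views;

        FakeMetastore(List<String> relations, Set<String> views) {
            this.relations = relations;
            this.views = views;
        }

        List<String> listTables() {
            calls++;
            return relations;
        }

        Set<String> listViews() {
            calls++;
            return views;
        }
    }

    // Maps relation name to table_type; the type is null when it was not requested.
    static Map<String, String> buildTables(FakeMetastore metastore, Set<String> requestedColumns) {
        boolean needType = requestedColumns.contains("table_type");
        // Skip the view listing entirely when table_type is not selected.
        Set<String> views = needType ? metastore.listViews() : Set.of();
        Map<String, String> rows = new LinkedHashMap<>();
        for (String name : metastore.listTables()) {
            rows.put(name, needType ? (views.contains(name) ? "VIEW" : "BASE TABLE") : null);
        }
        return rows;
    }

    public static void main(String[] args) {
        FakeMetastore metastore = new FakeMetastore(
                List.of("orders", "orders_view"), Set.of("orders_view"));
        buildTables(metastore, Set.of("table_name"));
        System.out.println("calls without table_type: " + metastore.calls);
    }
}
```

In this sketch a request that selects only `table_name` costs one metastore call, while a request that also selects `table_type` costs two.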

slinjhu commented 6 months ago

It takes forever when I do "SHOW TABLES" for ~6k tables. Any update on the issue?