Open boneanxs opened 3 years ago
If i read correctly, the question is whether we should additionally have
client.shutdown();
in io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastoreClient#close
.
Do you know what this would can change in practice for the metastore side? The connection is going to be closed anyway.
Yeah, I think we should call client.shutdown() explicitly in presto side. After client side call shutdown, HMS will shutdown rawStore(which remove PersistenceManager from PersistenceManagerFactory's cache), remove threadlocal variables.
You can find relate codes here: HMSHandler.java#L991, HMSHandler.java#L181
If we don't call shutdown, after HMSHandler is released, as PersistenceManager is not removed from cache, it would never be garbage collected, which could cause memory leak from HMS side(If the hive version is lower than 2.1). You can find a similar issue HIVE-14187(Client side not call MetaStoreClient#close, which calls shutdown to remove all cache)
Thanks for the pointers.
Seems we can call client.shutdown()
and we should ignore (probably logging) any errors resulting from it.
Is it possible that the call blocks though, or takes long? Especially for HMS >= 2.1, which doesn't need the call?
Thank you for raising your concerns.
After check the HMS codes, I don't think it could block the call, also, spark will always call shutdown when it disconnects with HMS.
I think there could be another issue we should take into consideration. We upgrade HMS version to 3.1 in our production environment, and found presto become slower when build connection with HMS. After traced some methods, it showed that HMS need to initialize rawStore when new connections coming(as lower version of HMS will not auto close rawStore, neither from presto side, so the rawStore could be cached into threadLocal and be reused).
Anyway, I think we should call client.shutdown from presto side first, and for the performance issue above, we can construct a connection pool with HMS from presto side, or raise a ticket for Hive, let them fix it.
I think there could be another issue we should take into consideration. We upgrade HMS version to 3.1 in our production environment, and found presto become slower when build connection with HMS. After traced some methods, it showed that HMS need to initialize rawStore when new connections coming(as lower version of HMS will not auto close rawStore, neither from presto side, so the rawStore could be cached into threadLocal and be reused).
cc @kokosing @electrum
@kokosing @electrum gentle ping...
We have a abstract layer, ThriftHiveMetastoreClient to wrap hive metastore client, but it seems not close hive metastore client correctly. In HiveMetaStoreClient, it first call shutdown to clean rawStore and some other threadlocal configures. can see here: https://github.com/apache/hive/blob/817ff27e96ecfce6c70c5850830e55a6e6f37da6/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java#L824 but we only call transport.close() to close port
can see: https://github.com/trinodb/trino/blob/fd878a6a71cb485acae9be4dbbb20c29147427a9/plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/thrift/ThriftHiveMetastoreClient.java#L107
Are there any other concerns we close client like this, or might just a mistake?