dain opened this issue 2 years ago (status: Open)
I have a prototype of the basic functionality working, and am working on cleaning this up for submission.
Hi, I used this PR to test DynamicCatalog. However, I ran into a problem after deleting and then re-creating a Hive catalog (like a reload): the first time everything is OK and SQL executes, but the second time the following error occurs when I execute SQL:

java.lang.ClassCastException: class io.trino.plugin.hive.HiveTableHandle cannot be cast to class io.trino.plugin.hive.HiveTableHandle (io.trino.plugin.hive.HiveTableHandle is in unnamed module of loader io.trino.server.PluginClassLoader @408b024e; io.trino.plugin.hive.HiveTableHandle is in unnamed module of loader io.trino.server.PluginClassLoader @aa6aca3)
    at io.trino.plugin.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:150)
    at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:49)
    at io.trino.split.PageSourceManager.createPageSource(PageSourceManager.java:62)
    at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:308)
    at io.trino.operator.Driver.processInternal(Driver.java:411)
    at io.trino.operator.Driver.lambda$process$10(Driver.java:314)
    at io.trino.operator.Driver.tryWithLock(Driver.java:706)
    at io.trino.operator.Driver.process(Driver.java:306)
    at io.trino.operator.Driver.processForDuration(Driver.java:277)
    at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:739)
    at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:164)
    at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:515)
    at io.trino.$gen.Trino_400_100_gc6901d5_dirty____20221025_085036_2.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
So I took two days to read the Trino code, and I found this change: https://github.com/trinodb/trino/pull/1820. I think @dain will face this problem on the workers too, because updating the Hive catalog generates a new classloader, while the io.trino.plugin.hive.HiveTableHandle instances were loaded by the previous classloader.
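For readers unfamiliar with classloader identity: the JVM identifies a class by its fully qualified name plus its defining classloader, so "HiveTableHandle cannot be cast to HiveTableHandle" is expected once two PluginClassLoaders are involved. The following is a minimal, self-contained sketch that reproduces the effect; it uses the JDK compiler API and a stand-in Handle class rather than any Trino code, so all names here are illustrative:

```java
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClassLoaderDemo
{
    // Compiles a tiny stand-in class, loads it with two separate loaders, and
    // returns {namesAreEqual, secondLoaderRecognizesInstanceFromFirst}
    public static boolean[] loadWithTwoLoaders()
            throws Exception
    {
        Path dir = Files.createTempDirectory("classloader-demo");
        Files.writeString(dir.resolve("Handle.java"), "public class Handle {}");
        // Requires a JDK: compiles Handle.java next to its source file
        ToolProvider.getSystemJavaCompiler().run(null, null, null, dir.resolve("Handle.java").toString());

        URL[] classpath = {dir.toUri().toURL()};
        try (URLClassLoader firstLoader = new URLClassLoader(classpath, null);
                URLClassLoader secondLoader = new URLClassLoader(classpath, null)) {
            Class<?> first = firstLoader.loadClass("Handle");
            Class<?> second = secondLoader.loadClass("Handle");
            Object handle = first.getDeclaredConstructor().newInstance();
            // Same fully qualified name, but different defining loaders: casting
            // `handle` to `second` would throw ClassCastException, as in the report above
            return new boolean[] {first.getName().equals(second.getName()), second.isInstance(handle)};
        }
    }

    public static void main(String[] args)
            throws Exception
    {
        boolean[] result = loadWithTwoLoaders();
        System.out.println("same name: " + result[0]);
        System.out.println("instance of the other loader's class: " + result[1]);
    }
}
```

This mirrors what happens on the worker: a serialized HiveTableHandle is deserialized against the new catalog's classloader while code compiled against the old one is still running.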
Hi, @dain , @Unclecc .
I am also interested in Dynamic Catalog, so I am trying to implement my own based on this PR. I reproduced the issue @Unclecc mentioned above in the same way, but I don't think the cause is the one you mentioned.
I solved the problem by slightly modifying AbstractTypedJacksonModule.InternalTypeDeserializer. The AsPropertyTypeDeserializer object caches the classloader for the class it parses the first time, so on the second call it does not use the classloader swapped in during the dynamic catalog reload; it keeps using the old one. Since this was for simple test purposes, I changed the method to create a new AsPropertyTypeDeserializer object on every call, as shown below, and confirmed that it works.
private static class InternalTypeDeserializer<T>
        extends StdDeserializer<T>
{
    private final TypeIdResolver typeIdResolver;
    private final Class<T> baseClass;

    public InternalTypeDeserializer(Class<T> baseClass, TypeIdResolver typeIdResolver)
    {
        super(baseClass);
        this.typeIdResolver = typeIdResolver;
        this.baseClass = baseClass;
    }

    @SuppressWarnings("unchecked")
    @Override
    public T deserialize(JsonParser jsonParser, DeserializationContext deserializationContext)
            throws IOException
    {
        // Build a fresh AsPropertyTypeDeserializer on every call so it never
        // holds on to a Class resolved through a stale plugin classloader
        TypeDeserializer typeDeserializer = new AsPropertyTypeDeserializer(
                TypeFactory.defaultInstance().constructType(baseClass),
                typeIdResolver,
                TYPE_PROPERTY,
                false,
                null);
        return (T) typeDeserializer.deserializeTypedFromAny(jsonParser, deserializationContext);
    }
}
@Unclecc, if you apply the above code and test it, I think it will work. @dain, if you have the same problem, I hope the above solution helps.
I'm not good at English... Please understand if the context is strange.
Hi, I am also interested in Dynamic Catalog. When will this feature be available?
Hey @Gqyanxin and anyone else that stumbles upon this issue. Keep in mind that this is an open-source project and some of this work has to happen during the developers' free time.
The best you can do is follow this issue and the related to get a sense of how close they are.
To get a sense of why this issue may take some time, see the discussion on this during the Trino Summit. If Dain has an update on when to expect this, he will provide them here.
Hi,
At ForePaaS, we are currently testing this new feature, which is proving to be very helpful for our use case. I have started some work on a JdbcCatalogStore inspired by the FileCatalogStore. If this is relevant to the main Trino open source project, I will create a merge request as soon as possible to discuss the subject with you. Does this correspond to your follow-up item "SPI for catalog storage"?
My goal is to have dynamic catalogs with persistent data, but without directly using the disk. Thanks,
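To make the idea concrete, here is a sketch of the shape such a database-backed store could take. This is not the actual Trino CatalogStore SPI; all names are hypothetical, and an in-memory map stands in for the JDBC table a real JdbcCatalogStore would use:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class JdbcCatalogStoreSketch
{
    // One "row" per catalog: name -> (connector name, connector properties)
    public record StoredCatalog(String connectorName, Map<String, String> properties) {}

    private final Map<String, StoredCatalog> table = new ConcurrentHashMap<>();

    public void createCatalog(String name, String connectorName, Map<String, String> properties)
    {
        // A real implementation would INSERT and rely on a primary key on `name`
        if (table.putIfAbsent(name, new StoredCatalog(connectorName, Map.copyOf(properties))) != null) {
            throw new IllegalArgumentException("Catalog already exists: " + name);
        }
    }

    public Optional<StoredCatalog> getCatalog(String name)
    {
        return Optional.ofNullable(table.get(name));
    }

    public void removeCatalog(String name)
    {
        table.remove(name);
    }
}
```

The arbitrary Map<String, String> of properties mentioned in the issue description maps naturally onto a single serialized column or a child table keyed by catalog name.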
@dain ^^
What's the progress now? I want to test it.
@dain Any update on when the pending items for the MVP will be done? We have been eagerly waiting for this to be finished for a usable dynamic catalog capability. Please let us know.
There is related work ongoing in https://github.com/trinodb/trino/issues/15921. Without it, you can have dynamic catalogs, but some of the most important connectors, Hive (and Iceberg), won't be usable with them.
I need this feature. Can it be used yet?
It would be useful for implementing k8s CRD-based catalogs, and beyond that, a Trino k8s operator.
Hey, did you guys figure out the dynamic catalog? I am trying to make an API just for that, but I am missing the specific user permission that will allow me to do so.
@dain A question: the CatalogVersion javadoc says:

/**
 * Version of a catalog. The string maybe compared lexicographically using ASCII, and to determine which catalog version is newer.
 */

Right now none of the implementations actually adhere to this, and nor do we use CatalogVersion#compareTo anywhere.
For example, in FileCatalogStore#computeCatalogVersion we break the contract by returning a length-prefixed hash of the catalog name, connector name, and properties. We also have a javadoc there which says:

/**
 * This is not a generic, universal, or stable version computation, and can and will change from version to version without warning.
 * For places that need a long term stable version, do not use this code.
 */
This is interesting because the CatalogStore interface always returns the "latest" version of the catalog (concurrency is limited in CatalogDynamicCatalogManager using the catalogsUpdateLock), which means the CatalogVersion is of no use for things like optimistic concurrency control in a database-backed catalog store, for example. This is confusing. Can you please clarify whether CatalogVersion is something the CatalogStore implementation should care about? If not, maybe we should move that computation to the CatalogManager.
Also, a lot of the SPIs around pluggable catalog stores are marked @Experimental at the moment. Do you think it's time to promote them to stable?
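To illustrate the kind of optimistic concurrency control a database-backed store could do if the version were usable for it: a compare-and-set on a monotonically increasing version number, mirroring a SQL "UPDATE catalogs SET ... WHERE name = ? AND version = ?". All names below are hypothetical, not Trino SPI:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class VersionedCatalogStore
{
    public record Entry(long version, Map<String, String> properties) {}

    private final ConcurrentHashMap<String, Entry> catalogs = new ConcurrentHashMap<>();

    public long create(String name, Map<String, String> properties)
    {
        Entry entry = new Entry(1, Map.copyOf(properties));
        if (catalogs.putIfAbsent(name, entry) != null) {
            throw new IllegalStateException("Catalog already exists: " + name);
        }
        return entry.version();
    }

    // Succeeds only if the caller saw the latest version; otherwise the caller
    // lost the race and must re-read and retry (classic optimistic concurrency)
    public boolean update(String name, long expectedVersion, Map<String, String> newProperties)
    {
        Entry current = catalogs.get(name);
        if (current == null || current.version() != expectedVersion) {
            return false;
        }
        return catalogs.replace(name, current, new Entry(expectedVersion + 1, Map.copyOf(newProperties)));
    }
}
```

With the current content-hash versions, no such compare-and-set is possible, which is exactly why the question above matters for database-backed stores.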
Your email has been received. I will deal with it as soon as possible.
In an offline discussion we realised that the CatalogVersion contract doesn't need to define lexicographic ordering, because in the current implementation neither the engine nor the catalog store ever needs to sort catalog versions; they only need the latest catalog version, plus the catalog version to disambiguate catalog handles for the same connector with the same catalog name.
I'll send a PR with the javadoc updates and some code cleanup.
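For illustration, a version computed as a hash over length-prefixed fields (in the spirit of FileCatalogStore#computeCatalogVersion, but not the actual Trino implementation) is already enough for that disambiguation, since any change to the configuration yields a different version string:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.Map;
import java.util.TreeMap;

public final class VersionHash
{
    private VersionHash() {}

    // Length-prefix each field so ("ab", "c") and ("a", "bc") feed the digest differently
    private static void putField(MessageDigest digest, String value)
    {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        digest.update(Integer.toString(bytes.length).getBytes(StandardCharsets.UTF_8));
        digest.update((byte) ':');
        digest.update(bytes);
    }

    public static String computeVersion(String catalogName, String connectorName, Map<String, String> properties)
            throws Exception
    {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        putField(digest, catalogName);
        putField(digest, connectorName);
        // Sort properties so the version does not depend on map iteration order
        for (Map.Entry<String, String> entry : new TreeMap<>(properties).entrySet()) {
            putField(digest, entry.getKey());
            putField(digest, entry.getValue());
        }
        return HexFormat.of().formatHex(digest.digest());
    }
}
```

Note that such a hash has no meaningful ordering, which is consistent with dropping the lexicographic-comparison wording from the javadoc.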
This is an umbrella tracking issue for the Dynamic Catalogs project. As we progress on this project, PRs and sub-issues will be linked back to this issue.
Goal
Add support for adding and removing catalogs without restarting servers or interrupting existing queries. This will depend on connectors being updated to support concurrent versions with the same name (e.g., no JMX names containing just a catalog name). For non-compliant connectors, support a basic model where running queries are killed when the catalog is dropped.
For catalog properties, the current plan is to simply use the existing Map<String, String> of arbitrary properties instead of requiring pre-declaration of typed catalog properties as is done for session and table properties. This is mainly for expedience and backwards compatibility with the existing connector configuration system.
MVP Plan
Follow-up Items