apache / gravitino

World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
https://gravitino.apache.org
Apache License 2.0

[Bug report] Creating a topic throws NoSuchTopicException when Kafka is deployed with 3 brokers on EKS #4168

Closed danhuawang closed 1 month ago

danhuawang commented 1 month ago

Version

main branch

Describe what's wrong

curl -X POST -H "Accept: application/vnd.gravitino.v1+json" -H "Content-Type: application/json" -d '{
"name": "example_topic_99",
"comment": "This is an example topic 99",
"properties": {
"partition-count": "3",
"replication-factor": 1
}}' http://ae68c1d2b596b4d0ebf2ea7902c0e2e7-239867120.ap-northeast-1.elb.amazonaws.com:8090/api/metalakes/test/catalogs/k2/schemas/default/topics

{"code":1003,"type":"NoSuchTopicException","message":"Failed to operate topic(s) [example_topic_99] operation [CREATE] under schema [default], reason [Topic test.k2.default.example_topic_99 does not exist]","stack":["com.datastrato.gravitino.exceptions.NoSuchTopicException: Topic test.k2.default.example_topic_99 does not exist","\tat com.datastrato.gravitino.catalog.kafka.KafkaCatalogOperations.loadTopic(KafkaCatalogOperations.java:210)","\tat com.datastrato.gravitino.catalog.TopicOperationDispatcher.lambda$createTopic$8(TopicOperationDispatcher.java:155)","\tat com.datastrato.gravitino.catalog.CatalogManager$CatalogWrapper.lambda$doWithTopicOps$3(CatalogManager.java:148)","\tat com.datastrato.gravitino.utils.IsolatedClassLoader.withClassLoader(IsolatedClassLoader.java:86)","\tat com.datastrato.gravitino.catalog.CatalogManager$CatalogWrapper.doWithTopicOps(CatalogManager.java:143)","\tat com.datastrato.gravitino.catalog.TopicOperationDispatcher.lambda$createTopic$9(TopicOperationDispatcher.java:155)","\tat com.datastrato.gravitino.catalog.OperationDispatcher.doWithCatalog(OperationDispatcher.java:115)","\tat com.datastrato.gravitino.catalog.TopicOperationDispatcher.createTopic(TopicOperationDispatcher.java:153)","\tat com.datastrato.gravitino.catalog.TopicNormalizeDispatcher.createTopic(TopicNormalizeDispatcher.java:70)","\tat com.datastrato.gravitino.listener.TopicEventDispatcher.createTopic(TopicEventDispatcher.java:132)","\tat com.datastrato.gravitino.server.web.rest.TopicOperations.lambda$createTopic$2(TopicOperations.java:126)","\tat com.datastrato.gravitino.lock.TreeLockUtils.doWithTreeLock(TreeLockUtils.java:49)","\tat com.datastrato.gravitino.server.web.rest.TopicOperations.lambda$createTopic$3(TopicOperations.java:122)","\tat java.security.AccessController.doPrivileged(Native Method)","\tat javax.security.auth.Subject.doAs(Subject.java:422)","\tat com.datastrato.gravitino.utils.PrincipalUtils.doAs(PrincipalUtils.java:39)","\tat 
com.datastrato.gravitino.server.web.Utils.doAs(Utils.java:135)","\tat com.datastrato.gravitino.server.web.rest.TopicOperations.createTopic(TopicOperations.java:108)","\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)","\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)","\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)","\tat java.lang.reflect.Method.invoke(Method.java:498)","\tat org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)","\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)","\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)","\tat org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)","\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)","\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)","\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)","\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)","\tat org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256)","\tat org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)","\tat org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)","\tat org.glassfish.jersey.internal.Errors.process(Errors.java:292)","\tat org.glassfish.jersey.internal.Errors.process(Errors.java:274)","\tat org.glassfish.jersey.internal.Errors.process(Errors.java:244)","\tat 
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)","\tat org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235)","\tat org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)","\tat org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394)","\tat org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346)","\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)","\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:311)","\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)","\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)","\tat org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656)","\tat com.datastrato.gravitino.server.authentication.AuthenticationFilter.doFilter(AuthenticationFilter.java:73)","\tat org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)","\tat org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)","\tat com.datastrato.gravitino.server.web.VersioningFilter.doFilter(VersioningFilter.java:111)","\tat org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)","\tat org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1626)","\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:552)","\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)","\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)","\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)","\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)","\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)","\tat 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)","\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)","\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)","\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:505)","\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)","\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)","\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)","\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)","\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)","\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)","\tat org.eclipse.jetty.server.Server.handle(Server.java:516)","\tat org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)","\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)","\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)","\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)","\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)","\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)","\tat org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)","\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)","\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)","\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)","\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)","\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)","\tat 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)","\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)","\tat java.lang.Thread.run(Thread.java:750)","Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition.","\tat java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)","\tat java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)","\tat org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165)","\tat com.datastrato.gravitino.catalog.kafka.KafkaCatalogOperations.loadTopic(KafkaCatalogOperations.java:199)","\t... 83 more","Caused by: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition."]}

Error message and/or stacktrace

Logs in Gravitino:

2024-07-16 07:44:12.911 INFO [Gravitino-webserver-43] [com.datastrato.gravitino.server.web.rest.TopicOperations.createTopic(TopicOperations.java:106)] - Received create topic request: test.k2.default
2024-07-16 07:44:12.913 INFO [Gravitino-webserver-43] [com.datastrato.gravitino.server.web.rest.TopicOperations.lambda$createTopic$3(TopicOperations.java:111)] - Creating topic under schema: test.k2.default.example_topic_99
2024-07-16 07:44:12.916 INFO [kafka-admin-client-thread | gravitino.v1.uid3781668427208118404-test.k2] [org.apache.kafka.clients.NetworkClient.handleDisconnections(NetworkClient.java:937)] - [AdminClient clientId=gravitino.v1.uid3781668427208118404-test.k2] Node 1 disconnected.
2024-07-16 07:44:12.954 INFO [Gravitino-webserver-43] [com.datastrato.gravitino.catalog.kafka.KafkaCatalogOperations.createTopic(KafkaCatalogOperations.java:244)] - Created topic test.k2.default.example_topic_99[id: 3TmoDlEzTT-CX0YnPZpqPg] with 3 partitions and replication factor 1
2024-07-16 07:44:12.954 WARN [Gravitino-webserver-43] [com.datastrato.gravitino.StringIdentifier.newPropertiesWithId(StringIdentifier.java:114)] - Property gravitino.identifier:gravitino.v1.uid4201916465216404837 already existed in the properties, this is unexpected, we will ignore adding the identifier to the properties
2024-07-16 07:44:12.959 WARN [Gravitino-webserver-43] [com.datastrato.gravitino.server.web.rest.ExceptionHandlers$TopicExceptionHandler.handle(ExceptionHandlers.java:424)] - Failed to operate topic(s) [example_topic_99] operation [CREATE] under schema [default], reason [Topic test.k2.default.example_topic_99 does not exist]
com.datastrato.gravitino.exceptions.NoSuchTopicException: Topic test.k2.default.example_topic_99 does not exist
        at com.datastrato.gravitino.catalog.kafka.KafkaCatalogOperations.loadTopic(KafkaCatalogOperations.java:210) ~[?:?]
        at com.datastrato.gravitino.catalog.TopicOperationDispatcher.lambda$createTopic$8(TopicOperationDispatcher.java:155) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.CatalogManager$CatalogWrapper.lambda$doWithTopicOps$3(CatalogManager.java:148) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.utils.IsolatedClassLoader.withClassLoader(IsolatedClassLoader.java:86) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.CatalogManager$CatalogWrapper.doWithTopicOps(CatalogManager.java:143) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.TopicOperationDispatcher.lambda$createTopic$9(TopicOperationDispatcher.java:155) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.OperationDispatcher.doWithCatalog(OperationDispatcher.java:115) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.TopicOperationDispatcher.createTopic(TopicOperationDispatcher.java:153) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.catalog.TopicNormalizeDispatcher.createTopic(TopicNormalizeDispatcher.java:70) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.listener.TopicEventDispatcher.createTopic(TopicEventDispatcher.java:132) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.server.web.rest.TopicOperations.lambda$createTopic$2(TopicOperations.java:126) ~[gravitino-server-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.lock.TreeLockUtils.doWithTreeLock(TreeLockUtils.java:49) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.server.web.rest.TopicOperations.lambda$createTopic$3(TopicOperations.java:122) ~[gravitino-server-0.6.0-SNAPSHOT.jar:?]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_342]
        at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_342]
        at com.datastrato.gravitino.utils.PrincipalUtils.doAs(PrincipalUtils.java:39) ~[gravitino-core-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.server.web.Utils.doAs(Utils.java:135) ~[gravitino-server-0.6.0-SNAPSHOT.jar:?]
        at com.datastrato.gravitino.server.web.rest.TopicOperations.createTopic(TopicOperations.java:108) ~[gravitino-server-0.6.0-SNAPSHOT.jar:?]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_342]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_342]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_342]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_342]
        at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:292) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:274) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:244) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) ~[jersey-common-2.41.jar:?]
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684) ~[jersey-server-2.41.jar:?]
        at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:394) ~[jersey-container-servlet-core-2.41.jar:?]
        at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:346) ~[jersey-container-servlet-core-2.41.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358) ~[jersey-container-servlet-core-2.41.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:311) ~[jersey-container-servlet-core-2.41.jar:?]
        at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205) ~[jersey-container-servlet-core-2.41.jar:?]
        at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799) ~[jetty-servlet-9.4.51.v20230217.jar:9.4.51.v20230217]
        at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1656) ~[jetty-servlet-9.4.51.v20230217.jar:9.4.51.v20230217]
        at com.datastrato.gravitino.server.authentication.AuthenticationFilter.doFilter(AuthenticationFilter.java:73) ~[gravitino-server-common-0.6.0-SNAPSHOT.jar:?]
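
Note on the log above: the topic is reported as created at 07:44:12.954, and the follow-up `loadTopic` call fails about 5 ms later with `UnknownTopicOrPartitionException` ("This server does not host this topic-partition."). One possible reading, which is an assumption and not a confirmed diagnosis, is that with several brokers the lookup can reach a broker that has not yet seen the new topic's metadata. A minimal sketch of retrying the lookup with a bounded backoff is shown below. It does not use the Kafka client: `TopicNotVisibleYetException`, `loadWithRetry`, and the simulated lookup in `main` are hypothetical stand-ins for illustration, not Gravitino or Kafka APIs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryAfterCreate {

  /** Hypothetical stand-in for Kafka's UnknownTopicOrPartitionException. */
  static class TopicNotVisibleYetException extends RuntimeException {
    TopicNotVisibleYetException(String message) {
      super(message);
    }
  }

  /**
   * Calls the lookup until it succeeds or maxAttempts is reached.
   * Only the "not visible yet" error is retried; any other exception
   * propagates immediately. The wait doubles after each failed attempt.
   */
  static <T> T loadWithRetry(Callable<T> lookup, int maxAttempts, long initialBackoffMs)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long backoffMs = initialBackoffMs;
    for (int attempt = 1; ; attempt++) {
      try {
        return lookup.call();
      } catch (TopicNotVisibleYetException e) {
        if (attempt >= maxAttempts) {
          throw e; // give up and surface the last error
        }
        Thread.sleep(backoffMs);
        backoffMs *= 2;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicInteger calls = new AtomicInteger();

    // Simulated lookup: fails twice, then returns the topic name.
    String topic = loadWithRetry(() -> {
      if (calls.incrementAndGet() < 3) {
        throw new TopicNotVisibleYetException(
            "This server does not host this topic-partition.");
      }
      return "example_topic_99";
    }, 5, 10);

    System.out.println(topic + " visible after " + calls.get() + " attempts");
    // prints "example_topic_99 visible after 3 attempts"
  }
}
```

Bounding the number of attempts keeps a genuinely missing topic from blocking the request indefinitely, while still tolerating a short delay between creation and visibility.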

How to reproduce

  1. Install Kafka on a K8s cluster with Helm
  2. Create a messaging catalog as follows: (screenshot omitted)
  3. Create a topic in the default schema:
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" -H "Content-Type: application/json" -d '{
"name": "example_topic_99",
"comment": "This is an example topic 99",
"properties": {
"partition-count": "3",
"replication-factor": 1
}}' http://ae68c1d2b596b4d0ebf2ea7902c0e2e7-239867120.ap-northeast-1.elb.amazonaws.com:8090/api/metalakes/test/catalogs/k2/schemas/default/topics

Additional context

No response

mchades commented 1 month ago

It's the same as #3496.

mchades commented 1 month ago

I'm going to close this issue as it duplicates #3496.