streamnative / pulsar-archived

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org
Apache License 2.0
73 stars 25 forks source link

ISSUE-12552: MaxRedirectException occurs when attempting to delete a subscription that was recently created and then had its topic destroyed #3200

Open sijie opened 2 years ago

sijie commented 2 years ago

Original Issue: apache/pulsar#12552


Describe the bug If I create a tenant/namespace/topic/subscription and then immediately (seconds later) delete and recreate the tenant/namespace/topic, attempting to delete the subscription fails with a 500 response code and a MaxRedirectException. I would expect subscription deletion to fail with a 404 as the subscription should not exist since I just deleted and recreated the tenant/namespace/topic.

To Reproduce This doesn't reproduce 100% of the time. It seems likely to happen more often when it is done on a brand new broker cluster that hasn't ever hosted topics/non-default tenants.

  1. Create a tenant.
  2. Create a namespace in that tenant
  3. Create a persistent, partitioned topic in that namespace.
  4. Attempt to delete a subscription in that topic with a given name, say "foo". DELETE should fail with a 404.
  5. Create that "foo" subscription in that topic and connect at least one consumer to that subscription (no messages need to be produced or consumed).
  6. Immediately thereafter, delete the topic/namespace/tenant.
  7. Do steps 1 through 4 again, with the same-named entities.
  8. Observe that step 4 sometimes (but not always) fails with a stacktrace like the one below.

Expected behavior

  1. Subscription deletion should fail with a 404 or a response that otherwise indicates "there is nothing to delete".
  2. This is probably more important: all operations related to deletion/creation of topics/namespaces/tenants conclude before the management API returns success codes to the user. This bug and the other similar ones I filed (see github links below) all seem to arise from CRUD operations with the management API being asynchronous: i.e. when I create/delete a tenant/topic/namespace, the actual side effects of that creation or deletion (e.g. adding/removing ledgers in BookKeeper, updating metadata in ZK) occur later, not during the API post. Not only is that bound to cause bugs like this, but it's also not what users expect; I would be happy to wait seconds or minutes for management API operations to complete in exchange for knowing that when they successfully complete that the thing I requested has actually been done.

Environment Same environment as https://github.com/apache/pulsar/issues/12551

Stacktrace

URL: DELETE http://pulsar-blt-sn-platform-proxy-headless.pulsar:8080/admin/v2/persistent/blt3/chariot_ns_test/chariot_topic_test/subscription/blt
Headers: Headers({'host': 'pulsar-blt-sn-platform-proxy-headless.pulsar:8080', 'accept': '*/*', 'accept-encoding': 'gzip, deflate', 'connection': 'keep-alive', 'user-agent': 'python-httpx/0.20.0'})
Payload: b''

 --- An unexpected error occurred in the server ---

Message: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5

Stacktrace:

org.apache.pulsar.client.admin.PulsarAdminException: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
    at org.apache.pulsar.client.admin.internal.BaseResource.getApiException(BaseResource.java:247)
    at org.apache.pulsar.client.admin.internal.BaseResource$3.failed(BaseResource.java:184)
    at org.glassfish.jersey.client.JerseyInvocation$1.failed(JerseyInvocation.java:882)
    at org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:247)
    at org.glassfish.jersey.client.ClientRuntime.processFailure(ClientRuntime.java:242)
    at org.glassfish.jersey.client.ClientRuntime.access$100(ClientRuntime.java:62)
    at org.glassfish.jersey.client.ClientRuntime$2.lambda$failure$1(ClientRuntime.java:178)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:288)
    at org.glassfish.jersey.client.ClientRuntime$2.failure(ClientRuntime.java:178)
    at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.lambda$apply$1(AsyncHttpConnector.java:204)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.lambda$retryOperation$4(AsyncHttpConnector.java:263)
    at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
    at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
    at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
    at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
    at org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:273)
    at org.asynchttpclient.netty.request.NettyRequestSender.abort(NettyRequestSender.java:473)
    at org.asynchttpclient.netty.handler.HttpHandler.readFailed(HttpHandler.java:161)
    at org.asynchttpclient.netty.handler.HttpHandler.handleRead(HttpHandler.java:154)
    at org.asynchttpclient.netty.handler.AsyncHttpClientHandler.channelRead(AsyncHttpClientHandler.java:78)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.CompletionException: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
    at java.base/java.util.concurrent.CompletableFuture$OrApply.tryFire(CompletableFuture.java:1503)
    ... 39 more
Caused by: org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector$RetryException: Could not complete the operation. Number of retries has been exhausted. Failed reason: Maximum redirect reached: 5
    at org.apache.pulsar.client.admin.internal.http.AsyncHttpConnector.lambda$retryOperation$4(AsyncHttpConnector.java:265)
    ... 36 more
Caused by: org.asynchttpclient.handler.MaxRedirectException: Maximum redirect reached: 5
github-actions[bot] commented 2 years ago

The issue had no activity for 30 days, mark with Stale label.