dapr / components-contrib

Community driven, reusable components for distributed apps
Apache License 2.0

Issue with Dapr 1.12.4 PubSub and KubeMQ #3343

Closed GFlisch closed 8 months ago

GFlisch commented 9 months ago

In what area(s)?

/area runtime on AKS

What version of Dapr?

1.12.4

Expected Behavior

Messages can be published reliably over time (stability).

Actual Behavior

I am using Daprd 1.12.4 in an AKS cluster with KubeMQ as the messaging platform.
The services use Dapr to publish and subscribe to messages.

When I start the pod, everything works well.
After a while (I have not been able to determine how long), sending a message via the PublishEventAsync method fails with the following error:

Publish operation failed: the Dapr endpoint indicated a failure. See InnerException for details.

Dapr.DaprException: Publish operation failed: the Dapr endpoint indicated a failure. See InnerException for details.
 ---> Grpc.Core.RpcException: Status(StatusCode="Internal", Detail="error when publish to topic User in pubsub licmanager-contract-pubsub: kubemq pub/sub error: timeout waiting for response")
   at Dapr.Client.DaprClientGrpc.MakePublishRequest(String pubsubName, String topicName, ByteString content, Dictionary2 metadata, String dataContentType, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at Dapr.Client.DaprClientGrpc.MakePublishRequest(String pubsubName, String topicName, ByteString content, Dictionary2 metadata, String dataContentType, CancellationToken cancellationToken)
   at Dev4u.LicManager.Contract.Business.Logic.UserUpdatedNotification.HandleAsync(UserUpdated notification, CancellationToken cancellationToken) in /home/runner/work/Arc4u.Guidance.LicManager/Arc4u.Guidance.LicManager/src/BE/Contract/LicManager.Contract.Business/Logic/User/UserUpdatedNotification.cs:line 26
   at Arc4u.Dispatcher.Notification.PublishersExtension.PublishForEachAsync[T](INotificationHandlers1 notifier, T param, CancellationToken cancellationToken) in /_/src/Arc4u.Standard.Dispatcher/Notification/PublishersExtension.cs:line 47
   at Arc4u.Results.ResultExtension.OnSuccessAsync(Task1 result, Func1 action) in /_/src/Arc4u.Standard.Results/ResultExtensions.cs:line 63
   at Dev4u.LicManager.Contract.Business.Logic.ResetUserLicenseKeyCommand.<>c__DisplayClass5_0.<<ResetLicenseKeyAsync>b__2>d.MoveNext() in /home/runner/work/Arc4u.Guidance.LicManager/Arc4u.Guidance.LicManager/src/BE/Contract/LicManager.Contract.Business/Logic/User/ResetUserLicenseKeyCommand.cs:line 30
   --- End of stack trace from previous location ---
   at Arc4u.Results.ResultExtension.OnSuccessAsync[TValue](ValueTask1 result, Func2 action) in /_/src/Arc4u.Standard.Results/ResultExtensions.cs:line 94
   at Dev4u.LicManager.Contract.Business.Logic.ResetUserLicenseKeyCommand.<>c__DisplayClass5_0.<<ResetLicenseKeyAsync>b__1>d.MoveNext() in /home/runner/work/Arc4u.Guidance.LicManager/Arc4u.Guidance.LicManager/src/BE/Contract/LicManager.Contract.Business/Logic/User/ResetUserLicenseKeyCommand.cs:line 26
   --- End of stack trace from previous location ---
   at Arc4u.Results.ResultExtension.OnSuccessNotNullAsync[T](ValueTask1 result, Func2 func) in /_/src/Arc4u.Standard.Results/ResultExtensions.cs:line 129
   at Dev4u.LicManager.Contract.Business.Logic.ResetUserLicenseKeyCommand.ResetLicenseKeyAsync(Guid userId, CancellationToken cancellationToken) in /home/runner/work/Arc4u.Guidance.LicManager/Arc4u.Guidance.LicManager/src/BE/Contract/LicManager.Contract.Business/Logic/User/ResetUserLicenseKeyCommand.cs:line 21
   at Dev4u.LicManager.Contract.Facade.UserController.ResetLicenseKyAsync(IResetUserLicenseKeyCommand resetUserLicenseKey, Guid id, CancellationToken cancellation) in /home/runner/work/Arc4u.Guidance.LicManager/Arc4u.Guidance.LicManager/src/BE/Contract/LicManager.Contract.Facade/UserController.cs:line 100
   at Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor.TaskOfIActionResultExecutor.Execute(ActionContext actionContext, IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask1 actionResultValueTask)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.gAwaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)
   at Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.gAwaited|26_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, Object state, Boolean isCompleted)

When I look at the KubeMQ console log in Kubernetes, I see these messages:

[kubemq-cluster-1] -> {"level":"ERROR","time":"2024-02-07T17:06:46.576Z","msg":"pipe - cid:753631 - maximum connections exceeded","host":"kubemq-cluster-1","module":"transport-server"}
[kubemq-cluster-1] -> {"level":"ERROR","time":"2024-02-07T17:06:46.575Z","msg":"pipe - cid:753630 - maximum connections exceeded","host":"kubemq-cluster-1","module":"transport-server"}
[kubemq-cluster-1] -> {"level":"ERROR","time":"2024-02-07T17:06:46.576Z","msg":"error on subscribe to events","host":"kubemq-cluster-1","module":"grpc","error":"connection closed"}
[kubemq-cluster-1] -> {"level":"ERROR","time":"2024-02-07T17:06:46.576Z","msg":"pipe - cid:753632 - maximum connections exceeded","host":"kubemq-cluster-1","module":"transport-server"}
[kubemq-cluster-1] -> {"level":"ERROR","time":"2024-02-07T17:06:46.576Z","msg":"pipe - cid:753633 - maximum connections exceeded","host":"kubemq-cluster-1","module":"transport-server"}
....

Steps to Reproduce the Problem

Define a Dapr pubsub component YAML file that sends messages to KubeMQ, with a group defined and the store option enabled.

Publish some messages. The issue takes hours to appear.
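
For readers trying to reproduce this, a component definition of the kind described in the first step might look like the sketch below. It is not the reporter's actual file. The component name is taken from the error message above, while the address and group values are placeholders chosen for illustration. The metadata field names (address, group, store) follow the Dapr KubeMQ pubsub component spec as far as I know, so check them against the documentation for your Dapr version.

```yaml
# Hypothetical reproduction config; only the component name comes from the issue.
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: licmanager-contract-pubsub   # pubsub name seen in the error message
spec:
  type: pubsub.kubemq
  version: v1
  metadata:
    - name: address
      value: kubemq-cluster-grpc.kubemq:50000   # assumed host:port of the KubeMQ gRPC service
    - name: group
      value: licmanager-contract                # assumed subscriber group name
    - name: store
      value: "true"                             # persisted pub/sub (events store) instead of in-memory events
```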

Release Note

RELEASE NOTE:

berndverst commented 9 months ago

Because the component is Beta it is not eligible for patch releases, so this issue is not likely to be resolved until Dapr 1.14.

Further, only the contributor of this component is in a good position to investigate what is going on here.

The maintainers will not likely be able to resolve this.

yaron2 commented 9 months ago

cc @kubemq

sicoyle commented 9 months ago

Even in PR builds right now we're seeing issues with the same components. See my open PR that is experiencing issues with pubsub mqtt3 and bindings rabbitmq: https://github.com/dapr/components-contrib/pull/3324 I copy/pasted the relevant err logs from the certification tests failing in my PR.

I also checked other PRs, and some other PRs are experiencing the same component certification tests failing. For example, see: https://github.com/dapr/components-contrib/pull/3337

GFlisch commented 8 months ago

Fixed with the release of 1.12.5.

kubemq commented 8 months ago

Great!
