Open williamdenton opened 1 year ago
maybe the simplest approach is to split this EnsureInitialized()
so when SendAsync()
calls it it doesn't configure the listening connection, just the channel.
Still not clear to me why (i assume) Initialize on the resilient connection did not complete but i can see from the telemetry we have it was receiving events. I make this assumption as thats the only way i can see that my event processor didn't finish and that telemetry shows postgres events for 12 hours (till we killed it). If it was a startup race condition maybe there is something at play in the Initialize() method of the AsyncTaskDispatcher being reentrant, but the logic looks ok with its semaphoreslim
https://github.com/ChilliCream/graphql-platform/blob/3f7f16001a275117117f24d4bd4891cb0a26cc9e/src/HotChocolate/Core/src/Subscriptions.Postgres/AsyncTaskDispatcher.cs#L32-L56 ... https://github.com/ChilliCream/graphql-platform/blob/3f7f16001a275117117f24d4bd4891cb0a26cc9e/src/HotChocolate/Core/src/Subscriptions.Postgres/AsyncTaskDispatcher.cs#L69-L81
Is there an existing issue for this?
Product
Hot Chocolate
Describe the bug
We run background jobs as well as api servers. We have recently rolled out HotChocolate Subscriptions using the Postgres Backplane.
A curious scenario that occurred over the weekend was a backend server processed an event that was published into the postgres backplane. This however triggered the subscription system to begin listening to events. The listening system is designed to run a task forever receiving events from the postgres connection notification event. This means that the task processing the event in the backend never finished as it had somehow triggered the listen command.
I'm not sure how this happened yet. The timing of this event does correlate with process startup so there may be some race condition that meant the subscription wasn't initialised fully before a message was published and so was initialised by its first use and that somehow meant it ended up owning the long running task.
In a previous issue/pr i have contributed code that allows the backend servers to publish messages into the backplane for consumption by the API/graphql servers. This means that
ITopicSender
is registered in DI, butITopicReceiver
is not.PostgresChannel
is used by with sending and receiving code paths and executeslisten
on all code paths. MaybePostgresChannel
needs to be split or allow to be configured so it doesn'tlisten
when in a "SendOnly" configuration as listening to events in a publish only scenario is pointless.Steps to reproduce
I'm sorry but i dont have a repro yet, but thought i would open an issue to start a discussion in case this sparked an idea of the cause for @PascalSenn
Relevant log output
No response
Additional Context?
No response
Version
13.8