googleapis / nodejs-pubsub

Node.js client for Google Cloud Pub/Sub: Ingest event streams from anywhere, at any scale, for simple, reliable, real-time stream analytics.
https://cloud.google.com/pubsub/
Apache License 2.0

PubSub with multiple Hosts #1892

Closed: DannyyDing closed this issue 3 months ago

DannyyDing commented 4 months ago

Hi Professionals, we use GraphQL subscriptions on top of Pub/Sub subscriptions for message processing in Node.js. When a client starts a GraphQL subscription, it should listen for GraphQL events and show them on the front end.

async subscribe() {
    const eventNames = ['myEvent'];
    return graphqlPubSub.asyncIterator(eventNames);
}

To support this, we developed an npm package. In that package, we initialize Google Pub/Sub and attach a message handler to the subscription:

this.subscription.on('message', this.processMessage);

In processMessage, we ack the message, publish a GraphQL message, and republish the message.
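For readers following along, a handler of the kind described above might look roughly like the sketch below. It is not the poster's actual code: `makeProcessMessage` is a hypothetical helper, the payload is assumed to be JSON, and `graphqlPubSub` is assumed to expose a `publish(eventName, payload)` method as the `PubSub` class in graphql-subscriptions does. The republish step is left out because the thread does not say where the message is republished.

```javascript
// Hypothetical factory that builds a Pub/Sub message handler.
// `graphqlPubSub` is assumed to have publish(eventName, payload).
function makeProcessMessage(graphqlPubSub, eventName) {
  return async function processMessage(message) {
    // Ack first, matching the order described in the post.
    message.ack();

    // message.data is a Buffer; a JSON payload is an assumption here.
    const text = message.data.toString();
    let payload;
    try {
      payload = JSON.parse(text);
    } catch {
      payload = text; // fall back to the raw string if it is not JSON
    }

    // Notify GraphQL subscribers listening for this event name.
    await graphqlPubSub.publish(eventName, payload);
  };
}
```

It would be attached with something like `subscription.on('message', makeProcessMessage(graphqlPubSub, 'myEvent'))`. If `graphqlPubSub` is an in-memory instance, only GraphQL clients connected to the host that received the message are notified, which would match the behaviour reported here.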

Since we are using a cluster with three hosts, we encounter an issue where a message gets stuck on one host and is never processed by the other hosts. Originally we thought each of the three hosts would have a 33% chance of getting the message. But we found that, even after a long time and many redeliveries, the deliveries are still not spread across the hosts the way we expected.

We tried nacking the message, and it still seems to be stuck on one host.

The result is that the user (GraphQL) will either wait a long time to receive the message (for example, after 100 redeliveries the second host finally gets the message, processes it, and makes it available in GraphQL), or never receive the message (we set a maximum number of redelivery attempts, like 128, after which the message is not republished again).

Is there a way to solve the issue and better handle the situation? Your help is greatly appreciated and we are looking forward to learning from your insight.

feywind commented 3 months ago

I think this might be something that you'd need to ask the product expert people, because the library doesn't have a ton of control over message routing to subscribers. I suppose Stack Overflow is another (free) way to discuss.

@kamalaboulhosn do you have any thoughts on this, or can you say where to redirect it?

kamalaboulhosn commented 3 months ago

So is the goal to have all hosts receive and process each message? If so, then the solution is a Pub/Sub subscription per host, not a single, shared subscription. Messages are load balanced across subscribers to the subscription and there are no guarantees about which host will receive the message once republished. If you want all subscribers to receive the messages, then the correct pattern is separate subscriptions.
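The subscription-per-host pattern suggested above could be sketched as follows. This is only an illustration under stated assumptions: `subscriptionNameForHost` and `listenOnOwnSubscription` are hypothetical helpers, the naming scheme based on the OS hostname is an assumption, and `pubsub` is expected to be a `PubSub` client from `@google-cloud/pubsub` created by the caller.

```javascript
const os = require('os');

// Hypothetical helper: derive a subscription name that is unique per host.
function subscriptionNameForHost(baseName, hostId = os.hostname()) {
  // Keep a conservative character set (lowercase letters, digits, dashes).
  const safeHost = hostId.toLowerCase().replace(/[^a-z0-9-]/g, '-');
  return `${baseName}-${safeHost}`;
}

// Attach this host to its own subscription, creating it if needed.
// `pubsub` is assumed to be an instance of PubSub from @google-cloud/pubsub.
async function listenOnOwnSubscription(pubsub, topicName, baseName, onMessage, hostId) {
  const name = subscriptionNameForHost(baseName, hostId);
  let subscription = pubsub.subscription(name);

  const [exists] = await subscription.exists();
  if (!exists) {
    // Each subscription on a topic receives its own copy of every message.
    [subscription] = await pubsub.topic(topicName).createSubscription(name);
  }

  subscription.on('message', onMessage);
  subscription.on('error', (err) => {
    console.error(`Subscription ${name} failed:`, err);
  });
  return subscription;
}
```

Each host would call `listenOnOwnSubscription(new PubSub(), 'my-topic', 'graphql-events', processMessage)` at startup, where the topic and base names are placeholders. Subscriptions left behind by hosts that no longer exist keep accumulating messages, so some cleanup or an expiration policy is probably needed in practice.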

feywind commented 3 months ago

Finishing up here, the thing I mentioned above is the Professional Services Org. It's not free but they can consult to develop and design apps.