spring-attic / spring-cloud-gcp

Integration for Google Cloud Platform APIs with Spring
Apache License 2.0
704 stars 694 forks

[PubSub] PubSubInboundChannelAdapter stop pulling messages from PubSub #2552

Open brachipa opened 3 years ago

brachipa commented 3 years ago

I have a Spring Boot app deployed in a GKE pod that pulls messages from Pub/Sub:

 PubSubInboundChannelAdapter adapter = new PubSubInboundChannelAdapter(pubSubTemplate, subscription);
 logger.info("Created subscription to " + subscription);    
 adapter.setOutputChannel(inputChannel);    
 adapter.setAckMode(AckMode.MANUAL);

It works fine, but then it stops pulling messages; if I restart the pod, it works fine again. There is nothing in the logs. I have many pods subscribed to different subscriptions, and this issue happens for all of them. It may happen only once a month, but it happens consistently and requires us to restart the pods all the time (production cluster).

spring boot: 2.3.0.RELEASE
spring-cloud-gcp-starter-pubsub: 1.2.5.RELEASE

elefeint commented 3 years ago

@brachipa When the listener stops, has an unusually long time passed since the last message was sent to the topic?

Do you override the spring.cloud.gcp.pubsub.keepAliveIntervalMinutes property with a custom value?

Do you override any of the Spring Cloud GCP pub/sub subscription beans -- subscriberTransportChannelProvider, SubscriberFactory, PubSubSubscriberTemplate?

brachipa commented 3 years ago

@elefeint I didn't override anything. Regarding the last message, it was sent around 1 hour before.

meltsufin commented 3 years ago

@brachipa Do you have other pods that are getting the messages at the same time? Perhaps this node is just being starved? Since Spring Cloud GCP relies on the Pub/Sub client library, it's possible that there is an issue there. cc/ @chingor13

brachipa commented 3 years ago

Yes. I also forgot to mention that I have 2 subscriptions in the same pod: one is receiving messages at the same time, and the other is totally disconnected.

meltsufin commented 3 years ago

I would try the Synchronous Pull method, which is less prone to starvation and gives you more control.
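
For readers unfamiliar with the suggestion, here is a rough sketch of what a synchronous pull loop looks like: the application asks for a bounded batch, processes each message, and acks or nacks it explicitly. This is not the library's API. `SyncPuller`, `PulledMessage`, and `InMemoryPuller` are hypothetical stand-ins so the example runs with only the JDK; in Spring Cloud GCP the corresponding call would be something like `pubSubTemplate.pull(subscription, maxMessages, returnImmediately)`, driven from a scheduled task or poller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

final class SyncPullSketch {

    /** Hypothetical stand-in for a pulled message that must be acked or nacked explicitly. */
    interface PulledMessage {
        String payload();
        void ack();
        void nack();
    }

    /** Hypothetical stand-in for a blocking, bounded pull call. */
    interface SyncPuller {
        List<PulledMessage> pull(int maxMessages);
    }

    /**
     * Pulls one batch, hands each payload to the handler, acks on success
     * and nacks on failure. Returns the number of acked messages.
     */
    static int pollOnce(SyncPuller puller, int maxMessages, Consumer<String> handler) {
        int acked = 0;
        for (PulledMessage message : puller.pull(maxMessages)) {
            try {
                handler.accept(message.payload());
                message.ack();
                acked++;
            } catch (RuntimeException e) {
                // A real nack would ask the service to redeliver the message.
                message.nack();
            }
        }
        return acked;
    }

    /** In-memory fake used only to make the sketch runnable; it records acks and nacks. */
    static final class InMemoryPuller implements SyncPuller {
        private final Queue<String> backlog = new ArrayDeque<>();
        final List<String> acked = new ArrayList<>();
        final List<String> nacked = new ArrayList<>();

        InMemoryPuller(List<String> payloads) {
            backlog.addAll(payloads);
        }

        @Override
        public List<PulledMessage> pull(int maxMessages) {
            List<PulledMessage> batch = new ArrayList<>();
            while (batch.size() < maxMessages && !backlog.isEmpty()) {
                final String payload = backlog.poll();
                batch.add(new PulledMessage() {
                    @Override public String payload() { return payload; }
                    @Override public void ack() { acked.add(payload); }
                    @Override public void nack() { nacked.add(payload); }
                });
            }
            return batch;
        }
    }

    public static void main(String[] args) {
        InMemoryPuller puller = new InMemoryPuller(Arrays.asList("a", "bad", "c"));
        int acked = pollOnce(puller, 10, payload -> {
            if (payload.equals("bad")) {
                throw new IllegalStateException("cannot process " + payload);
            }
        });
        System.out.println("acked=" + acked + " nacked=" + puller.nacked);
    }
}
```

Because each call to `pollOnce` is an explicit request made by the application, a stalled subscription shows up as a call that returns nothing or fails, rather than a background stream that silently goes quiet.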