googleapis / java-pubsub

Apache License 2.0

PubSub: Occasional / Intermittent java.net.UnknownHostException: pubsub.googleapis.com #2061

Closed salvatorenovelli closed 3 weeks ago

salvatorenovelli commented 3 months ago

I have a Spring Boot application running under Docker Compose on a GCP VM that occasionally logs the exception below. The same behaviour shows up in my integration tests, with a failure rate of around 0.9%.

The odd thing is that in this integration test I have multiple applications (containers run as testcontainers) connecting to pubsub, and while one fails, the others seem to be able to subscribe, receive and publish messages.

But in the application where the exception is thrown, pubsub commands hang and eventually time out, e.g. com.google.cloud.spring.pubsub.PubSubAdmin#getTopic

Environment details

This is a Spring Boot '3.2.5' application using 'com.google.cloud:spring-cloud-gcp-starter-pubsub:5.4.1', running Java 17 on eclipse-temurin:17-jre-alpine.

Steps to reproduce

During my integration testing, this error manifests randomly just by launching the application and attempting to subscribe to a topic. I can consistently reproduce it within 100 or so runs by running the test continuously until it fails.

The problem is that this has also been happening consistently in production over the past years. The exception is logged while the application is running, and it follows the same pattern: other applications seem to work fine, but one will, seemingly at random, throw this exception and stop working.

Stack trace

2024-06-05T08:19:30.235Z  WARN 1 --- [customer-repository] [ault-executor-2] io.grpc.internal.ManagedChannelImpl      : [Channel<5>: (pubsub.googleapis.com:443)] Failed to resolve name. status=Status{code=UNAVAILABLE, description=Unable to resolve host pubsub.googleapis.com, cause=java.lang.RuntimeException: java.net.UnknownHostException: pubsub.googleapis.com
    at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:223)
    at io.grpc.internal.DnsNameResolver.doResolve(DnsNameResolver.java:282)
    at io.grpc.grpclb.GrpclbNameResolver.doResolve(GrpclbNameResolver.java:63)
    at io.grpc.internal.DnsNameResolver$Resolve.run(DnsNameResolver.java:318)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.net.UnknownHostException: pubsub.googleapis.com
    at java.base/java.net.InetAddress$CachedAddresses.get(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName0(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at io.grpc.internal.DnsNameResolver$JdkAddressResolver.resolveAddress(DnsNameResolver.java:632)
    at io.grpc.internal.DnsNameResolver.resolveAddresses(DnsNameResolver.java:219)
    ... 6 more
}

Thanks for your support! Happy to share more details as required.

michaelpri10 commented 3 weeks ago

Hello! An UnknownHostException would not be caused by the client library; it is more likely related to the environment in which the application is running. One possibility is that DNS resolution is taking some time on application start-up, but to find the specific issue, I would recommend opening a support case.
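
To check whether DNS resolution is merely slow at start-up, one option is to probe the lookup with retries before the application touches Pub/Sub. The following is a minimal sketch, not part of the client library: the class name DnsProbe, the Resolver interface, and the attempt count and backoff values are all hypothetical. It relies on InetAddress.getAllByName, the same JDK call that appears in the stack trace above. Note that the JDK caches failed lookups for a short time (the networkaddress.cache.negative.ttl security property), so retries spaced more closely than that interval may only replay the cached failure.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper; not part of java-pubsub or gRPC.
public class DnsProbe {

    /** Lookup function, injectable so the retry logic can be exercised without a network. */
    @FunctionalInterface
    public interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /**
     * Calls the resolver up to maxAttempts times, doubling the pause after each
     * failure, and rethrows the last UnknownHostException if every attempt fails.
     */
    public static InetAddress[] resolveWithRetry(
            Resolver resolver, String host, int maxAttempts, long initialBackoffMillis)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return resolver.resolve(host);
            } catch (UnknownHostException e) {
                last = e;
                System.err.printf("DNS attempt %d/%d for %s failed%n", attempt, maxAttempts, host);
                if (attempt < maxAttempts) {
                    Thread.sleep(backoff);
                    backoff *= 2; // exponential backoff between attempts
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws InterruptedException {
        String host = "pubsub.googleapis.com";
        try {
            // Assumed values: 5 attempts, starting with a 200 ms pause.
            InetAddress[] addrs = resolveWithRetry(InetAddress::getAllByName, host, 5, 200L);
            System.out.println("Resolved " + host + " to " + addrs.length + " address(es)");
        } catch (UnknownHostException e) {
            System.out.println("Could not resolve " + host + " after retries: " + e.getMessage());
        }
    }
}
```

If the probe succeeds only after a few attempts inside the affected container, that would point towards the container's DNS setup rather than the library. If it never succeeds, checking the resolver configuration of that container would be a reasonable next step.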