Closed dimovelev closed 9 months ago
FYI - locally I changed the two methods and do not get this exception anymore (I get a different one - something with header size too large but it seems to be a different issue):
@Override
public void stop() { // or just remove me
super.stop();
}
@Override
public void close() {
try {
super.close();
} finally {
if (client != null) {
client.close();
}
}
}
Is this happening all the time or occasionally? If the latter, how frequently? We changed a few things in 1.11.6 (and 1.12.0), would you mind checking if it resolves the issue?
@jonatan-ivanov this happens only when the application is being shut down - e.g. a new revision is being deployed and all old revision instances are being shut down. The bigger impact is actually that it also happens in cloudrun jobs when they finish and as a result the metrics are not pushed when they finish.
@jonatan-ivanov i changed the micrometer.version maven property in the reproducer to use a different micrometer version (and verified that it is being used) and it still fails both with 1.11.6 and with 1.12.0
Hey there, it seems we are facing the same issue. To get more ideas of what changes lead to the observed behaviour I will write this comment.
We started to face this issue when we migrated from Springboot 2.7 to 3.1. Our workaround was (still is) to pin Micrometer to version 1.10.6 (all afterwards - including 1.10.7) shows all the error messages you described above. However, as of Springboot 3.2 the (ugly) Micrometer version pinning isn't working anyway. Thats why I was googleing around and stranded here.
def micrometer_version_pin = '1.10.6'
dependencies {
// [...]
implementation "io.micrometer:micrometer-commons:${micrometer_version_pin}!!"
implementation "io.micrometer:micrometer-core:${micrometer_version_pin}!!"
implementation "io.micrometer:micrometer-observation:${micrometer_version_pin}!!"
implementation "io.micrometer:micrometer-registry-stackdriver:${micrometer_version_pin}!!"
// [...]
}
Our workaround was (still is) to pin Micrometer to version 1.10.6 (all afterwards - including 1.10.7) shows all the error messages you described above.
Thank you for the details. That sounds like it was https://github.com/micrometer-metrics/micrometer/issues/3759, included in 1.10.7, that caused this to start happening, which makes sense. The proposed fix sounds reasonable. I'd like to see if there is any way we can effectively test the behavior locally with a stub of the stackdriver metrics service. That could help us avoid reintroducing this issue in the future and make sure it stays fixed.
@shakuzen I could not find an emulator or anything remotely helpful to integration test it :(
So for StackdriverMeterRegistry
, wouldn't a change from
@Override
public void stop() {
if (client != null)
client.shutdownNow();
super.stop();
}
to
@Override
public void stop() {
if (client != null && !client.isShutdown())
client.shutdownNow();
super.stop();
}
fix the problem? I understand that we're closing an already closed client? We would need to check if the metrics are still being sent on stop. Still no luck with providing some stub for Stackdriver metrics server?
@marcingrzejszczak would that not mean that no metrics will be pushed upon shutdown? We do not want to get the exception, but we also want to push the metrics one last time upon shutdown.
From my understanding the suggested change would shut down the client only if it hasn't been already shut down but maybe I'm missing sth.
For me it looks like the most fundamental change is to move shutdown of the client from stop()
to close()
method. I don't know in which phase of application lifecycle the phases take place. But maybe this already does the trick.
Could folks affected by this issue please try with the latest snapshots to make sure the fix is working and not causing any other issues? Specifically, it's in 1.10.14-SNAPSHOT, 1.11.9-SNAPSHOT, and 1.12.3-SNAPSHOT available from the snapshot repo.
1.12.3-SNAPSHOT -> Fix seems to work properly, metrics are being published and client is shutdown gracefully afterwards:
2024-02-06T10:55:07.340+01:00 DEBUG 19503 --- [tp1973538135-41] i.m.s.StackdriverMeterRegistry : successfully sent 9 TimeSeries to Stackdriver
2024-02-06T10:55:07.341+01:00 DEBUG 19503 --- [tp1973538135-41] com.google.api.gax.grpc.ChannelPool : Initiating immediate shutdown due to explicit request
2024-02-06T10:55:07.347+01:00 DEBUG 19503 --- [tp1973538135-41] com.google.api.gax.grpc.ChannelPool : Initiating graceful shutdown due to explicit request
No other issue recognized so far.
I was a bit slow to respond - we have this fix already rolled out to multiple applications and it works. Thank you for merging it
Describe the bug Whenever an application is shut down, the stack driver registry tries to publish its metrics and throws an exception with the error UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked
Environment Google Cloud Run Spring Boot 3.1.5 micrometer-registry-stackdriver 1.11.5 spring-cloud-gcp 4.8.2
To Reproduce
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@2f35496c[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@451b64f4[Wrapped task = TrustedListenableFutureTask@285303a0[status=PENDING, info=[task=[running=[NOT STARTED YET], com.google.api.gax.rpc.AttemptCallable@7fa60050]]]]] rejected from java.util.concurrent.ScheduledThreadPoolExecutor@7d04529c[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0] at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2081) ~[na:na] at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841) ~[na:na] at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:340) ~[na:na] at java.base/java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:562) ~[na:na] at com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator.schedule(MoreExecutors.java:673) ~[guava-32.1.2-jre.jar:na] at com.google.api.gax.retrying.ScheduledRetryingExecutor.submit(ScheduledRetryingExecutor.java:116) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.handle(CallbackChainRetryingFuture.java:137) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.run(CallbackChainRetryingFuture.java:117) ~[gax-2.37.0.jar:2.37.0] at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1286) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:760) ~[guava-32.1.2-jre.jar:na] at com.google.api.core.AbstractApiFuture.addListener(AbstractApiFuture.java:45) ~[api-common-2.20.0.jar:2.20.0] at com.google.api.gax.retrying.CallbackChainRetryingFuture.setAttemptFuture(CallbackChainRetryingFuture.java:93) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.AttemptCallable.call(AttemptCallable.java:88) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:63) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.RetryingCallable.futureCall(RetryingCallable.java:41) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.PagedCallable.futureCall(PagedCallable.java:62) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.UnaryCallable$1.futureCall(UnaryCallable.java:126) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.UnaryCallable.futureCall(UnaryCallable.java:87) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112) ~[gax-2.37.0.jar:2.37.0] at com.google.cloud.monitoring.v3.MetricServiceClient.listMetricDescriptors(MetricServiceClient.java:741) ~[google-cloud-monitoring-3.31.0.jar:3.31.0] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.prePopulateVerifiedDescriptors(StackdriverMeterRegistry.java:429) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createMetricDescriptorIfNecessary(StackdriverMeterRegistry.java:383) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:358) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:324) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.createGauge(StackdriverMeterRegistry.java:227) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$1(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.core.instrument.Meter.match(Meter.java:100) ~[micrometer-core-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$10(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273) ~[na:na] at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708) ~[na:na] at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na] at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na] at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[na:na] at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na] at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[na:na] at io.micrometer.stackdriver.StackdriverMeterRegistry.publish(StackdriverMeterRegistry.java:163) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.core.instrument.step.StepMeterRegistry.close(StepMeterRegistry.java:146) ~[micrometer-core-1.11.5.jar:1.11.5] at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na] at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na] at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:303) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:238) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:587) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:559) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1189) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:520) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1182) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1118) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1084) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1035) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:145) ~[spring-boot-3.1.5.jar:3.1.5] at java.base/java.lang.Iterable.forEach(Iterable.java:75) ~[na:na] at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:114) ~[spring-boot-3.1.5.jar:3.1.5] at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na] Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57) ~[gax-2.37.0.jar:2.37.0] ... 35 common frames omitted
2023-11-13T10:21:10.486+01:00 WARN 25564 --- [ionShutdownHook] i.m.s.StackdriverMeterRegistry : failed to create metric descriptor in Stackdriver for meter MeterId{name='jvm.buffer.memory.used', tags=[tag(id=direct)]}
com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:112) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86) ~[gax-grpc-2.37.0.jar:2.37.0] at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66) ~[gax-grpc-2.37.0.jar:2.37.0] at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97) ~[gax-grpc-2.37.0.jar:2.37.0] at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:84) ~[api-common-2.20.0.jar:2.20.0] at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1127) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1286) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1055) ~[guava-32.1.2-jre.jar:na] at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:807) ~[guava-32.1.2-jre.jar:na] at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:568) ~[grpc-stub-1.59.0.jar:1.59.0] at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:538) ~[grpc-stub-1.59.0.jar:1.59.0] at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) ~[grpc-api-1.59.0.jar:1.59.0] at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) ~[grpc-api-1.59.0.jar:1.59.0] at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) ~[grpc-api-1.59.0.jar:1.59.0] at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:570) ~[gax-grpc-2.37.0.jar:2.37.0] at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574) ~[grpc-core-1.59.0.jar:1.59.0] at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72) ~[grpc-core-1.59.0.jar:1.59.0] at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742) ~[grpc-core-1.59.0.jar:1.59.0] at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723) ~[grpc-core-1.59.0.jar:1.59.0] at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) ~[grpc-core-1.59.0.jar:1.59.0] at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) ~[grpc-core-1.59.0.jar:1.59.0] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[na:na] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[na:na] at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na] Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57) ~[gax-2.37.0.jar:2.37.0] at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112) ~[gax-2.37.0.jar:2.37.0] at com.google.cloud.monitoring.v3.MetricServiceClient.createMetricDescriptor(MetricServiceClient.java:1127) ~[google-cloud-monitoring-3.31.0.jar:3.31.0] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createMetricDescriptorIfNecessary(StackdriverMeterRegistry.java:405) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:358) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry$Batch.createTimeSeries(StackdriverMeterRegistry.java:324) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.createGauge(StackdriverMeterRegistry.java:227) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$1(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.core.instrument.Meter.match(Meter.java:100) ~[micrometer-core-1.11.5.jar:1.11.5] at io.micrometer.stackdriver.StackdriverMeterRegistry.lambda$publish$10(StackdriverMeterRegistry.java:158) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273) ~[na:na] at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1708) ~[na:na] at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[na:na] at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[na:na] at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[na:na] at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na] at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[na:na] at io.micrometer.stackdriver.StackdriverMeterRegistry.publish(StackdriverMeterRegistry.java:163) ~[micrometer-registry-stackdriver-1.11.5.jar:1.11.5] at io.micrometer.core.instrument.step.StepMeterRegistry.close(StepMeterRegistry.java:146) ~[micrometer-core-1.11.5.jar:1.11.5] at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na] at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na] at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:303) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:238) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:587) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:559) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1189) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:520) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1182) ~[spring-beans-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1118) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1084) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:1035) ~[spring-context-6.0.13.jar:6.0.13] at org.springframework.boot.SpringApplicationShutdownHook.closeAndWait(SpringApplicationShutdownHook.java:145) ~[spring-boot-3.1.5.jar:3.1.5] at java.base/java.lang.Iterable.forEach(Iterable.java:75) ~[na:na] at org.springframework.boot.SpringApplicationShutdownHook.run(SpringApplicationShutdownHook.java:114) ~[spring-boot-3.1.5.jar:3.1.5] ... 1 common frames omitted Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: Channel shutdown invoked at io.grpc.Status.asRuntimeException(Status.java:537) ~[grpc-api-1.59.0.jar:1.59.0] ... 14 common frames omitted