Closed pjfanning closed 1 week ago
@kannanjgithub, the old Listener didn't require using the synchronization context. But the new stuff does. We probably messed up one of the adapting methods with onResult2 and didn't take that into account.
@pjfanning, NameResolvers can access the SynchronizationContext from [NameResolver.Args.getSynchronizationContext()
](https://grpc.github.io/grpc-java/javadoc/io/grpc/NameResolver.Args.html#getSynchronizationContext()). Long-term (getting nearer term), we will require NameResolvers to call the Listener from the SynchronizationContext, so that is a good change to make if you are willing.
@ejona86 Thanks for your quick response.
Apache Pekko is quite modular and pekko-grpc relies on other pekko modules including a pekko-discovery module. pekko-discovery has async APIs using Scala Futures. We won't be able to change this. The custom NameResolver in pekko-grpc uses the pekko-discovery APIs. In the short term, we will pin to grpc-java 1.67. In the long term, we will need to watch and see how grpc-java continues to evolve and whether we need to substantially rewrite pekko-grpc to uptake newer versions of grpc-java.
The problem was with the gRPC code NameResolver.Listener.onAddresses abstract base class implementation calling onResult2 outside of the synchronization context. In the first part of the changes for onResult2 it was calling onResult like it should but the later PR for ResolutionResult introduced this regression. This (and 2 other such callers) went uncaught because the check for call from the synchronization context is in ManagedChannelImpl's implementation of Listener2 but respective unit tests of the name resolvers use their own listeners. I have raised PR #11666 with the fix.
whether we need to substantially rewrite pekko-grpc to uptake newer versions of grpc-java.
@pjfanning, I don't see what substantial changes you are talking about. Somewhere between onComplete
and calling the listener call syncContext.execute
and run the remaining code inside. I don't know Scala, but it would be a one line change in Java.
We also encountered this issue when upgrading grpc-netty
to 1.68.1
from 1.68.0
. The issue was in one of our grpc stream services. It started throwing i.g.StatusRuntimeException: DEADLINE_EXCEEDED: Deadline CallOptions will be exceeded in 59.999822705s.
and Not called from the SynchronizationContext
exceptions.
Our system is in Scala 2.13. Reverting back to grpc-netty
1.68.0
fixed the problem.
@hacetin 1.68.0 has been marked as a mistaken release. I am switching to v1.67.1 in pekko-grpc.
Fixed by #11666
Similar to #10407 but has started affecting us with grpc-java 1.68.1
pekko-grpc PR: https://github.com/apache/pekko-grpc/pull/397
pekko-grpc is written in Scala and we are using Scala Futures when doing lookups asynchronously. grpc-java seems now to require that we use your SynchronizationContext instead.
I added an experimental change to pekko-grpc name resolution to add blocking code. This allowed me to avoid this issue but I discovered that we have some tests that still fail because io.grpc is unhappy that we are using Scala Futures. I wouldn't be delighted about the hack in our name resolver either.