Closed zirkome closed 7 years ago
@kokaz agreed that this is ultimately a GRPC issue in that the MD
type is not concurrency-safe and yet that's not documented anywhere :-/ (I just read the source).
Your solution probably shortens the critical path for the race condition, but there's still no reader lock guarding the Copy
call and thus it's not a true fix. Do you want to raise this with GRPC proper? I don't see a truly safe fix without changes to the MD
code.
I do not read into your code. Is it possible to have a mutex to guard your MD accesses? grpc MD is not concurrency-safe.
Regarding the document, I think a function is concurrency-safe iff it documents that explicitly. By default, it is not concurency-safe. For example, https://golang.org/src/container/list/list.go?s=922:955#L21.
To elaborate, i) md.Copy() is required anyways since the write op of metadataReaderWriter will race with grpc-internal read op otherwise; ii) if you want to put md a ReaderWriter, you have to use a mutex in read/write implementation -- it is just like you use a map as a ReaderWriter.
@iamqizhao
I do not read into your code. Is it possible to have a mutex to guard your MD accesses? grpc MD is not concurrency-safe.
Since the GRPC MD
is shared across all users of the GRPC connection, adding locking in this code (or any other) is not sufficient to make it concurrency-safe. We would need all readers and writers to share concurrency primitives. This is why I would argue that the MD
struct should be made thread-safe intrinsically... otherwise how can one confidently write code that either reads or writes from that structure?
@bensigelman The issue is that MD
is not a struct but a map
so anyone using MD
can end up using it non thread-safely and so we can't really make MD
intrinsically thread safe except for the method it has (i.e. Copy
, Join
).
Hi,
I didn't know where to file that probable bug but as it's related to gRPC I think that's a good place to start.
So I'm using this repo to bind Zipkin to gRPC but recently I ran into a concurrency issue involving
Inject
andgrpc.MD
.I'm doing two concurrent calls to 2 different gRPC service (also using this binding) but when ran multiple times this service crash with the following stack trace: https://gist.github.com/kokaz/1ff9a2e301abb34276dc002cf82009ae
Here is the repo containing the code: https://github.com/kokaz/zipkin-grpc-demo
After some investigation we might have found a solution: before creating
metadataReaderWriter
in both client and server interceptor, we reassignmd
with a copy of the one get from theContext
:cc @bensigelman