Closed BusyJay closed 2 years ago
Long time tests show that https://github.com/tikv/tikv/issues/12202 and https://github.com/tikv/tikv/issues/12198 are resolved by this patch.
However, it's unsafe to do so as the slice is not guaranteed to be accessible in the first place.
can you explain that?
can you explain that?
The order of handling batch result and destroying a call is uncertain. So it's possible that when the batch result is returned from the queue_next, the corresponding grpc call is already destroyed. In that case, referencing internal slices is UAF, and destroy the metadata later will dereference it, which is a double free.
After metadata is received, every entry should be owned by the grpc c core, application should clone the byte slice if necessary.
In the past, we get around the problem by increasing the reference count of the slice. However, it's unsafe to do so as the slice is not guaranteed to be accessible in the first place.
This PR fixes the problem by introducing an unowned metadata type. It works just like Metadata, but accessing its content requires manual check for the lifetime of associated call.
I tried to add test case to cover the unsafe access, however it's too racy and nearly impossible to introduce a case without touching the C core.