hyperium / tonic

A native gRPC client & server implementation with async/await support.
https://docs.rs/tonic
MIT License

set canceled status code if the underlying hyper error was due to cancelation #1669

Closed nrxus closed 1 month ago

nrxus commented 3 months ago

Motivation

At work we use tonic to query CRI sockets for k8s information when available. This information is best-effort only, so we try to ignore "expected" errors while still logging unexpected ones so that we can catch bugs. One of these expected errors is cancellation (e.g., due to a timeout), so we have code that checks the status code. However, it appears that if the cancellation came from a hyper envelope being dropped, tonic emits the error as Unknown. Pretty-printed, the error looks like:

status: Unknown, message: "transport error", details: [], metadata: MetadataMap { headers: {} }: transport error: operation was canceled: connection closed: connection closed

Right now there seems to be special handling for hyper errors whose source is an h2::Error, but the error from the envelope drop has just a static string as its source, so it falls back to the "unknown" status.

Solution

Thankfully, hyper::Error has an is_canceled method that checks its inner kind.
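A minimal sketch of the idea, not the actual diff: walk the error's source chain, and if a hyper-level error reports that it was canceled, return a cancelled code instead of falling through to unknown. Because hyper::Error can't be constructed outside hyper, `FakeHyperError`, `TransportError`, `Code`, and `code_for` are all hypothetical stand-ins made up for illustration; only the `is_canceled` accessor mirrors a real hyper method.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for `hyper::Error` (the real type cannot be built
/// outside the hyper crate).
#[derive(Debug)]
struct FakeHyperError {
    canceled: bool,
}

impl FakeHyperError {
    /// Mirrors the real `hyper::Error::is_canceled` accessor.
    fn is_canceled(&self) -> bool {
        self.canceled
    }
}

impl fmt::Display for FakeHyperError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "operation was canceled")
    }
}

impl Error for FakeHyperError {}

/// Hypothetical wrapper playing the role of a transport-level error that
/// exposes the hyper error through `source()`.
#[derive(Debug)]
struct TransportError {
    source: FakeHyperError,
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error")
    }
}

impl Error for TransportError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Hypothetical stand-in for a gRPC status code; only two variants are needed here.
#[derive(Debug, PartialEq)]
enum Code {
    Cancelled,
    Unknown,
}

/// Walks the source chain and returns `Cancelled` if any hyper-level error
/// in it reports cancellation; otherwise falls back to `Unknown`.
fn code_for(err: &(dyn Error + 'static)) -> Code {
    let mut current = Some(err);
    while let Some(e) = current {
        if let Some(h) = e.downcast_ref::<FakeHyperError>() {
            if h.is_canceled() {
                return Code::Cancelled;
            }
        }
        current = e.source();
    }
    Code::Unknown
}
```

With these stand-ins, `code_for` applied to a `TransportError` wrapping a canceled `FakeHyperError` yields `Code::Cancelled`, and any other chain yields `Code::Unknown`.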

My apologies for the lack of a test, but it appears that hyper::Error can't be instantiated from outside the hyper crate, so there wasn't an easy way to reproduce my specific kind of error in a unit test.

nrxus commented 3 months ago

@LucioFranco gentle ping in case any notifications about this got buried. Merging this and cutting a new patch release would be super helpful.

LucioFranco commented 3 months ago

I can try to get a patch release out tomorrow, but I need to check the main branch first. Otherwise, I am out until the middle of next week, when I can spend more time on this.

LucioFranco commented 3 months ago

@nrxus seems like there is a CI issue

nrxus commented 3 months ago

@LucioFranco Hm, weird, it doesn't seem to be related to this PR at all. It looks like the latest Rust is just a little stricter about dead code in tuple structs. I added an #[allow(dead_code)] to the test struct that was causing issues.

nrxus commented 2 months ago

@LucioFranco another ping since I think it's all ready (:

nrxus commented 2 months ago

@LucioFranco I've rebased onto the latest master, which fixed the clippy issue. I believe it's ready to merge, but it needs you to hit approve again.

nrxus commented 1 month ago

@djc do you mind reviewing this?

djc commented 1 month ago

This makes sense to me, sorry for the long delays!