open-telemetry / opentelemetry-proto

OpenTelemetry protocol (OTLP) specification and Protobuf definitions
https://opentelemetry.io/docs/specs/otlp/
Apache License 2.0

Clarify whether OTLP/gRPC codes should cover both client and server #506

Open carlosalberto opened 1 year ago

carlosalberto commented 1 year ago

From https://github.com/open-telemetry/opentelemetry-specification/pull/3653:

This seems confusing, the current spec says "The server MAY use other gRPC codes to indicate retryable and not-retryable errors if those other gRPC codes are more appropriate for a particular erroneous situation. The client SHOULD interpret gRPC status codes as retryable or not-retryable according to the following table..." which indicates that the gRPC status codes are only coming from the server. Should the transient error here cover both client and server error?

and

For gRPC this is not always true. See https://grpc.github.io/grpc/core/md_doc_statuscodes.html... A common example where an RPC returns a status which was not set by the server is the Unavailable status code. For instance, if you make an rpc call to some server bogusurl.com the call results in an Unavailable status code.

See https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md#failures

tigrannajaryan commented 1 year ago

Yes, if the gRPC client can't make the call and returns an Unavailable status code, the exporter is supposed to interpret it according to the table. Generally, the exporter does not need to know who is responsible for the returned status code, whether it comes from the server or arises because the client can't connect to the server.
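
To illustrate the point, here is a minimal sketch of an exporter-side classifier that treats a status code the same way regardless of whether the server or the local gRPC client produced it. The names `GrpcCode`, `RETRYABLE`, and `is_retryable` are hypothetical and not part of any OpenTelemetry SDK. The retryable set is an assumption based on a reading of the table in the linked specification section, so check it against the spec before relying on it.

```python
from enum import IntEnum


class GrpcCode(IntEnum):
    """Standard gRPC status codes with their numeric values."""
    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


# Assumption: codes the OTLP table marks as retryable (verify against the spec).
RETRYABLE = frozenset({
    GrpcCode.CANCELLED,
    GrpcCode.DEADLINE_EXCEEDED,
    GrpcCode.ABORTED,
    GrpcCode.OUT_OF_RANGE,
    GrpcCode.UNAVAILABLE,
    GrpcCode.DATA_LOSS,
})


def is_retryable(code: GrpcCode, has_retry_info: bool = False) -> bool:
    """Classify a status code without asking who generated it.

    RESOURCE_EXHAUSTED is assumed to be retryable only when the response
    carries RetryInfo, i.e. the server signalled that recovery is possible.
    """
    if code == GrpcCode.RESOURCE_EXHAUSTED:
        return has_retry_info
    return code in RETRYABLE
```

With this shape, an Unavailable status produced locally because the client could not connect goes through the same `is_retryable(GrpcCode.UNAVAILABLE)` check as one returned by the server.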