hashgraph / hedera-sdk-rust

The Hedera™ Rust SDK
Apache License 2.0

fix: Restore transient error detection for canceled hyper requests #840

Closed iamjpotts closed 2 months ago

iamjpotts commented 2 months ago

Description:

The crate used to detect, but no longer detects, gRPC requests that failed with a tonic error wrapping a hyper error in a "canceled" state.

The detection uses downcast to inspect the source (inner) error. It broke when hyper was upgraded from 0.14.x to 1.x on 4/26 in https://github.com/hashgraph/hedera-sdk-rust/commit/2445034b12e4d5b4a5aff486a7856e166221bbb1.

cc @RickyLB / @mehcode

tonic 0.11 (the current dependency) and tonic 0.9 (the dependency before that commit) both depend on hyper 0.14. A downcast to the hyper 1.x Error type therefore never matches the hyper 0.14.x Error type that tonic actually wraps.

This PR modifies is_hyper_canceled to check for the Error type from both versions of hyper.
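The following is a minimal, std-only sketch of why the downcast stopped matching and how checking both types restores it. It is not the SDK's code: the modules hyper_v0 and hyper_v1, the TransportError wrapper, and both function names are hypothetical stand-ins for the two hyper versions and for tonic's transport error. The point is only that two identically named error types from different crate versions are distinct types, so a downcast to one never matches the other.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical stand-in for `hyper::Error` as exported by hyper 0.14.x.
mod hyper_v0 {
    #[derive(Debug)]
    pub struct Error {
        pub canceled: bool,
    }

    impl Error {
        pub fn is_canceled(&self) -> bool {
            self.canceled
        }
    }

    impl std::fmt::Display for Error {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            write!(f, "hyper 0.14 error")
        }
    }

    impl std::error::Error for Error {}
}

// Hypothetical stand-in for `hyper::Error` as exported by hyper 1.x.
// Same name and shape as above, but a different type for downcasting.
mod hyper_v1 {
    #[derive(Debug)]
    pub struct Error {
        pub canceled: bool,
    }

    impl Error {
        pub fn is_canceled(&self) -> bool {
            self.canceled
        }
    }

    impl std::fmt::Display for Error {
        fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
            write!(f, "hyper 1.x error")
        }
    }

    impl std::error::Error for Error {}
}

// Hypothetical stand-in for a transport error that wraps an inner error.
#[derive(Debug)]
pub struct TransportError {
    pub inner: Box<dyn Error + Send + Sync + 'static>,
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "transport error")
    }
}

impl Error for TransportError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.inner)
    }
}

// Walks the source chain but only recognizes the 1.x type (the broken behavior).
pub fn is_canceled_v1_only(err: &(dyn Error + 'static)) -> bool {
    let mut current = Some(err);
    while let Some(e) = current {
        if let Some(h) = e.downcast_ref::<hyper_v1::Error>() {
            return h.is_canceled();
        }
        current = e.source();
    }
    false
}

// Walks the source chain and recognizes the error type from either version.
pub fn is_canceled_either_version(err: &(dyn Error + 'static)) -> bool {
    let mut current = Some(err);
    while let Some(e) = current {
        if let Some(h) = e.downcast_ref::<hyper_v0::Error>() {
            return h.is_canceled();
        }
        if let Some(h) = e.downcast_ref::<hyper_v1::Error>() {
            return h.is_canceled();
        }
        current = e.source();
    }
    false
}
```

With a TransportError wrapping a canceled hyper_v0::Error, is_canceled_v1_only returns false while is_canceled_either_version returns true, which mirrors the regression described above.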

Alternatives Considered

Fixes #

Notes for reviewer:

This fixes the backoff/retry mechanism in the context of connection failures, such as:

Execution of hedera::ping_query::PingQuery on node at index 4 / node id 0.0.7 failed due to Permanent(GrpcStatus(Status { code: Unknown, message: "transport error", source: Some(tonic::transport::Error(Transport, hyper::Error(Canceled, "connection was not ready"))) }))

That error should be retried, but is not: the hyper error in a canceled state is not detected, so the failure is reported as Permanent.
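To illustrate why the classification matters, here is a rough sketch of a loop that moves on to the next node after a transient failure and stops on a permanent one. The Failure enum and execute_on_nodes are hypothetical names for illustration only; the SDK's real execution logic (including its backoff timing, which is omitted here) may differ.

```rust
// Hypothetical classification of a failed attempt.
#[derive(Debug, PartialEq)]
pub enum Failure {
    // Worth trying again, for example on another node.
    Transient(String),
    // Not worth retrying; surfaced to the caller immediately.
    Permanent(String),
}

// Tries each node index in order. A transient failure moves on to the next
// node; a permanent failure aborts right away. No backoff delay is modeled.
pub fn execute_on_nodes<T>(
    node_count: usize,
    mut call: impl FnMut(usize) -> Result<T, Failure>,
) -> Result<T, Failure> {
    let mut last_transient = None;
    for node_index in 0..node_count {
        match call(node_index) {
            Ok(value) => return Ok(value),
            Err(Failure::Transient(reason)) => last_transient = Some(reason),
            Err(permanent) => return Err(permanent),
        }
    }
    // Every node failed transiently (or there were no nodes at all).
    Err(Failure::Transient(
        last_transient.unwrap_or_else(|| "no nodes available".to_string()),
    ))
}
```

If a canceled hyper error is misclassified as Failure::Permanent in a loop like this, the first broken connection ends the request even though another node could have served it.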

Logs after this PR is applied (the same behavior as before the breaking hyper upgrade linked above was applied):

2024-09-08T20:51:48.375888Z DEBUG hedera::execute: Preparing hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 3 / node id 0.0.6    
2024-09-08T20:51:48.375938Z DEBUG hedera::execute: Executing hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 3 / node id 0.0.6    
2024-09-08T20:51:48.377356Z  WARN hedera::execute: Execution of hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 3 / node id 0.0.6 will continue due to GrpcStatus(Status { code: Unknown, message: "transport error", source: Some(tonic::transport::Error(Transport, hyper::Error(Canceled, "connection closed"))) })    
2024-09-08T20:51:48.377530Z DEBUG hedera::execute: Preparing hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 1 / node id 0.0.4    
2024-09-08T20:51:48.377571Z DEBUG hedera::execute: Executing hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 1 / node id 0.0.4    
2024-09-08T20:51:48.463582Z DEBUG hedera::execute: Execution of hedera::query::Query<hedera::account::account_balance_query::AccountBalanceQueryData> on node at index 1 / node id 0.0.4 succeeded

Checklist