hyperledger-labs / fabric-operator

Hyperledger Fabric Kubernetes Operator
Apache License 2.0

Address connection issues with peers/orderers when deploying chaincodes on VPC clusters #126

Closed by asararatnakar 1 year ago

asararatnakar commented 1 year ago

Deliver streams that fail with "context finished before block retrieved" can be caused by a keepalive interval that is too long. This has been seen on Kubernetes and OpenShift VPC clusters.

2023-04-25 14:48:52.900 UTC 00ed INFO [comm.grpc.server] 1 -> streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=172.17.51.147:17507 error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=50.003833785s
2023-04-25 14:52:03.259 UTC 00ee INFO [comm.grpc.server] 1 -> streaming call completed grpc.service=orderer.AtomicBroadcast grpc.method=Deliver grpc.peer_address=172.17.47.7:19993 error="context finished before block retrieved: context canceled" grpc.code=Unknown grpc.call_duration=50.002278835s

The root cause is that the keepalive intervals configured for the peers and orderers are too high. After reducing them in tests, no connection issues were seen.

For the orderer:

    # Keepalive settings for the GRPC server.
    Keepalive:
        ServerMinInterval: 60s ## --> changing this default to 25s solved some connection issues

Similarly, for the peer:

    # Keepalive settings for peer server and clients
    keepalive:
        ...
        minInterval: 60s ## --> change this to 25s
        # Client keepalive settings for communicating with other peer nodes
        client:
            interval: 60s ## --> change this to 30s
            ...
        # DeliveryClient keepalive settings for communication with ordering
        # nodes.
        deliveryClient:
            interval: 60s ## --> change this to 30s