Avoid need for sleeps after creating channel using API

bestbeforetoday commented 1 year ago

The end-to-end test suite currently contains arbitrary sleeps to give the network time to become stable before completing chaincode lifecycle deployment. These sleeps only appear to be required when the channel is created using the admin API. When the channel is created using the test-network scripts, chaincode deployment seems to work without sleeps.

Maybe it would be good to either:

Have the channel commands (perhaps optionally) check completion of configuration before returning so the system is in a fully usable state immediately after creation completes; or
Extend the channel API to allow client code to check for a stable system state.

bestbeforetoday commented 1 year ago

@channingduan @SamYuan1990 What do you think?

SamYuan1990 commented 1 year ago

In my point of view, the reason for sleep: as the some operator need a block to delivered to each peer as conformation. hence the sleep is added to ensure the next step will start after the previous block been completed.

do you want admin-sdk api listen the result after each operator? so ... this kind of checking will handle by admin-sdk itself, so for user, just need to focus on their operators?

any edge case been considered as timeout? replay attack?

bestbeforetoday commented 1 year ago

The chaincode API calls that submit transactions already waits for blocks to be committed before returning. This is handled by Fabric Gateway. Where channel API calls need to wait for blocks to commit, we could also make use of the Gateway API (or Gateway gRPC services directly if neccessary) to wait for commit of blocks, at least on the Gateway peer. Timeout can be controlled by the client-supplied Context.

Note that the Gateway only checks that transactions are committed to the local peer ledger. This is generally sufficient since it prefers highest block height peers for subsequent invocations. I'm not sure if this is sufficient for all the channel operations. Anything is better than having to guess an arbitrary delay time after performing operations.

It might be worth checking what the test-network scripts and/or existing CLI commands are doing.

SamYuan1990 commented 1 year ago

The chaincode API calls that submit transactions already waits for blocks to be committed before returning. This is handled by Fabric Gateway. Where channel API calls need to wait for blocks to commit, we could also make use of the Gateway API (or Gateway gRPC services directly if neccessary) to wait for commit of blocks, at least on the Gateway peer. Timeout can be controlled by the client-supplied Context.

Note that the Gateway only checks that transactions are committed to the local peer ledger. This is generally sufficient since it prefers highest block height peers for subsequent invocations. I'm not sure if this is sufficient for all the channel operations. Anything is better than having to guess an arbitrary delay time after performing operations.

It might be worth checking what the test-network scripts and/or existing CLI commands are doing.

I think it's better for gateway to handle network related case (for ex timeout). for admin sdk, if all network through gateway, we can remove sleep timeout.

SamYuan1990 commented 1 year ago

Another option is that we provide an api interface to allow user inject their own "call back"(some thing as post action hook or in an AOP programming model of view, a hook after network IO to peer/order), and we implemented a default "call back"... but I suppose which duplicated with gateway client's work.

SamYuan1990 commented 1 year ago

And in most of case, for example, if user want to check a block been confirmed by multiple peers, all operators(not only channel creation) should checked among the same scope. Hence in fact, the "call back" did the similar logic as gateway sdk but targeting on multiple peers. As a summary, I suppose as this logic already covered by gateway sdk, hence we don't need a duplicate implementation in admin-sdk.

SamYuan1990 commented 1 year ago

I just created a pr to see what if we removed sleep in our e2e test case. for safety reason, we'd better provide a check mechanism.

SamYuan1990 commented 1 year ago

I was double checked fabric sample today https://github.com/hyperledger/fabric-samples/blob/67ae2c9d025907840603cf38eaf3b39e511c9d19/test-network/scripts/ccutils.sh#L105-L115

I am not sure if a sleep wait is a best option for us or not? as we have to wait for 1st block success, which means we need to check the block height or channel config block height through hole blockchain network?

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

hyperledger / fabric-admin-sdk

Avoid need for sleeps after creating channel using API #100