Current behavior
Currently, block-store is executing AddBlock() twice for every received block.
Expected behavior
How should the block store receive new blocks?
There are two different design options for what block-store should do:
(a) chain SHOULD NOT be aware of the block store microservice specifically. Therefore, chain should generically send a broadcast.block_accepted message to all microservices that want to do something when a block is accepted. Since block-store happens to be such a microservice, it is the responsibility of block-store to subscribe to broadcast.block_accepted.
(b) chain SHOULD be aware of the block store microservice specifically. Therefore, chain should specifically send a rpc.block_store.add_block_request message to the block_store microservice. Since block-store expects to get blocks from chain via its normal RPC mechanism, block-store doesn't have to subscribe or do anything special to get blocks from chain. Rather, it's the responsibility of chain to send block-store the blocks via a direct RPC.
The expected behavior is to do either (a) or (b). The current code does both, which causes every block to be stored twice.
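To make the double store concrete, here is a toy sketch of the two wiring options using a minimal in-memory stand-in for RabbitMQ. None of these types come from the Koinos codebase; the real MQ client API differs.

```go
package main

import "fmt"

type Block struct{ Height uint64 }

// Bus is a stand-in for RabbitMQ: Publish is fire-and-forget to all
// subscribers, Call is a synchronous RPC to a single handler.
type Bus struct {
	subscribers map[string][]func(Block)
	handlers    map[string]func(Block)
}

func NewBus() *Bus {
	return &Bus{
		subscribers: map[string][]func(Block){},
		handlers:    map[string]func(Block){},
	}
}

func (b *Bus) Subscribe(topic string, f func(Block)) {
	b.subscribers[topic] = append(b.subscribers[topic], f)
}

func (b *Bus) Publish(topic string, blk Block) {
	for _, f := range b.subscribers[topic] {
		f(blk)
	}
}

func (b *Bus) Handle(topic string, f func(Block)) { b.handlers[topic] = f }
func (b *Bus) Call(topic string, blk Block)       { b.handlers[topic](blk) }

// demo wires up BOTH option (a) and option (b), as the current code
// does, and returns how many times the block gets stored.
func demo() int {
	bus := NewBus()
	stored := 0
	addBlock := func(Block) { stored++ }

	// Option (a): block-store subscribes to the generic broadcast.
	bus.Subscribe("broadcast.block_accepted", addBlock)
	// Option (b): block-store exposes an RPC endpoint for chain to call.
	bus.Handle("rpc.block_store.add_block_request", addBlock)

	// chain accepts one block and currently does both deliveries.
	blk := Block{Height: 1}
	bus.Publish("broadcast.block_accepted", blk)
	bus.Call("rpc.block_store.add_block_request", blk)
	return stored
}

func main() {
	fmt.Println("times stored:", demo()) // prints "times stored: 2"
}
```

Picking either delivery path alone (and removing the other) makes `demo()` return 1.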
Steps to reproduce
Add log.Infof("AddBlock(%d)", record.BlockHeight) to AddBlock() and attempt to sync to mainnet. You will see two calls with the same block height for every block.
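A minimal sketch of that instrumentation, using toy types in place of the real block-store record and `fmt.Printf` in place of `log.Infof` so it is self-contained:

```go
package main

import "fmt"

// BlockRecord is a stand-in for the real block-store record type.
type BlockRecord struct {
	BlockHeight uint64
}

// BlockStore is a toy in-memory store; Calls counts AddBlock invocations
// so the duplicate delivery is visible.
type BlockStore struct {
	blocks map[uint64]BlockRecord
	Calls  int
}

func NewBlockStore() *BlockStore {
	return &BlockStore{blocks: map[uint64]BlockRecord{}}
}

// AddBlock logs each call, mirroring the instrumentation described
// above, so duplicate deliveries show up as two identical log lines.
func (s *BlockStore) AddBlock(record BlockRecord) {
	s.Calls++
	fmt.Printf("AddBlock(%d)\n", record.BlockHeight)
	s.blocks[record.BlockHeight] = record
}

func main() {
	s := NewBlockStore()
	b := BlockRecord{BlockHeight: 42}
	s.AddBlock(b) // delivered via broadcast.block_accepted
	s.AddBlock(b) // delivered again via rpc.block_store.add_block_request
	// Prints "AddBlock(42)" twice: the symptom seen when syncing to mainnet.
}
```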
Environment
- OS: Ubuntu 22.04
Anything else?
This issue also affects chain: specifically, if we choose (a), we should remove the direct block store RPC call from chain.
Should we choose (a) or (b)? Both seem reasonable at first glance; in fact, (a) seems slightly superior, as it makes the chain and block-store microservices less tightly coupled. However, (b) has the advantage of backpressure:
- Not only does chain call the block-store RPC, it also waits for a response (important!).
- This means controller::submit_block doesn't return until block-store has finished saving the block.
- In turn (AFAICT), the p2p layer ties up the peer handler until the chain RPC returns, which means a single peer can only have one submit_block in flight at a time.
Basically this means that if you're receiving blocks faster than you can save them, with (a) you can get into an unpleasant situation where block-store falls further and further behind an unboundedly growing backlog of unsaved broadcast notifications in RabbitMQ. I'm not sure whether RabbitMQ will eventually give up and discard some of those notifications (this seems bad: accepted blocks might not be stored), or whether the backlog in RabbitMQ will just grow and grow. (It wouldn't be truly unbounded; you'll reach the current head block at some point, and then new blocks will only arrive as they're produced, which means the rate of incoming blocks should fall to something manageable and the RabbitMQ backlog should start to shrink. But if Koinos history grows to the same size as, say, Steem / Hive, this effect could easily result in many millions of block notifications stuck in RabbitMQ.)
With (b), by contrast, when receiving blocks faster than you can save them, AFAICT the excess will pile up in p2p, which will hopefully handle things more gracefully (e.g. we'll just stop requesting more blocks from a peer if we still have a lot of unprocessed blocks from that peer).
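The backlog behavior can be illustrated with a toy simulation (not Koinos code; the rates and tick model are made up purely to show the shape of the problem). Blocks arrive three times faster than they can be saved:

```go
package main

import "fmt"

// simulateBroadcast models option (a): the producer fires notifications
// into the queue without waiting, so whenever arrivalRate exceeds
// saveRate the backlog of unsaved notifications grows every tick.
func simulateBroadcast(ticks, arrivalRate, saveRate int) int {
	backlog := 0
	for t := 0; t < ticks; t++ {
		backlog += arrivalRate // fire-and-forget block_accepted messages
		if backlog >= saveRate {
			backlog -= saveRate // block-store drains what it can
		} else {
			backlog = 0
		}
	}
	return backlog
}

// simulateRPC models option (b): each submit_block blocks until
// AddBlock returns, so only saveRate blocks are accepted per tick and
// the excess stays queued in p2p rather than in RabbitMQ.
func simulateRPC(ticks, arrivalRate, saveRate int) int {
	backlog := 0
	for t := 0; t < ticks; t++ {
		accepted := saveRate
		if accepted > arrivalRate {
			accepted = arrivalRate
		}
		backlog += accepted
		backlog -= saveRate
		if backlog < 0 {
			backlog = 0
		}
	}
	return backlog
}

func main() {
	// 100 ticks, 3 blocks arriving per tick, 1 block saved per tick.
	fmt.Println("broadcast backlog:", simulateBroadcast(100, 3, 1)) // grows linearly: 200
	fmt.Println("rpc backlog:      ", simulateRPC(100, 3, 1))       // stays at 0
}
```

The broadcast backlog grows by (arrival rate - save rate) per tick with no upper bound, while the synchronous RPC keeps the block-store side bounded and pushes the queuing back to p2p.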