AntelopeIO / spring

C++ implementation of the Antelope protocol with Savanna consensus
Other
5 stars 2 forks source link

a fork removing the start of savanna transition will cause a state history failure with `finality-data-history` enabled #289

Closed spoonincode closed 2 months ago

spoonincode commented 3 months ago

Notice how ship plugin writes to the finality data log, https://github.com/AntelopeIO/spring/blob/391be54bc06b9690d0f2753e59eb9d791ed429d4/plugins/state_history_plugin/state_history_plugin.cpp#L189-L200

If a finality_data_t is returned, it is written to the finality data log. Otherwise, nothing is done with the log.

Remember though that ship log entries must be continuous: there can be no gaps in them.

Consider that savanna transition begins on block 200, and continues for another block 201. This will write two entries in to ship's finality data log: block 200 and 201. Now consider that a new fork is switched to back on block 198, and that this fork continues without savanna activation; say it continues on to block 220.

At this point, ship will continue to indicate a finality data log of blocks 200 through 201 and even return finality data for those blocks even though that never happened from the standpoint of the current fork being followed.

But then consider on block 221 there is another attempt to transition to savanna. This will cause ship to attempt to write to block 221 of the finality log, but this will be flagged as a discontinuity in ship due to the gap between 201 and 221. Ultimately this will cause the node to shut down.

linh2931 commented 3 months ago

Should we stop sending finality_data of transitional blocks, and only send them after transition to Savanna is finalized? Need some mechanism to coordinate SHiP and the controller.

spoonincode commented 3 months ago

My initial thought at two easy ways to potentially resolve the risk:

1) always write out the entire std::optional<finality_data_t>, or 2) add a new clear() to the log/log_catalog, which would look something like,

if(!finality_data.has_value()) {
   finality_data_log->clear();
   return; 
}

(clear() could be quite fast and straightforward if log is already empty() -- just do nothing)

The second option is more "forward looking" I think, since finality_data_t will nominally always be around going forward.