piplabs / story

Official repo for the Story L1 consensus client, contracts, and associated tooling.
GNU General Public License v3.0

Investigate what can fail during CompleteUnbonding #193

Open 0xHansLee opened 2 days ago

0xHansLee commented 2 days ago

Description and context

During the Sep 26 network halt incident, we found that there may be a mismatch between what we expect cosmos to unbond and what cosmos actually unbonded.

So we need to investigate and list all the scenarios in which CompleteUnbonding doesn't unbond an unbonding entry. This can help us identify the root cause of the network halt.

Experienced behavior

Some unbonding entries did not complete properly. As a result, the spendable amount was less than the unbonding amount, which prevented the withdrawal from being processed.

Expected behavior

If the unbonding is not complete, the withdrawal corresponding to that unbonding should not be processed.

Solution recommendation

When withdrawing in the EndBlock of the evmstaking module, double check that the corresponding unbonding is complete and only process the withdrawal for completed unbonding.
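A minimal sketch of the recommended guard, assuming each queued withdrawal can be linked to the ID of its unbonding entry. The `Withdrawal` type, the `completed` set, and `processWithdrawals` are hypothetical names for illustration and are not the evmstaking module's actual API.

```go
package main

import "fmt"

// Withdrawal is a hypothetical queued withdrawal tied to one unbonding entry.
type Withdrawal struct {
	UnbondingID uint64
	Amount      int64
}

// processWithdrawals processes a withdrawal only if its unbonding entry is
// marked completed and the spendable balance covers it. Everything else is
// deferred instead of being processed against a balance that is too small.
func processWithdrawals(queue []Withdrawal, completed map[uint64]bool, spendable int64) (processed, deferred []Withdrawal) {
	for _, w := range queue {
		if !completed[w.UnbondingID] || w.Amount > spendable {
			deferred = append(deferred, w)
			continue
		}
		spendable -= w.Amount
		processed = append(processed, w)
	}
	return processed, deferred
}

func main() {
	queue := []Withdrawal{{UnbondingID: 1, Amount: 40}, {UnbondingID: 2, Amount: 60}}
	completed := map[uint64]bool{1: true} // entry 2 has not completed yet
	processed, deferred := processWithdrawals(queue, completed, 100)
	fmt.Println("processed:", len(processed), "deferred:", len(deferred))
}
```

Deferring rather than dropping keeps the withdrawal in the queue, so it can be retried in a later block once the unbonding completes.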

0xHansLee commented 2 days ago

Unfortunately, no log is printed when an error occurs while completing an unbonding in cosmos-sdk. It just skips the failed unbonding and continues with the next one. Thus, it was hard to figure out what caused this incident in CompleteUnbonding. Instead, I listed the potential points of error in CompleteUnbonding.

Before listing the possible error points in CompleteUnbonding, I will briefly explain the workflow of CompleteUnbonding.

First, the unbonding delegation is retrieved using the delegator address and validator address. The fetched unbonding delegation contains an array of UnbondingDelegationEntry structures, representing the delegations that the delegator has unbonded from a specific validator. Each entry has a unique ID that increments by 1.

Next, the process iterates over the entries that have matured and are not held in any external module. The unbonding is processed in a for loop. Entries that are completed are removed, and the unbonding index is deleted. The unbonding index is a mapping between the entry ID and the unbonding key, which is composed of the delegator address and validator address. Afterward, the undelegation is processed from the NotBondedPoolName module to the respective account. If no entries remain, the unbonding delegation is deleted. If entries remain, the unbonding delegation is updated.

The potential error points are as follows:

1. GetUnbondingDelegation: Fetch the unbonding delegation.

  1. Key is nil (key := types.GetUBDKey(delAddr, valAddr)).
  2. Fetched value is nil (ErrNoUnbondingDelegation).
  3. UnmarshalUBD error.

2. BondDenom: Retrieve bond denom from params.

  1. Key is nil (key := types.ParamsKey).
  2. Unmarshal error.

3. StringToBytes(delegatorAddress) in AddressCodec: Convert the string address to a byte array.

4. DeleteUnbondingIndex: Delete the index for the entry’s UnbondingId from the store.

  1. Error occurs in store.Delete if the key is nil.

5. UndelegateCoinsFromModuleToAccount: Errors that occur within the UndelegateCoins function are returned.

  1. The module account retrieved is nil (module: NotBondedPoolName).
  2. The amount is invalid (non-positive, invalid denom, duplicate denom, unsorted denom).
  3. Errors in subUnlockedCoins: removing unlocked tokens from the module account.
  4. Errors in trackUndelegation: if the delegator account is a vesting account, the undelegation is tracked.
  5. Errors in addCoins: adding the coins to the delegator’s account.

6. RemoveUnbondingDelegation: When no unbonding entries remain.

  1. Error in converting delegatorAddress to bytes in AddressCodec (similar to point 3).
  2. Error in converting validatorAddress to bytes.
  3. Error occurs when deleting the key from the store (key := types.GetUBDKey(delegatorAddress, validatorAddress)).
  4. Error occurs when deleting the key from the store (key := types.GetUBDByValIndexKey(delegatorAddress, validatorAddress)).

7. SetUnbondingDelegation: When unbonding entries remain.

  1. Error in converting delegatorAddress to bytes in AddressCodec (similar to point 3).
  2. Error in converting validatorAddress to bytes.
  3. Error occurs when setting the key in the store (key := types.GetUBDKey(delegatorAddress, validatorAddress)).
  4. Error occurs when setting the key in the store (key := types.GetUBDByValIndexKey(delegatorAddress, validatorAddress)).
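To make the "amount is invalid" case in point 5 concrete, here is a simplified model of the four listed conditions (non-positive amount, invalid denom, duplicate denom, unsorted denoms). It is a sketch only: `Coin`, `validateCoins`, and the denom pattern are assumptions for illustration, and the real check in cosmos-sdk may differ in detail.

```go
package main

import (
	"fmt"
	"regexp"
)

// Coin is a simplified stand-in for a coin with an integer amount.
type Coin struct {
	Denom  string
	Amount int64
}

// denomPattern is an assumed denom format for this sketch.
var denomPattern = regexp.MustCompile(`^[a-zA-Z][a-zA-Z0-9/:._-]{2,127}$`)

// validateCoins rejects coin sets with an invalid denom, a non-positive
// amount, a repeated denom, or denoms that are not in ascending order.
func validateCoins(coins []Coin) error {
	prev := ""
	for i, c := range coins {
		switch {
		case !denomPattern.MatchString(c.Denom):
			return fmt.Errorf("invalid denom %q", c.Denom)
		case c.Amount <= 0:
			return fmt.Errorf("non-positive amount for %s", c.Denom)
		case i > 0 && c.Denom == prev:
			return fmt.Errorf("duplicate denom %s", c.Denom)
		case i > 0 && c.Denom < prev:
			return fmt.Errorf("denom %s is not sorted", c.Denom)
		}
		prev = c.Denom
	}
	return nil
}

func main() {
	fmt.Println(validateCoins([]Coin{{"stake", 10}, {"uatom", 5}}))
	fmt.Println(validateCoins([]Coin{{"uatom", 5}, {"stake", 10}}))
}
```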