nervosnetwork / fiber

19 stars 10 forks source link

Properly handle inconsistent view of active tlcs while constructing commitment transaction for different parties #73

Closed contrun closed 4 months ago

contrun commented 4 months ago

The problem

It is crucial to have the same list of active tlcs while signing and verifying commitment transaction. Otherwise, we will build different args for commitment lock (active tlcs are part of commitment lock args), thus different messages while signing and verifying signatures.

In https://github.com/nervosnetwork/cfn-node/pull/72, I naively thought we can build the same active tlc list if we have received a RevokeAndAck message from the peer.

        if let ChannelState::ChannelReady(flags) = self.state {
            if flags.contains(ChannelReadyFlags::AWAITING_REMOTE_REVOKE) {
                // Remote has not confirmed our latest commitment transaction.
                // We may have inconsistent view of the active tlcs.
                // This should not happen.
                panic!(
                    "Trying to obtain list of active received tlcs while remote has not confirmed our latest commitment transaction, should have verified we are not in this state before calling this function.",
contrun marked this conversation as resolved.
                )
            }
        }

Example error case where naive solution above fails

This is actually not true. Imagine the following case, say Alice's intention is to add tlc1 to Bob (by sending a Alice_AddTlc to Bob), and commit this tlc1 to a commitment transaction (by sending a commitment message Alice_CommitmentSigned). She can send the following messages to Bob.

Alice_AddTlc, Alice_CommitmentSigned

Bob, who unknowingly also wants to add a tlc2 to Alice. He sends two messages to Alice

Bob_AddTlc, Bob_CommitmentSigned

Depending on the person, these events can have different orders. For example, it is possible for Alice to believe the events are in the following order.

Alice_AddTlc, Alice_CommitmentSigned, Bob_AddTlc, Bob_CommitmentSigned

If this is the case, she may believe that Alice_AddTlc should be added to her current commitment transaction (because it is before the event Alice_CommitmentSigned), while Bob_AddTlc should be added to her next commitment transaction (because it is after the event Alice_CommitmentSigned).

On the other hand, Bob may believe the events are in the following order

Bob_AddTlc, Bob_CommitmentSigned, Alice_AddTlc, Alice_CommitmentSigned

In this case, on receiving Alice_CommitmentSigned, Bob believes that both Bob_AddTlc and Alice_AddTlc should be included in Alice's current commitment transaction. Now we have different views for active tlcs for different person, thus we build different commitment transactions. It is the same for Alice to verify Bob's CommitmenSigned message.

A possible solution

I think this problem can be solved by every time one person (say A) is verifying another person's (say B's) partial signature to a commitment transaction. A first constructs the commitment transaction as usual. If the partial signature to the commitment transaction is invalid, then there are two possibilities. Either B was cheating or B missed some tlc updates from A. If it is the later case, then A must have initiated some tlc update actions (e.g. offered a tlc to B or request to remove a tlc from B) within this period. So we can check A's local channel state to see if there are indeed this kind of actions. If there are, we can try to reconcile stateb updates between A and B. That is, we need to do the same thing as when A and B reestablish channel to make sure both A and B catch up with each other.

This seems to work. But we need to give some more thoughts over this, and pin down some security implications (if any).

contrun commented 4 months ago

This is fixed in https://github.com/nervosnetwork/cfn-node/pull/72. The solution was simply not to include any tlc whose state was ambiguous to both parties, and wait until the other party to send RevokeAndAck message to confirm that this tlc is included in the commitment transaction. In the above example, when verifying Alice's CommitmentSigned message has the correct partial signature, Bob should never include Bob_AddTlc while building the commitment transaction.