Optimize protocol design

Refer to protocol design v0.4

The points I try to optimize are as follows:

- Simplify on-chain interactions when unilaterally closing a channel. There are 5 stages in design v0.4: opening -> closing -> clearing -> finalized -> transfer-assets. A unilateral close involves 4 types of transactions, and at least 4 transactions need to be constructed and submitted. I think it's possible to reduce this to opening -> closing -> settlement, which simplifies the interaction flow.
- Reduce on-chain data usage.

Here are the details:

Open channel

- inputs
    - capacity provide cell
        - capacity: 100 CKB
        - lockscript: A
    - capacity provide cell
        - capacity: 200 CKB
        - lockscript: B
    - other cells to provide channel cell capacity and tx fee
- outputs
    - channel cell
        - typescript: PCT
            - args: channel-id
        - lockscript: anyone-can-unlock
        - data
            - fixed
                - pubkey_hashes: [pubkey_hash of A, pubkey_hash of B]
                - closing_period: 1000
            - dynamic
                - state: OPENING
                - version: 0
                - balances: [100 CKB, 200 CKB]
                - remaining_update_times: [5, 5]
    - asset cell
        - capacity: 300 CKB
        - lockscript: PCL

The differences from v0.4:

Remove asset field. Because we already have the asset cell, I think it's better to get asset-info from the asset cell.
Change challenge_period and clear_period to closing_period, closing_period >= max(challenge_period, clear_period).
Change update_time_limit and update_times to remaining_update_times.
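
To make the proposed layout concrete, here is a minimal Python sketch of the channel cell data from the open-channel example. The class names, the numeric state values and the `open_channel` helper are assumptions for illustration only; the actual on-chain encoding is not specified in this proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class State(Enum):
    # Numeric values are placeholders, not part of the design.
    OPENING = 0
    CLOSING = 1


@dataclass(frozen=True)
class FixedData:
    pubkey_hashes: Tuple[bytes, ...]
    closing_period: int


@dataclass
class DynamicData:
    state: State
    version: int
    balances: List[int]                # in CKB
    remaining_update_times: List[int]


def open_channel(pubkey_hashes, capacities, closing_period, update_times):
    """Build the initial channel cell data and the asset cell capacity."""
    fixed = FixedData(tuple(pubkey_hashes), closing_period)
    dynamic = DynamicData(State.OPENING, 0, list(capacities),
                          [update_times] * len(capacities))
    asset_capacity = sum(capacities)   # both deposits are pooled in one asset cell
    return fixed, dynamic, asset_capacity


# Hypothetical 20-byte pubkey hashes stand in for A and B.
fixed, dynamic, asset_capacity = open_channel(
    [b"A" * 20, b"B" * 20], [100, 200], 1000, 5)
print(dynamic.balances, dynamic.remaining_update_times, asset_capacity)
```

Note that there is no asset field and only a single closing_period, mirroring the differences listed above.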

Unilateral close channel

- header_deps
    - header_contains_htlc0_proof_cell
- cell_deps
    - htlc0_proof_cell
- inputs
    - old channel cell
        - data
            - dynamic
                - state: OPENING
                - version: 0
                - balances: [100 CKB, 200 CKB]
                - remaining_update_times: [5, 5]
- outputs
    - new channel cell
        - data
            - dynamic
                - state: CLOSING
                - version: 100
                - balances: [160 CKB, 100 CKB]
                - htlcs_merkle_root
                - proved_htlcs: 0x01
                - remaining_update_times: [4, 5]
- witnesses
    - UnilateralCloseChannel
        - data
            - channel_id
            - version: 100
            - balances: [50 CKB, 100 CKB]
            - htlcs_merkle_root
                - htlc0_merkle_proof
                  - to: 0
                  - amount: 110 CKB
                  - lock
                  - preimage_timeout: 100
        - signatures
        - submitter_signature of A

The differences from v0.4:

The channel cell data no longer records htlcs-info; it records htlcs_merkle_root and proved_htlcs instead (the latter guards against the same htlc being proved twice). proved_htlcs records which htlcs have been proved by a proof_cell, and its data type is a bitmap (e.g. with 8 htlcs, if htlcs 0 and 2 have been proved, proved_htlcs is the single byte 0x05).
When submitting an off-chain commitment, the participant should include the htlc proof data (header_deps, cell_deps), and all proved htlcs are settled directly (pay attention to tx_outputs/channel_cell_data/balances).
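
A minimal sketch of the bitmap bookkeeping, assuming htlc `i` maps to bit `i % 8` of byte `i // 8` (this matches the 0x05 example, but the layout beyond one byte is an assumption). The helper names `mark_proved` and `settle_proved` are hypothetical, and merkle proof verification is left out.

```python
def mark_proved(proved_htlcs: bytes, index: int) -> bytes:
    """Set bit `index` in the bitmap, growing the bitmap when needed.

    Raises ValueError if the htlc was already proved (duplicate proof).
    """
    buf = bytearray(proved_htlcs)
    byte_i, bit_i = divmod(index, 8)
    if byte_i >= len(buf):
        buf.extend(bytes(byte_i + 1 - len(buf)))  # pad with zero bytes
    if buf[byte_i] & (1 << bit_i):
        raise ValueError(f"htlc {index} already proved")
    buf[byte_i] |= 1 << bit_i
    return bytes(buf)


def settle_proved(balances, htlc):
    """Credit a proved htlc to its receiver and return the new balances."""
    new_balances = list(balances)
    new_balances[htlc["to"]] += htlc["amount"]
    return new_balances


# htlcs 0 and 2 proved out of 8
bitmap = mark_proved(mark_proved(b"\x00", 0), 2)
print(bitmap.hex())

# The unilateral close above: witness balances [50, 100], htlc0 pays 110 CKB to party 0
balances = settle_proved([50, 100], {"to": 0, "amount": 110})
print(balances)
```

The second call reproduces the [160 CKB, 100 CKB] balances written to the new channel cell.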

Counterparty update channel

- header_deps
    - header_contains_htlc1_proof_cell
- cell_deps
    - htlc1_proof_cell
- inputs
    - old channel cell
        - data
            - dynamic
                - state: CLOSING
                - version: 100
                - balances: [160 CKB, 100 CKB]
                - htlcs_merkle_root
                - proved_htlcs: 0x01
                - remaining_update_times: [4, 5]
- outputs
    - new channel cell
        - data
            - dynamic
                - state: CLOSING
                - version: 100
                - balances: [160 CKB, 120 CKB]
                - htlcs_merkle_root
                - proved_htlcs: 0x03
                - remaining_update_times: [4, 4]
- witnesses
    - htlc1_merkle_proof
        - to: 1
        - amount: 20 CKB
        - lock
        - preimage_timeout: 100
    - submitter_signature of B

The differences from v0.4:

Participant B submitted htlc1_merkle_proof (participant A did not submit it on B's behalf), and htlc1 was settled directly in this transaction (pay attention to tx_outputs/channel_cell_data/balances).
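
The state transition in this transaction can be sketched as follows. This is only an illustration: the dict layout, the `counterparty_update` name and the use of a plain integer for proved_htlcs are assumptions, and signature and merkle proof checks are omitted.

```python
def counterparty_update(cell, submitter, htlc_index, htlc):
    """Apply one htlc proof to a CLOSING channel cell and return the new cell data."""
    if cell["state"] != "CLOSING":
        raise ValueError("channel is not closing")
    if cell["remaining_update_times"][submitter] <= 0:
        raise ValueError("submitter has no updates left")
    bit = 1 << htlc_index
    if cell["proved_htlcs"] & bit:
        raise ValueError("htlc already proved")

    new_cell = dict(cell)  # version and htlcs_merkle_root stay unchanged
    new_cell["balances"] = list(cell["balances"])
    new_cell["balances"][htlc["to"]] += htlc["amount"]  # settle directly
    new_cell["proved_htlcs"] = cell["proved_htlcs"] | bit
    new_cell["remaining_update_times"] = list(cell["remaining_update_times"])
    new_cell["remaining_update_times"][submitter] -= 1
    return new_cell


old_cell = {
    "state": "CLOSING",
    "version": 100,
    "balances": [160, 100],
    "proved_htlcs": 0x01,
    "remaining_update_times": [4, 5],
}
new_cell = counterparty_update(old_cell, submitter=1, htlc_index=1,
                               htlc={"to": 1, "amount": 20})
print(new_cell)
```

With B (index 1) as submitter, the result matches the new channel cell in the example: balances [160, 120], proved_htlcs 0x03, remaining_update_times [4, 4].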

Settle channel

- inputs
    - asset cell
        - capacity: 300 CKB
        - lock: PCL
    - old channel cell
        - since: after closing period
        - data
            - dynamic
                - state: CLOSING
                - version: 100
                - balances: [160 CKB, 120 CKB]
                - htlcs_merkle_root
                - proved_htlcs: 0x03
                - remaining_update_times: [4, 4]
- outputs
    - asset cell 0
        - capacity: 180 CKB
    - asset cell 1
        - capacity: 120 CKB
- witnesses
    - htlc2_merkle_proof
        - to: 1
        - amount: 20 CKB
        - lock
        - preimage_timeout: 100

The differences from v0.4:

After the closing period, a participant can settle the channel.
Unproved htlcs (which should be refunded) can be settled here. If the assets cell is exhausted (all unproved htlcs have been settled), the channel cell can be destroyed. Otherwise, the tx outputs should keep the channel cell for the remaining unsettled unproved htlcs.
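
A small sketch of the payout arithmetic at settlement. It assumes a two-party channel in which an unproved htlc is refunded to the party other than `to`; that reading is inferred from the 180/120 outputs in the example, and the `settle` helper is hypothetical.

```python
def settle(balances, unproved_htlcs):
    """Return the final payout per party after refunding unproved htlcs."""
    payouts = list(balances)
    for htlc in unproved_htlcs:
        sender = 1 - htlc["to"]          # two-party channel: the other side sent it
        payouts[sender] += htlc["amount"]
    return payouts


# Channel cell balances [160, 120]; htlc2 (to: 1, 20 CKB) was never proved.
payouts = settle([160, 120], [{"to": 1, "amount": 20}])
print(payouts, sum(payouts))
```

The payouts add up to the 300 CKB held in the input asset cell.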

update_times_limit should only be limited to version update

The design here mixes version updates and htlc updates together, which may cause some problems.

If we have thousands of htlcs in one channel, we cannot provide all proofs in one tx because of the tx size limit. The update limit may then be exhausted just by providing htlc proofs, and it's hard to estimate update_times_limit in this scenario.

proved_htlcs

A merkle proof may not contain index information, in which case we cannot use a bitmap to store the proved htlcs. Also, the merkle root does not contain length information, so it may be hard to know the length of the bitmap.
When we update the version, we have to reset proved_htlcs. That should be programmed carefully.

finalized status

The finalized status is used to simplify the PCL logic. With the finalized status, the PCL only needs to verify the channel-id, status and balances to settle the assets. Without that status, there will be more logic in the PCL.

In bidirectional closing, we have to verify the witness to settle.
In unidirectional closing, we have to verify both channel data and htlcs.

deposit and withdraw

In the original design, we can provide the deposit and withdraw proofs in the clearing period. This is a bit like clearing htlcs, so remaining_update_times will also limit it.

update_times_limit should only be limited to version update

Either way, we need to estimate a period. That is not essentially different from estimating the update times in the htlcs case. We should write test cases and estimate how many htlcs can be resolved in one transaction. And I'm positive about the result.

proved_htlcs

The merkle proof contains the index, so we can construct the bitmap. The length of the bitmap is dynamic, determined by the biggest index seen (we don't need to know the number of htlcs).
Yep, but the logic is straightforward.

finalized status

The finalized stage can't reduce logic from a global perspective; it only brings more transactions and stages. We can put the logic in PCT to keep PCL simple.

deposit and withdraw

Yeah, it's similar to htlcs, and we can put it in testcases.

asset field

Remove asset field. Because we already have the asset cell, I think it's better to get asset-info from the asset cell.

This may bring a security issue, because creating an asset cell with PCL is not verified (CKB only verifies the lockscripts of inputs). Suppose we lock 100 ckETH in the asset cell and finish the payment. I can then lock 100 CKB to the same PCL and destroy the channel cell while clearing the 100 CKB asset cell. The 100 ckETH cell will then be locked forever.

proved_htlcs

Unproved htlcs (which should be refunded) can be settled here. If the assets cell is exhausted (all unproved htlcs have been settled), the channel cell can be destroyed. Otherwise, the tx outputs should keep the channel cell for the remaining unsettled unproved htlcs.

We cannot use the condition "if the assets cell is exhausted", for the same reason: creating an asset cell with PCL is not verified. We can use proved_htlcs for this instead. When we sign a commitment off-chain, we can include a proved_htlcs field that has one 0 bit per htlc and is padded to whole bytes with leading 1s. E.g. for 9 htlcs, it is initialized as [0b11111110, 0b00000000]. The end condition is then that proved_htlcs turns into all 1s.
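
A minimal sketch of this initialization and end condition. It assumes the bitmap is read as a big-endian integer and that htlc `i` flips bit `i` counted from the least significant end; the comment above only fixes the initial value, so the bit assignment and the function names are assumptions.

```python
def init_proved_htlcs(n_htlcs: int) -> bytes:
    """One 0 bit per htlc, padded to whole bytes with leading 1s."""
    n_bytes = max(1, (n_htlcs + 7) // 8)
    all_ones = (1 << (8 * n_bytes)) - 1
    pending = (1 << n_htlcs) - 1          # low n_htlcs bits start as 0
    return (all_ones ^ pending).to_bytes(n_bytes, "big")


def mark_settled(bitmap: bytes, index: int) -> bytes:
    """Flip the bit of htlc `index` to 1 (index must be below the htlc count)."""
    value = int.from_bytes(bitmap, "big") | (1 << index)
    return value.to_bytes(len(bitmap), "big")


def all_settled(bitmap: bytes) -> bool:
    """End condition: every bit is 1."""
    return all(byte == 0xFF for byte in bitmap)


bitmap = init_proved_htlcs(9)
print([format(byte, "#010b") for byte in bitmap])

for i in range(9):
    bitmap = mark_settled(bitmap, i)
print(all_settled(bitmap))
```

Because the htlc count is baked into the signed initial value, the contract does not need to look at the asset cell to decide when the channel cell may be destroyed.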

update_times_limit

Consider this scenario: the update_times_limit is 10, so you can hire 9 watchtowers and randomly send your latest commitment to them, which prevents any single watchtower from learning your payment pattern. When you are offline, the only legal action a watchtower can take is to submit the latest commitment it has if your counterparty submits an outdated commitment. The remaining update times are left for yourself. In this case, you know exactly how many watchtowers you can hire from the update_times_limit in your channel data.

In the design here, since clearing htlcs and deposit/withdraw actions share the update times, one watchtower may exhaust all of them with legal actions. That may cause you to fail to submit the latest commitment and lose money.

Also, watchtowers may lose tx fees by clearing htlcs for an outdated commitment.

merkle proof

The classic merkle proof only ensures that one hash exists under a merkle root. Though some merkle proof methods may include the index, I wonder whether that will cause problems in our use case.

For example, we have 3 htlcs, and we use the classic merkle proof to build the merkle root.

             merkle-root
          /               \
        /                   \
hash(htlc1, htlc2)     hash(htlc3, htlc3)
    /      \               /         \
htlc1     htlc2         htlc3      htlc3

Since the number of leaves is odd, htlc3 is used twice to calculate the merkle root. That may allow the htlc3 money to be claimed twice here.
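
The ambiguity can be reproduced with a few lines of Python. This sketch uses SHA-256 and the "duplicate the last node" rule from the diagram purely for illustration; it is not the hash or the tree construction the protocol would necessarily use.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_dup_last(leaves):
    """Classic merkle root; an odd level is padded by repeating its last node."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


three = merkle_root_dup_last([b"htlc1", b"htlc2", b"htlc3"])
four = merkle_root_dup_last([b"htlc1", b"htlc2", b"htlc3", b"htlc3"])
print(three == four)
```

Both lists produce the same root, so a proof for htlc3 at the fourth position verifies against a commitment that only contains three htlcs. A proof scheme that binds the leaf index and the leaf count avoids this.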

We may have to specify a merkle proof SPEC or even implementation for this discussion.

Some refs

The general idea about removing some status to reduce the complexity is great.

Here are my thoughts.

Remove asset field. Because we already have the asset cell, I think it's better to get asset-info from the asset cell.

It can not work.

Change challenge_period and clear_period to closing_period, closing_period >= max(challenge_period, clear_period). Change update_time_limit and update_times to remaining_update_times.

That's OK.

The watchtower problem can be solved if we have some rules with watchtowers.

If the watchtower sees the counterparty submit an outdated commitment, it should submit the latest commitment it has, along with all proofs it can get right now (htlc, deposit, withdraw).
It waits long enough to collect all remaining proofs (including the htlc proofs that refund your counterparty, since the channel can only be settled when all htlcs are proved) and submits an update tx.
When the challenge time has passed, it submits the settle tx with all htlc timeout proofs.

In this way, every watchtower will submit at most 2 update tx and 1 settlement tx. Then we can estimate the max watchtower number we can use by the update_times_limit.
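
As a rough illustration of that estimate, the sketch below divides the available update times by the updates each watchtower may consume. The `reserved_for_owner` parameter and the function name are assumptions; whether any update times should be reserved for the channel owner is not settled in this thread.

```python
def max_watchtowers(update_times_limit: int,
                    updates_per_watchtower: int,
                    reserved_for_owner: int = 1) -> int:
    """Upper bound on watchtowers that cannot exhaust the owner's update times."""
    if updates_per_watchtower <= 0:
        raise ValueError("updates_per_watchtower must be positive")
    available = update_times_limit - reserved_for_owner
    return max(0, available // updates_per_watchtower)


# One update per watchtower, one kept for the owner (the earlier scenario).
print(max_watchtowers(10, 1))
# Two update txs per watchtower, as in the rules above.
print(max_watchtowers(10, 2))
```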

There is still one problem: if there are so many htlcs that we cannot submit all the proofs in one transaction, we may fail to settle the channel.

We can allow the PCT to accept updates, not only settlement, even after the closing period. That solves the extreme case, at the price of a longer closing time.
We can test the max number of htlcs we can put in one tx, and set a proper htlc limit in the off-chain module.

Channel_cell_data will not record htlcs-info, but record htlcs_merkle_root and proved_htlcs(in case of duplicated proved_htlc) instead. proved_htlcs records which htlcs have been proved by proof_cell, and the data type is bitmap(e.g. we have 8 htlcs, and htlcs[0, 2] have been proved, then the proved_htlcs is one byte 0x05). When submitting off-chain commitment, participant should include htlcs-prove-data(header_deps, cell_deps), and all-proved-htlcs will be settled directly(pay attention to tx_outputs/channel_cell_data/balances).

Consider the proved_htlcs design in previous comment.

Additional work

Redesign the witness structure to express the mix of channel data and htlc proofs.
Specify the different cases and associated actions
- update commitment version
  - reset proved_htlcs
  - fields like includes_deposit and includes_withdraw should be reset to the new commitment version.
- only update htlcs
- settle
- update htlcs after the closing period
- etc
Specify the merkle proof spec we may use, and confirm that it can prove both index and existence.

asset field

Indeed, we have to keep this field.

proved_htlcs

E.g. for 9 htlcs, it will be initiated as [0b11111110, 0b00000000]

Ok, or we can just use htlcs_length: uint16 = 9 in the off-chain commitment, and once the last 9 bits in the bitmap (on-chain cell_data) have changed to 1, we consider all htlcs settled.
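
A sketch of that check, assuming htlc `i` is bit `i % 8` of byte `i // 8` (so "the last 9 bits" are the 9 lowest bits when the bitmap is read as a little-endian integer). The bit order and the function name are assumptions.

```python
def all_htlcs_settled(proved_htlcs: bytes, htlcs_length: int) -> bool:
    """True when the lowest `htlcs_length` bits of the bitmap are all 1."""
    mask = (1 << htlcs_length) - 1
    value = int.from_bytes(proved_htlcs, "little")
    return (value & mask) == mask


print(all_htlcs_settled(bytes([0xFF, 0x01]), 9))  # htlcs 0..8 proved
print(all_htlcs_settled(bytes([0xFF, 0x00]), 9))  # htlc 8 still open
```

Compared with the leading-1s variant, the on-chain bitmap starts as all zeros and the length travels in the signed commitment instead.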

update_times_limit

I did not take this situation (watchtowers) into consideration. In my opinion, it's better not to delegate htlc-related commitments to a watchtower. When the end user starts an htlc payment, he should stay online until the payment is delivered. Because the payment time is short, delegating the commitment to a watchtower for such a short time is not necessary. Even if it is necessary (e.g. for security), we can limit the max number of concurrent htlcs for the end user, and I think 10 is enough (for an end user).

merkle proof

You are right. And merkle-cbt seems good to us for now.

Additional work

I can update the design to v0.5

nervosnetwork / fiber-archive