near / rainbow-bridge-sol


Explore not storing approvals in the client #14

Open MaksymZavershynskyi opened 4 years ago

MaksymZavershynskyi commented 4 years ago

Currently, every time a block is submitted we store its approvals_after_next, which is a lengthy and expensive list. We only need this list when we challenge the block. The reason we store it is so that the challenge method call uses little gas, because the approval being challenged is readily available.

However, this increases the cost of submitting the header to the client. We should explore the approach that we had in the early days of the bridge:
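Based on the rest of this thread, that early approach amounts to: the relayer still passes the full approvals_after_next list on submission, the contract uses it for hash computation, but only a digest of it is written to storage. A minimal Solidity sketch of that shape (the contract name, struct, and function signature are illustrative, not the actual NearBridge code):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract ApprovalsByHashSketch {
    // Hypothetical per-block record: only a digest of the approvals is kept.
    struct BlockRecord {
        bytes32 headerHash;
        bytes32 approvalsHash; // keccak256 of the serialized approvals_after_next
        uint64 height;
    }

    mapping(uint64 => BlockRecord) public blocks;

    // The relayer passes the full approvals blob; it is hashed but never stored.
    function submitBlock(
        uint64 height,
        bytes32 headerHash,
        bytes calldata serializedApprovals
    ) external {
        blocks[height] = BlockRecord({
            headerHash: headerHash,
            approvalsHash: keccak256(serializedApprovals),
            height: height
        });
    }
}
```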

abacabadabacaba commented 4 years ago

Do you mean that the entire list of signatures should still be passed to the contract, just not stored? BTW, we can do the same with the list of block producers.

MaksymZavershynskyi commented 4 years ago

> Do you mean that the entire list of signatures should still be passed to the contract, just not stored?

Yes. Upon header submission it is passed and used for hash computation (both light client header hash and the hash of the binary blob).

> BTW, we can do the same with the list of block producers.

Depends on whether it will actually require less gas. This will definitely make challenges more expensive.

abacabadabacaba commented 4 years ago

> Depends on whether it will actually require less gas. This will definitely make challenges more expensive.

It should. Currently, I suspect that most gas during normal operation is used for storage writes, and most of those are for signatures and validators. It may even reduce gas usage for challenges, because passing data through calldata is cheaper than reading it from storage.
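To make the gas intuition concrete: writing a fresh 32-byte storage slot costs on the order of 20,000 gas, while calldata costs roughly 16 gas per non-zero byte (4 per zero byte) after EIP-2028, so re-supplying the approvals at challenge time is comparatively cheap. Continuing the hypothetical sketch above, a challenge could look like this:

```solidity
// Continues the ApprovalsByHashSketch contract above (still illustrative).
// The challenger re-supplies the approvals blob via calldata; the contract
// only checks that it matches the stored digest before inspecting one entry.
function challengeApproval(
    uint64 height,
    bytes calldata serializedApprovals,
    uint256 approvalIndex
) external {
    require(
        keccak256(serializedApprovals) == blocks[height].approvalsHash,
        "approvals do not match stored digest"
    );
    // ...parse approval `approvalIndex` out of `serializedApprovals`,
    // verify its signature, and slash the submitter if it is invalid...
}
```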

abacabadabacaba commented 4 years ago

In fact, we can defer pretty much all validation until a challenge comes. Just store the merkle root that we are going to serve to the users, and the hash of the data. Only when a challenge comes, parse the data and optionally check one of the signatures.

MaksymZavershynskyi commented 4 years ago

What if someone submits a header that is completely mangled? It is not stored in the contract, except for its hash, so challengers would need to solve a pre-image problem to be able to challenge it.

Now that I think about it, this actually means the only thing we can afford not to store in the contract is the approvals, because if anything else is missing from the contract state then challenging it becomes a pre-image problem.

abacabadabacaba commented 4 years ago

I investigated the possibilities for getting the data that was passed to smart contracts. I found that while public API providers like Infura don't provide the necessary information, it can be easily obtained from a locally running geth node. The private debug API seems to provide the information that we need, so there is no need to patch geth. Running a local node can also provide better reliability.

MaksymZavershynskyi commented 4 years ago

Then, together with the submitted header hash, we should store the hash of the transaction that was used to submit this block, to make the data easy to recover. FYI, Infura can be quite unreliable: https://github.com/near/rainbow-bridge-cli/issues/330

abacabadabacaba commented 4 years ago

AFAIK a contract cannot get the hash of the current transaction. With a local node, recovering the necessary data shouldn't be difficult anyway, and we want to use a local node even without this optimization for the sake of reliability. So I don't see a big problem here.

MaksymZavershynskyi commented 4 years ago

@abacabadabacaba Do I understand correctly that your current suggestion is to allow relayers to submit entire headers, then EthOnNearClient would deserialize them, extract the header merkle root, and store the merkle root together with the header height? Or do you suggest not performing header deserialization in addLightClientBlock at all?

abacabadabacaba commented 4 years ago

I suggest not deserializing it at all. The merkle root can be submitted as a separate argument. If it doesn't correspond to the serialized block, this can be challenged.
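A sketch of what addLightClientBlock could look like under this plan, with no parsing on submission (hypothetical names and fields; the real contract stores more state than this):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Illustrative only; the real addLightClientBlock signature differs.
contract DeferredValidationSketch {
    struct PendingBlock {
        bytes32 dataHash;   // keccak256 of the serialized light client block
        bytes32 merkleRoot; // root served to users, unchecked until challenged
        uint256 submittedAt;
    }

    PendingBlock public head;

    function addLightClientBlock(bytes calldata data, bytes32 merkleRoot) external {
        // No deserialization here: if merkleRoot does not match `data`,
        // a challenger can later re-submit `data` and prove the mismatch.
        head = PendingBlock({
            dataHash: keccak256(data),
            merkleRoot: merkleRoot,
            submittedAt: block.timestamp
        });
    }
}
```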

MaksymZavershynskyi commented 4 years ago

So you are suggesting that addLightClientBlock accept a serialized header and a merkle root. The contract would then compute keccak256 of the serialized header and store that instead. A challenger would then present the serialized header (and a signature id if they want to challenge a specific signature), and the challenge code would have to deserialize it and check the signature. However, our deserialization code needs to be written with additional attack angles in mind. For instance, if we are deserializing a sequence from Borsh, the first 4 bytes indicate the length of the sequence, and someone could pass a very large number there, causing the contract to attempt to allocate a very large amount of memory.
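One way to defuse the length-prefix problem is to check the claimed sequence length against the bytes actually remaining in the blob before allocating or looping. A hedged sketch (a hypothetical helper, not the bridge's actual Borsh library):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper, not the bridge's actual Borsh code.
library SafeBorsh {
    // Reads a Borsh u32 sequence length (little-endian) at `offset` and
    // rejects values that claim more `elemSize`-byte elements than the
    // remaining calldata could possibly hold, so an attacker cannot force
    // a huge allocation or loop.
    function readSeqLength(
        bytes calldata data,
        uint256 offset,
        uint256 elemSize
    ) internal pure returns (bool ok, uint32 len) {
        if (offset + 4 > data.length) {
            return (false, 0);
        }
        len = uint32(uint8(data[offset]))
            | (uint32(uint8(data[offset + 1])) << 8)
            | (uint32(uint8(data[offset + 2])) << 16)
            | (uint32(uint8(data[offset + 3])) << 24);
        if (uint256(len) * elemSize > data.length - offset - 4) {
            return (false, 0);
        }
        return (true, len);
    }
}
```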

abacabadabacaba commented 4 years ago

Yes, that's the plan. The current deserialization code isn't suitable for this because it reverts on invalid data. The code will also need to be designed to protect against inputs that could cause it to use excessive gas (such as inflated length values, as you pointed out). In addition, a limit on the overall length of the serialized data should be enforced in addLightClientBlock, because otherwise an attacker could force the watchdog to submit an overly long transaction that may be difficult to get included in a block.
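A small sketch of those two guards, continuing the DeferredValidationSketch above (MAX_BLOCK_SIZE is a placeholder value, not taken from the real contract):

```solidity
// Placeholder cap on the serialized block; a real value would need tuning.
uint256 constant MAX_BLOCK_SIZE = 64 * 1024;

// Revised addLightClientBlock from the sketch above, now with a size cap:
// bound the blob so a later challenge transaction that must re-submit
// `data` is never too large to fit comfortably in a block.
function addLightClientBlock(bytes calldata data, bytes32 merkleRoot) external {
    require(data.length <= MAX_BLOCK_SIZE, "serialized block too large");
    head = PendingBlock({
        dataHash: keccak256(data),
        merkleRoot: merkleRoot,
        submittedAt: block.timestamp
    });
}

// Challenge-side validation returns a flag instead of reverting, so a
// mangled block can still be proven invalid inside the challenge call.
function validateBlock(bytes calldata data) internal pure returns (bool ok) {
    // ...walk the Borsh fields with bounds checks (see the SafeBorsh sketch)
    // and return false on any inconsistency; stubbed out here...
    return data.length != 0;
}
```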

MaksymZavershynskyi commented 4 years ago

Discussed with @abacabadabacaba. It is not clear how much time it will take to implement this issue. It is also not clear how to split it into more time-boxed work items, so leaving it without an estimate for now.