GridPlus / gridplus-sdk

SDK for communicating with the GridPlus Lattice1 hardware wallet
MIT License
50 stars 24 forks source link

Add check on full ABI in `replaceNestedDefs` rather than always requesting 4byte #476

Open alex-miller-0 opened 2 years ago

alex-miller-0 commented 2 years ago

Background

In getNestedCalldata, we assume any EVM param of type bytes or bytes[] may contain nested call(s). This function essentially just loops through every param and, if it is bytes or bytes[] type, copies the data into a new array. For all other types, the item in the new array is skipped (assigned null). So the response is an array containing potential calldata associated with each param of the main function.

After getting this array of potential nested calldata, we pass it to replaceNestedDefs. The job of this function is to look at each piece of nested calldata and attempt to look up the function definition from 4byte. The response of this function is another array mapping the nested calldata params to nested definitions, or null if no def is found (or there is no possible nested calldata).

Note that in replaceNestedDefs we must make N calls to 4byte, even though the nested defs may be contained in the original ABI if it was fetched from Etherscan. This means that regardless of what is happening in the multicall pattern, we can only display param names of [#1], [#2], etc on the Lattice. This is because while Etherscan provides param names, 4byte does not.

This approach may seem less than ideal but it was done to avoid being over-prescriptive on a multicall pattern. Consider the following made up example:

multicall_out(address[] outContracts, bytes[] calldata)

In this example, the multicall_out could be doing a sanity check to ensure outContracts.length == calldata.length and then proceeding to call out to external contracts (outContracts) with the respective calldata items. In order to display param names for the nested defs, we would need to know in advance that each outContracts item must map to calldata, i.e. we would need to know how multicall_out works, which would require analysis of the source code. This is far far out of scope for our ABI decoding feature and would add a ton of complexity.

In the above example, there would be no useful information in the ABI from which multicall_out came because the function is calling to external contracts, and the ABI from which we got multicall_out would only include definitions for functions in that specific contract. However, consider Uniswap's multicall pattern (see this tx):

multicall(bytes[] data)

If you look up the contract source code on Etherscan, you can see that each of the calls made using this pattern is internal to the same contract. In this case, we could consult the ABI from which we found multicall to then lookup each of the defs in the data array. This would result in displayable param names for each nested call. Also, it only requires a single request to Etherscan so we don't need to worry about rate limits.

Proposed solution

The above examples show two different potential patterns. Currently, these cases are treated the same so, we do not cover the latter example as best we could.

As a guiding principle, we should strive to display param names whenever possible in nested calldata, meaning we should be able to display param names in the second case (Uniswap).

One way to accomplish this is by modifying replaceNestedDefs to take a second param, which is the full ABI from which the original def came. It would then do the following:

  1. For each item in the possible nested calldata, grab nested selector in _nestedSelector (same as current function)
  2. Given the possible selector, loop through all functions returned in the full contract ABI
  3. If a given selector is found in the ABI, add that def to nestedDefs
  4. If a given selector is not found in the ABI, look up the selector in 4byte and convert it to a def (same as current function)

NOTE: The above logic should be the same regardless of bytes or bytes[] type (i.e. array or not).

This solution should be better than what we currently have because it would provide more decoder information (param names) if it can, but otherwise falls back to current behavior.

alex-miller-0 commented 2 years ago

Here is an example of a tx to a proxy contract that does nested calls out to an external contract: https://etherscan.io/tx/0x2af244a02066c0a8e3998d247e071a03cd38ecbec60f93ddf63da0dce3932f86

If you look at the source code, this proxy contract routes all calls through masterCopy, which is set at init time (here is the init tx).

In this case, masterCopy maps to this contract. You can look through the source code and you will find that execTransaction has the following def:

    function execTransaction(
        address to,
        uint256 value,
        bytes calldata data,
        Enum.Operation operation,
        uint256 safeTxGas,
        uint256 baseGas,
        uint256 gasPrice,
        address gasToken,
        address payable refundReceiver,
        bytes calldata signatures
    )

It would be nice to display all of these param names, but there is no generalizable way to get to this definition because we don't know the address of this nested contract (defined as masterCopy). The only way to get it would be to know in advance that we need to do a lookup of masterCopy, i.e. inspect the init tx.

You can imagine many similar but different patterns, even as simple as naming masterCopy something more like proxyAddress, or as complicated as having multiple types of execTransaction that go to different proxy addresses. Anyway there is too much possible bespoke logic for us to handle, and this is why we must simply fetch the function definition from 4byte if we don't already have it in our original ABI.