essential-contributions / pint

Pint, the constraint-based programming language for declarative blockchains
Apache License 2.0

How to express ERC20 token balances in Yurt #23

Closed mohammadfawaz closed 1 year ago

mohammadfawaz commented 1 year ago

Similarly to the questions we asked in #22, how should ERC20 token balances be expressed in Yurt? Also, how do we generalize this to arbitrary contract calls to view methods (i.e. methods that only inspect the state without modifying it)?

Focusing on the ERC20 token balance use case, there are two main questions:

  1. How do we express the token balance in Yurt?
  2. How do we express the token balance in JSON?

For 1, we can bake in the concept of a token balance in the language, but that's not general enough. Alternatively, we can use a construct similar to Rust's FFI where the contract method balanceOf is defined in an extern block, but we still need to link that to a contract instance (or a contract ID/address).

We can also define balance_of as a method implemented for some struct ERC20 (or maybe in a trait somehow). However, we don't want to provide an implementation for it. Instead, we want to "link" it to the right deployed contract method. We can do that, for example, using an annotation that contains the JSON ABI representation of the balanceOf method.

For 2, the current JSON spec has the following:

    {
      "name": "dai_balance",
      "type": "float",
      "stateAccessed": {
        "account": "0x6B175474E89094C44Da98b954EedeAC495271d0F",
        "value": "call balanceOf(0x1111111111111111111111111111111111111111)"
      }
    },

which is fine but likely still needs a chain ID somewhere. We may also want to look closer into whether the "value" field can be improved.
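As a rough illustration of what a more structured entry could look like, here is a small Python sketch (Python only because Yurt is not runnable here). It assumes a hypothetical "chainId" field next to "account" and splits the "value" string into a method name and arguments. The field name and the helpers parse_value and structured_access are made up for this example and are not part of the current spec.

```python
import json
import re

# Same shape as the spec entry above, plus an ASSUMED "chainId" field.
ENTRY = json.loads("""
{
  "name": "dai_balance",
  "type": "float",
  "stateAccessed": {
    "chainId": 1,
    "account": "0x6B175474E89094C44Da98b954EedeAC495271d0F",
    "value": "call balanceOf(0x1111111111111111111111111111111111111111)"
  }
}
""")

# Matches strings of the form "call name(arg, arg, ...)".
CALL_RE = re.compile(r"^call\s+(\w+)\((.*)\)$")


def parse_value(value: str) -> dict:
    """Split a 'call name(args)' string into a method name and an argument list."""
    match = CALL_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported value: {value!r}")
    method, raw_args = match.groups()
    args = [arg.strip() for arg in raw_args.split(",") if arg.strip()]
    return {"method": method, "args": args}


def structured_access(entry: dict) -> dict:
    """Flatten one spec entry into chain, contract, method and arguments."""
    access = entry["stateAccessed"]
    call = parse_value(access["value"])
    return {
        "chainId": access["chainId"],
        "account": access["account"],
        "method": call["method"],
        "args": call["args"],
    }


print(structured_access(ENTRY))
```

Keeping the method and its arguments as separate fields would let tooling inspect a call without re-parsing a free-form string.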

Recall that there are two main goals for this:

  1. Make the user code readable so that it's clear what a given decision variable means semantically.
  2. Provide hints to solvers to help them solve the intent. If a solver knows that a given decision variable is actually a balance, then it can use that variable as an input to a mathematical model of a DEX.

sezna commented 1 year ago

I think a state access list is an ergonomic way of importing chain data. It has two main benefits:

  1. unambiguous references to on-chain data, abstracted into program variables.
  2. batch-ability; infrastructure can determine the minimal state required to execute batches of intents without inspecting entire programs.
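To make the batching point concrete, here is a minimal Python sketch. It assumes each intent ships an access list of (chain, account, key) references; StateRef, minimal_state and the second account address are hypothetical names and values chosen for illustration only.

```python
from typing import NamedTuple


class StateRef(NamedTuple):
    """One unambiguous reference to on-chain data (hypothetical shape)."""
    chain_id: int
    account: str
    key: str  # a storage index or a description of a view call


def minimal_state(batch: dict[str, list[StateRef]]) -> set[StateRef]:
    """Union of all access lists: the only state needed to evaluate the batch."""
    needed: set[StateRef] = set()
    for access_list in batch.values():
        needed.update(access_list)
    return needed


DAI = "0x6B175474E89094C44Da98b954EedeAC495271d0F"
ALICE = "0x1111111111111111111111111111111111111111"
BOB = "0x2222222222222222222222222222222222222222"  # placeholder address

batch = {
    "intent_a": [StateRef(1, DAI, f"balanceOf({ALICE})")],
    "intent_b": [
        StateRef(1, DAI, f"balanceOf({ALICE})"),  # shared with intent_a
        StateRef(1, DAI, f"balanceOf({BOB})"),
    ],
}

required = minimal_state(batch)
print(len(required))
```

The infrastructure only reads the access lists here; the bodies of the intents are never inspected.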

In ERC20 balances, this would, as a non-decision variable, simply be a reference to a state index or proxy-read method call on chain.

mohammadfawaz commented 1 year ago

In ERC20 balances, this would, as a non-decision variable, simply be a reference to a state index or proxy-read method call on chain.

Yeah that makes sense. I think there are at least two challenges:

  1. What does that actually look like in the language? Are we going to implement a full mechanism for calling smart contract methods (at least ones that are view)? If so, how do we make this mechanism general enough to cover arbitrary chains?
  2. You say that the balance is a non-decision variable, which is fine for reading the current state. However, we still need a way to encode the future balance, which has to be a decision variable because we don't know what it is going to be (we just have constraints over it). This is where the ideas discussed in #25 come into play. Basically, we need to propagate additional information about the future state: a) the actual keys and b) what the state represents (i.e. this is an ERC20 balance).

mohammadfawaz commented 1 year ago

Thinking about:

// This would go in a library `erc20.yrt`
use libsolidity::{address, uint256};

abi IERC20 {
    fn balanceOf(account: address) -> uint256;
}

// This would go in a library `eth_dai.yrt`
use libsolidity::{address, uint256};
use erc20::IERC20;

// `caller` is a native Yurt type
let eth_dai_caller: caller = IERC20(1 /* chain ID */, 0x6B175474E89094C44Da98b954EedeAC495271d0F /* Contract ID */);

// This would go in the user code
use libsolidity::{address, uint256};
use eth_dai::eth_dai_caller;

let erc20_balance = eth_dai_caller.balanceOf(0x1111111111111111111111111111111111111111 as address) as int;

Inspired by Sway and Solidity. Note

otrho commented 1 year ago

Do we really need the as casts? The let must be typed anyway, and the address type is implied by the ABI. If we don't want to implicitly cast integer literals to addresses, they could be declared separately, but an implicit from() from a giant integer literal to address isn't hard.

let my_addr: address = 0x1111111111111111111111111111111111111111;
let erc20_balance: int = eth_dai_caller.balanceOf(my_addr);

I'm not 100% into replicating the whole ABI thing in Yurt though. Especially making self calls on ABI objects or structs.

If we were to only care about certain contracts on Ethereum, like ERC-20 or DAI, then there's no need to declare the ABI in Yurt, as it could all be abstracted away. A bit like the JSON.

If we want to generalise to allow Yurt to call any contract on Ethereum then OK, a declaration of the contract interface needs to be made in terms of FFI to Solidity.

But then when we support other chains the FFI will be to a different platform and language where using an 'ABI' might not fit. I guess I'm thinking we don't want to bring too much foreign language design stuff into Yurt itself.

mohammadfawaz commented 1 year ago

If we want to generalise to allow Yurt to call any contract on Ethereum then OK, a declaration of the contract interface needs to be made in terms of FFI to Solidity.

This is the challenge. I tried to go down the FFI route, but I couldn't figure out how to specify chain and contract IDs. One interface (such as the ERC20 interface) could be "instantiated" (or constructed) with different chain and contract IDs, hence the need for some type of "caller".

For example:

use libsolidity::{address, uint256};

extern {
    fn balanceOf(account: address) -> uint256;
}

How do we determine which balanceOf function to call? We need a way to specify both a chain ID and a contract ID.

Really, the requirement here is to uniquely identify a given method that is deployed on some blockchain. This can be done by specifying three things:

  1. Chain ID: point to some blockchain.
  2. Contract ID: point to some contract. No restriction here on what this looks like. Different chains may have different contract ID types.
  3. ABI method encoding (and hence the selector) which typically requires the name of the method and the types of the arguments.

It's important to note that only a subset of methods will be relevant. These are methods that only read state (i.e. view methods in Solidity). We certainly don't want to be able to call methods with side effects.
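The three-part identification above could be modelled roughly as follows. This is a hedged Python sketch: MethodRef and its fields are hypothetical names, and it stops at the canonical signature string. Deriving the real 4-byte selector needs Keccak-256, which Python's standard hashlib does not provide (its sha3_256 uses different padding), so that step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodRef:
    """Hypothetical handle that uniquely identifies one deployed view method."""
    chain_id: int        # 1. which blockchain
    contract_id: str     # 2. which contract (the format is chain specific)
    name: str            # 3. method name ...
    arg_types: tuple     #    ... and argument types, which determine the selector
    is_view: bool = True

    def __post_init__(self) -> None:
        # Only state-reading methods may be referenced.
        if not self.is_view:
            raise ValueError(f"{self.name} has side effects and cannot be referenced")

    def signature(self) -> str:
        """Canonical signature string that a selector would be derived from."""
        return f"{self.name}({','.join(self.arg_types)})"


dai_balance_of = MethodRef(
    chain_id=1,
    contract_id="0x6B175474E89094C44Da98b954EedeAC495271d0F",
    name="balanceOf",
    arg_types=("address",),
)

print(dai_balance_of.signature())
```

Two references with the same signature but a different chain ID or contract ID compare as different, which is the property the bare extern block lacks.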

otrho commented 1 year ago

I see. This is why most FFIs in general purpose languages all use C because it can represent most wacky constructs from more powerful languages in a closer-to-the-machine way.

What if instead of going the FFI route, where Solidity functions are translated directly into Yurt functions, we have a general Ethereum/Solidity call mechanism where everything is an argument?

call_ethereum_contract(contract_id, "balanceOf", account, result);

The solver or whoever would have to know what to do with it, which isn't that different to the FFI option.

The above would be a little clumsy in that I think we'd have to have an enum which covers all Solidity types or something, and we'd have to add variadic function arguments.

fn call_ethereum_contract(contract_id: uint256, fn_name: String, ... var_args: SolidityType) -> CallResult

SolidityType isn't great, and getting the result back as a ref output arg is new. But if we need to pass gas (haha) or whatever else they just become args to this generic call-any-Ethereum-contract function.

Yeah, this feels a little clumsy but keeping the contract calls as general as possible is worthwhile IMO.
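For a feel of the "everything is an argument" shape, here is a small Python sketch. It is only an analogy: Address and Uint256 stand in for a SolidityType enum, the result is returned rather than written to an output argument, and fake_backend with its balance table is invented test data, not a real chain reader.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Address:
    value: str


@dataclass(frozen=True)
class Uint256:
    value: int


# Stand-in for an enum covering all Solidity types (only two variants here).
SolidityValue = Union[Address, Uint256]

# A backend is whatever actually reads chain state (hypothetical signature).
Backend = Callable[[str, str, tuple], SolidityValue]


def call_ethereum_contract(
    backend: Backend, contract_id: str, fn_name: str, *args: SolidityValue
) -> SolidityValue:
    """Generic call: contract, method name and arguments are all plain values."""
    for arg in args:
        if not isinstance(arg, (Address, Uint256)):
            raise TypeError(f"not a Solidity value: {arg!r}")
    return backend(contract_id, fn_name, args)


def fake_backend(contract_id: str, fn_name: str, args: tuple) -> SolidityValue:
    """Invented in-memory 'chain' used only to exercise the call path."""
    balances = {"0x1111111111111111111111111111111111111111": 42}
    if fn_name != "balanceOf":
        raise NotImplementedError(fn_name)
    (account,) = args
    return Uint256(balances.get(account.value, 0))


result = call_ethereum_contract(
    fake_backend,
    "0x6B175474E89094C44Da98b954EedeAC495271d0F",
    "balanceOf",
    Address("0x1111111111111111111111111111111111111111"),
)
print(result)
```

Note that the method name is just a string, so a typo in it would only surface when the backend is asked to resolve the call.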

mohammadfawaz commented 1 year ago

Thinking about more alternatives:

use libsolidity::{address, uint256};

contract MyContract {
    chain_id: 1,
    contract_id: 0x6B175474E89094C44Da98b954EedeAC495271d0F,

    fn balanceOf(account: address) -> uint256;
}

let my_balance = MyContract::balanceOf("0x1111111111111111111111111111111111111111");

To me, this seems general enough.

One can basically create a contract with a given chain ID and contract ID, and define what interface it has, all in the same contract block. Calling the contract methods would be similar to Rust.

Note that all of the above serves as a way to read chain state. This seems to be useful on its own to inspect things like liquidity reserves, balances, oracle outputs, etc. So, in the above, the variable my_balance is a decision variable that is initialized but not actually known at compile-time. The solver still has to make those contract calls somehow.

The other important question here is how to represent the state transition function (see #25) and make it possible for solvers to actually reason about it given the context of the intent (e.g. swap intents).

otrho commented 1 year ago

Yep, I like this more. Introducing a special scope with the chain and contract context, along with the API is nice and general. We may want to expand the context, as other chains might need more fields along with chain_id and contract_id, and the above would work for that.

I like that there's no need to instantiate a callable contract object and essentially call methods on it.

mohammadfawaz commented 1 year ago

So one drawback of

contract MyContract {
    contract_id: 0x6B175474E89094C44Da98b954EedeAC495271d0F,

    fn balanceOf(account: address) -> uint256;
}

is that it does not allow re-using interfaces. For example, if I would like to interact with 10 ERC20 tokens, I will need to stamp out 10 copies of the above with different contract IDs.

I do like having the contract methods explicit in an interface like above, as it makes it clearer to the user what is being called and what types to expect, but the drawback above is not great. I wonder if we can do better somehow:

interface ERC20 {    
    fn balanceOf(account: address) -> uint256;
}

contract DAI(0x6B175474E89094C44Da98b954EedeAC495271d0F) : ERC20 {
    // Inherit `balanceOf` but also allow other methods to be added here.
}

let my_balance = DAI::balanceOf("0x1111111111111111111111111111111111111111");

Basically, interfaces do not require an ID and cannot be called directly. Contracts, on the other hand, may "inherit" from interfaces or define their own methods, but they require an ID (different syntax than before for simplicity) and can be called.
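The split between ID-less interfaces and callable contracts can be mimicked in a few lines of Python. This is a sketch of the idea only: Interface, Contract, describe_call and the second token (OTHER, with a placeholder address) are hypothetical and say nothing about how Yurt would implement it.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    """A set of method declarations with no ID, so it cannot be called."""
    name: str
    methods: dict[str, str]  # method name -> declaration


@dataclass
class Contract:
    """An ID plus inherited interfaces and optional extra methods."""
    name: str
    contract_id: str
    interfaces: list[Interface]
    own_methods: dict[str, str] = field(default_factory=dict)

    def methods(self) -> dict[str, str]:
        merged: dict[str, str] = {}
        for iface in self.interfaces:
            merged.update(iface.methods)
        merged.update(self.own_methods)
        return merged

    def describe_call(self, method: str, *args: str) -> str:
        """Render the call a solver would have to perform (no chain access here)."""
        if method not in self.methods():
            raise AttributeError(f"{self.name} has no method {method}")
        return f"{self.contract_id}.{method}({', '.join(args)})"


ERC20 = Interface("ERC20", {"balanceOf": "fn balanceOf(account: address) -> uint256"})

# One interface, many contracts: only the ID changes.
DAI = Contract("DAI", "0x6B175474E89094C44Da98b954EedeAC495271d0F", [ERC20])
OTHER = Contract("OTHER", "0x2222222222222222222222222222222222222222", [ERC20])

print(DAI.describe_call("balanceOf", "0x1111111111111111111111111111111111111111"))
```

Interacting with ten tokens then means ten short Contract lines that share one ERC20 declaration, rather than ten copies of the method list.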

otrho commented 1 year ago

IERC20 is a Solidity interface so this makes sense to me. I'm not a fan of the colon for some reason. I'd rather see using or impl or something more literal.