AppLayerLabs / bdk-cpp

MIT License

EVM Integration #102

Closed itamarcps closed 4 months ago

itamarcps commented 4 months ago

EVM Integration

This pull request integrates OrbiterSDK with the EVMOne (C++ EVM) VM, marking a significant enhancement in our platform's capability. It allows the execution of Solidity smart contracts alongside native C++ code within the OrbiterSDK environment, enabling more versatile and powerful software solutions.

This pull request is substantial as it introduces extensive modifications across the codebase. One of the key changes involves the ContractManager class, which was previously responsible for managing contracts and their execution. We have now restructured this responsibility, with the State now owning the contracts. The ContractHost, along with the ContractStack, takes charge of managing contract executions, while the ContractManager's role has shifted primarily to contract creation.

Changelog

A concise overview of the key changes implemented in this pull request includes:

Concepts and Terminology

Before delving into the technical details, it's essential to understand some key concepts and terminologies that underpin the integration of EVM within the OrbiterSDK environment.

What is the State?

The term "State" refers to the continuously updated record of all the data across the entire network at any given point in time. This includes information about all accounts and their balances, contract code, and contract storage. A list of what constitutes the state in OrbiterSDK includes:

Basically, the entire State::accounts_ map in OrbiterSDK represents the state of the network.

Understanding External Calls vs. Internal Calls and chain of execution

In the context of interacting with smart contracts, calls can be classified into two types: external calls and internal calls.

In short, an external call creates a new chain of execution, while an internal call exists only within an existing chain of execution and cannot create a new one.

A chain of execution is the sequence of calls executed as part of a given external call. It always consists of an initial external call and a set of internal calls.

EVM/ContractHost Implementation Details

This section provides detailed insights into the implementation of the ContractHost within the OrbiterSDK integrated with the EVMOne VM.

State Management and VM Instance Creation

The VM is owned and instantiated by the State class, reflecting a crucial design decision to centralize the management of virtual machine resources. This centralization ensures that each execution context is cleanly managed and isolated. Whenever a new transaction or contract call needs to be executed, regardless of its nature (be it an EVM/C++ contract execution or a simple native transfer), the State class is responsible for instantiating a new ContractHost object with the relevant parameters required for execution:

    ContractHost(evmc_vm* vm,
                 EventManager& eventManager,
                 const Storage& storage,
                 const evmc_tx_context& currentTxContext,
                 std::unordered_map<Address, NonNullUniquePtr<Account>, SafeHash>& accounts,
                 std::unordered_map<StorageKey, Hash, SafeHash>& vmStorage,
                 const Hash& txHash,
                 const uint64_t txIndex,
                 const Hash& blockHash,
                 int64_t& txGasLimit)

Once an instance of ContractHost is created, it offers methods like execute() to run the contract, simulate() for simulating the transaction (useful for gas estimation), and ethCallView() for making calls to other contracts within a non-state-changing context.

Overridden Functions from evmc::Host

ContractHost extends the functionalities of evmc::Host by overriding several key functions that interface directly with the Ethereum Virtual Machine (EVM), which are obligatory for the VM to interact with the state:

    bool account_exists(const evmc::address& addr) const noexcept final;
    evmc::bytes32 get_storage(const evmc::address& addr, const evmc::bytes32& key) const noexcept final;
    evmc_storage_status set_storage(const evmc::address& addr, const evmc::bytes32& key, const evmc::bytes32& value) noexcept final;
    evmc::uint256be get_balance(const evmc::address& addr) const noexcept final;
    size_t get_code_size(const evmc::address& addr) const noexcept final;
    evmc::bytes32 get_code_hash(const evmc::address& addr) const noexcept final;
    size_t copy_code(const evmc::address& addr, size_t code_offset, uint8_t* buffer_data, size_t buffer_size) const noexcept final;
    bool selfdestruct(const evmc::address& addr, const evmc::address& beneficiary) noexcept final;
    evmc::Result call(const evmc_message& msg) noexcept final;
    evmc_tx_context get_tx_context() const noexcept final;
    evmc::bytes32 get_block_hash(int64_t number) const noexcept final;
    void emit_log(const evmc::address& addr, const uint8_t* data, size_t data_size, const evmc::bytes32 topics[], size_t topics_count) noexcept final;
    evmc_access_status access_account(const evmc::address& addr) noexcept final;
    evmc_access_status access_storage(const evmc::address& addr, const evmc::bytes32& key) noexcept final;
    evmc::bytes32 get_transient_storage(const evmc::address &addr, const evmc::bytes32 &key) const noexcept final;
    void set_transient_storage(const evmc::address &addr, const evmc::bytes32 &key, const evmc::bytes32 &value) noexcept final;

These methods manage everything from account validation to logging, providing access to the state and storage, and handling calls between contracts. The ContractHost class encapsulates these functions, ensuring that each contract execution is isolated and secure.
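
To make the storage-related callbacks more concrete, here is a minimal sketch of how a host could back `get_storage`, `set_storage` and `account_exists` with an in-memory map. It is not the ContractHost implementation: `ToyStorageHost`, `Addr`, `Word` and `StoreStatus` are hypothetical stand-ins for the EVMC types, and the "account exists" rule is a deliberate simplification. The one behavior taken from the EVM itself is that a slot that was never written reads as zero.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Stand-ins for evmc::address and evmc::bytes32 (assumption: the real host uses the EVMC types).
using Addr = std::array<std::uint8_t, 20>;
using Word = std::array<std::uint8_t, 32>;

// Hypothetical two-value status; the real evmc_storage_status has more cases.
enum class StoreStatus { Unchanged, Modified };

class ToyStorageHost {
  std::map<std::pair<Addr, Word>, Word> slots_;

public:
  // A slot that was never written reads as the all-zero word.
  Word get_storage(const Addr& addr, const Word& key) const {
    const auto it = slots_.find(std::make_pair(addr, key));
    return it == slots_.end() ? Word{} : it->second;
  }

  // Writes the slot and reports whether the stored value actually changed.
  StoreStatus set_storage(const Addr& addr, const Word& key, const Word& value) {
    if (get_storage(addr, key) == value) return StoreStatus::Unchanged;
    slots_[std::make_pair(addr, key)] = value;
    return StoreStatus::Modified;
  }

  // Simplification: an account "exists" once any of its slots has been written.
  bool account_exists(const Addr& addr) const {
    for (const auto& entry : slots_)
      if (entry.first.first == addr) return true;
    return false;
  }
};
```

Writing the same value twice to a slot reports `Modified` the first time and `Unchanged` the second, which is the kind of information a VM needs from the host to price storage writes.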

CPP To Other Contract Calls

The ContractHost class employs templated functions to support flexible and efficient interaction with contracts. These templates enable passing any combination of arguments and return types (including void) to and from contracts. This use of templates helps to leverage the fast ABI encoding/decoding processes, ensuring optimal performance and flexibility during contract execution:


    template <typename R, typename C, typename... Args> R 
    callContractViewFunction(
        const BaseContract* caller, 
        const Address& targetAddr, 
        R(C::*func)(const Args&...) const,
        const Args&... args) const;

    template <typename R, typename C> R 
    callContractViewFunction(
        const BaseContract* caller, 
        const Address& targetAddr, 
        R(C::*func)() const) const;

    template <typename R, typename C, typename... Args>
    requires (!std::is_same<R, void>::value)
    R callContractFunction(
      BaseContract* caller, const Address& targetAddr,
      const uint256_t& value,
      R(C::*func)(const Args&...), const Args&... args
    )

    template <typename R, typename C, typename... Args>
    requires (std::is_same<R, void>::value)
    void callContractFunction(
      BaseContract* caller, const Address& targetAddr,
      const uint256_t& value,
      R(C::*func)(const Args&...), const Args&... args
    )

    template <typename R, typename C>
    requires (!std::is_same<R, void>::value)
    R callContractFunction(
      BaseContract* caller, const Address& targetAddr,
      const uint256_t& value, R(C::*func)()
    )

This approach allows for dynamic interaction with contracts without pre-defining all possible function signatures, accommodating various contract behaviors and states dynamically.

EVM To Other Contract Calls

For calls from the EVM to another contract, the ContractHost::call() function plays a crucial role. It is tasked with creating and handling calls to other contracts, encapsulating the complexity of contract interaction within a simple interface:

    evmc::Result call(const evmc_message& msg) noexcept final;

Executing Contract Calls via EVMC

To execute a contract call within the EVM environment, the evmc_execute function from the EVMC library is utilized. This function orchestrates the execution of contract bytecode, interfacing directly with the Ethereum Virtual Machine:

static inline struct evmc_result evmc_execute(struct evmc_vm* vm,
                                              const struct evmc_host_interface* host,
                                              struct evmc_host_context* context,
                                              enum evmc_revision rev,
                                              const struct evmc_message* msg,
                                              uint8_t const* code,
                                              size_t code_size);

It's crucial to understand that the VM itself is stateless—it does not maintain any information about the contracts' state or their data. The VM's role is strictly to interpret and execute bytecode according to the Ethereum protocol specifications.

To enable the stateless VM to interact with the state (such as VM storage keys, account balances, or initiating further contract calls), we must provide it with access to the state through the evmc_host_interface and evmc_host_context. The evmc_host_interface contains a set of callback functions that the VM can use to query or modify the state:

struct evmc_host_interface
{
    evmc_account_exists_fn account_exists;
    evmc_get_storage_fn get_storage;
    evmc_set_storage_fn set_storage;
    evmc_get_balance_fn get_balance;
    evmc_get_code_size_fn get_code_size;
    evmc_get_code_hash_fn get_code_hash;
    evmc_copy_code_fn copy_code;
    evmc_selfdestruct_fn selfdestruct;
    evmc_call_fn call;
    evmc_get_tx_context_fn get_tx_context;
    evmc_get_block_hash_fn get_block_hash;
    evmc_emit_log_fn emit_log;
    evmc_access_account_fn access_account;
    evmc_access_storage_fn access_storage;
    evmc_get_transient_storage_fn get_transient_storage;
    evmc_set_transient_storage_fn set_transient_storage;
};

Within OrbiterSDK, we use evmc_execute as follows:

          evmc::Result result (evmc_execute(this->vm_, &this->get_interface(), this->to_context(),
          evmc_revision::EVMC_LATEST_STABLE_REVISION, &msg, recipientAcc.code.data(), recipientAcc.code.size()));

The ContractHost supplies both host arguments: get_interface() returns the evmc_host_interface callback table, and to_context() passes the host object itself as the evmc_host_context. This gives the VM the state access functions it needs to interact with the state and execute contract calls within the OrbiterSDK environment.
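
The "stateless VM plus callback table plus opaque context" arrangement can be illustrated with a much smaller analogue. Everything in this sketch is hypothetical (`HostInterface`, `HostContext`, `MapHost`, `run_increment`) and only mirrors the shape of the EVMC pattern: the "VM" function holds no state of its own and reaches storage exclusively through function pointers, while the host recovers its own object from the opaque context pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct HostContext;  // opaque to the "VM", like evmc_host_context

// Simplified analogue of evmc_host_interface: a table of plain function pointers.
struct HostInterface {
  std::uint64_t (*get_storage)(HostContext*, std::uint64_t key);
  void (*set_storage)(HostContext*, std::uint64_t key, std::uint64_t value);
};

class MapHost {
  std::map<std::uint64_t, std::uint64_t> slots_;

public:
  std::uint64_t load(std::uint64_t key) const {
    const auto it = slots_.find(key);
    return it == slots_.end() ? 0 : it->second;
  }
  void store(std::uint64_t key, std::uint64_t value) { slots_[key] = value; }

  // The host object itself travels through the VM as the opaque context.
  HostContext* to_context() { return reinterpret_cast<HostContext*>(this); }
  static MapHost* from_context(HostContext* ctx) { return reinterpret_cast<MapHost*>(ctx); }

  // One shared table of trampolines that forward to the member functions.
  static const HostInterface& get_interface() {
    static const HostInterface iface = {
      [](HostContext* ctx, std::uint64_t key) { return from_context(ctx)->load(key); },
      [](HostContext* ctx, std::uint64_t key, std::uint64_t value) { from_context(ctx)->store(key, value); },
    };
    return iface;
  }
};

// A stateless "VM" routine: it can only touch state through the callbacks.
std::uint64_t run_increment(const HostInterface* host, HostContext* ctx, std::uint64_t key) {
  const std::uint64_t next = host->get_storage(ctx, key) + 1;
  host->set_storage(ctx, key, next);
  return next;
}
```

Calling `run_increment(&MapHost::get_interface(), host.to_context(), 7)` twice on a fresh `MapHost host` returns 1 and then 2, and all of the state lives in `host` rather than in the routine.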

Managing State Changes: ContractStack in ContractHost

Within the ContractHost, each time a transaction or contract execution alters any state variables—such as creating a new contract, updating a variable, or initiating transfers—it is imperative for the ContractHost to engage the ContractStack. The primary function of the ContractStack is to maintain a record of the original states of these variables. This record-keeping is essential for enabling a complete restoration of the original state in the event of a transaction rollback.

ContractStack Class Overview

The ContractStack class serves a crucial role in safeguarding blockchain integrity by preserving the initial state of variables during modifications that occur throughout contract execution. This capability ensures that any adverse changes can be undone, maintaining the blockchain's consistency and reliability.

Here's an overview of the ContractStack's definition and functionalities:

class ContractStack {
  private:
    std::unordered_map<Address, Bytes, SafeHash> code_;
    std::unordered_map<Address, uint256_t, SafeHash> balance_;
    std::unordered_map<Address, uint64_t, SafeHash> nonce_;
    std::unordered_map<StorageKey, Hash, SafeHash> storage_;
    std::vector<Event> events_;
    std::vector<Address> contracts_; // Contracts that have been created during the execution of the call, we need to revert them if the call reverts.
    std::vector<std::reference_wrapper<SafeBase>> usedVars_;

  public:
    ContractStack() = default;
    ~ContractStack() = default;

    inline void registerCode(const Address& addr, const Bytes& code)  {
      if (!this->code_.contains(addr)) {
        this->code_[addr] = code;
      }
    }

    inline void registerBalance(const Address& addr, const uint256_t& balance) {
      if (!this->balance_.contains(addr)) {
        this->balance_[addr] = balance;
      }
    }

    inline void registerNonce(const Address& addr, const uint64_t& nonce) {
      if (!this->nonce_.contains(addr)) {
        this->nonce_[addr] = nonce;
      }
    }

    inline void registerStorageChange(const StorageKey& key, const Hash& value) {
      if (!this->storage_.contains(key)) {
        this->storage_[key] = value;
      }
    }

    inline void registerEvent(Event&& event) {
      this->events_.emplace_back(std::move(event));
    }

    inline void registerContract(const Address& addr) {
      this->contracts_.push_back(addr);
    }

    inline void registerVariableUse(SafeBase& var) {
      this->usedVars_.emplace_back(var);
    }

    /// Getters
    inline const std::unordered_map<Address, Bytes, SafeHash>& getCode() const { return this->code_; }
    inline const std::unordered_map<Address, uint256_t, SafeHash>& getBalance() const { return this->balance_; }
    inline const std::unordered_map<Address, uint64_t, SafeHash>& getNonce() const { return this->nonce_; }
    inline const std::unordered_map<StorageKey, Hash, SafeHash>& getStorage() const { return this->storage_; }
    inline std::vector<Event>& getEvents() { return this->events_; }
    inline const std::vector<Address>& getContracts() const { return this->contracts_; }
    inline const std::vector<std::reference_wrapper<SafeBase>>& getUsedVars() const { return this->usedVars_; }
};

The existence of only one instance of ContractStack per ContractHost, and its integration within the RAII framework of ContractHost, guarantees that state values are committed or reverted upon the completion or rollback of transactions.

This design prevents state spill-over between different contract executions, fortifying transaction isolation and integrity across the blockchain network.

This robust mechanism ensures that even in the dynamic and mutable landscape of blockchain transactions, the integrity and consistency of state changes are meticulously maintained, safeguarding against unintended consequences and errors during contract execution.
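
The "register the original value once, restore it on rollback" idea can be sketched as follows. This is an illustration under stated assumptions, not the actual ContractHost destructor: `ToyStack`, `ToyHost`, `markFailed()` and the string-keyed balance map are hypothetical, and only balances are tracked. The first-write-wins registration mirrors the `contains` checks in the ContractStack `register*` functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using Balances = std::unordered_map<std::string, std::uint64_t>;

class ToyStack {
  Balances original_;

public:
  // Only the first registration per address is kept, so the stack always
  // holds the value from before the execution started.
  void registerBalance(const std::string& addr, std::uint64_t balance) {
    original_.try_emplace(addr, balance);
  }
  const Balances& getBalance() const { return original_; }
};

class ToyHost {
  Balances& balances_;  // the shared state, owned elsewhere
  ToyStack stack_;      // exactly one stack per host
  bool failed_ = false; // hypothetical failure flag

public:
  explicit ToyHost(Balances& balances) : balances_(balances) {}

  // RAII: on failure, every touched balance is restored to its original value.
  // Simplification: an address that did not exist before is restored to 0, not erased.
  ~ToyHost() {
    if (!failed_) return;
    for (const auto& [addr, balance] : stack_.getBalance()) balances_[addr] = balance;
  }

  void setBalance(const std::string& addr, std::uint64_t value) {
    stack_.registerBalance(addr, balances_[addr]);  // snapshot before the write
    balances_[addr] = value;
  }

  void markFailed() { failed_ = true; }
};
```

If a host changes a balance from 100 to 60 and then to 10 before `markFailed()` is called, the balance is 100 again once the host goes out of scope, because the second write does not overwrite the registered original.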

Facilitating Seamless CPP <-> EVM Integration

Achieving seamless integration between CPP and EVM contracts revolves around the uniformity in the encoding and decoding of arguments. By standardizing these processes, we ensure that calls between different contract types are handled efficiently without the need for separate mechanisms for each.
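
As a rough illustration of the shared encoding, the sketch below builds call data in the standard Solidity ABI layout for static unsigned integer arguments: a 4-byte function selector followed by one 32-byte big-endian word per argument. `buildCalldata` and `appendWord` are hypothetical helpers, `std::uint64_t` stands in for `uint256_t`, and the selector is taken as a precomputed input (in the ABI it is the first four bytes of the Keccak-256 hash of the function signature).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Appends `value` as a 32-byte big-endian word (left-padded with zeros).
void appendWord(Bytes& out, std::uint64_t value) {
  out.insert(out.end(), 24, std::uint8_t{0});
  for (int shift = 56; shift >= 0; shift -= 8)
    out.push_back(static_cast<std::uint8_t>(value >> shift));
}

// Lays out selector + arguments; covers only static unsigned integer arguments.
Bytes buildCalldata(std::uint32_t selector, const std::vector<std::uint64_t>& args) {
  Bytes out;
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<std::uint8_t>(selector >> shift));
  for (const std::uint64_t arg : args) appendWord(out, arg);
  return out;
}
```

For two arguments the result is 4 + 2 * 32 = 68 bytes long. Because both contract types consume the same byte layout, the caller does not need to know which kind of contract sits at the target address.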

Determining Contract Types and Executing Calls

The ContractHost plays a critical role in distinguishing whether a contract is implemented in CPP or EVM and executing it accordingly. Below is an example illustrating how CPP contracts can invoke functions in other contracts, whether they are CPP or EVM:

    template <typename R, typename C, typename... Args>
    requires (!std::is_same<R, void>::value)
    R callContractFunction(
      BaseContract* caller, const Address& targetAddr,
      const uint256_t& value,
      R(C::*func)(const Args&...), const Args&... args
    ) {
      // 1000 Gas Limit for every C++ contract call!
      auto& recipientAcc = *this->accounts_[targetAddr];
      if (!recipientAcc.isContract()) {
        throw DynamicException(std::string(__func__) + ": Contract does not exist - Type: "
          + Utils::getRealTypeName<C>() + " at address: " + targetAddr.hex().get()
        );
      }
      if (value) {
        this->sendTokens(caller, targetAddr, value);
      }
      NestedCallSafeGuard guard(caller, caller->caller_, caller->value_);
      switch (recipientAcc.contractType) {
        case ContractType::EVM : {
          this->deduceGas(10000);
          evmc_message msg;
          msg.kind = EVMC_CALL;
          msg.flags = 0;
          msg.depth = 1;
          msg.gas = this->leftoverGas_;
          msg.recipient = targetAddr.toEvmcAddress();
          msg.sender = caller->getContractAddress().toEvmcAddress();
          auto functionName = ContractReflectionInterface::getFunctionName(func);
          if (functionName.empty()) {
            throw DynamicException("ContractHost::callContractFunction: EVM contract function name is empty (contract not registered?)");
          }
          auto functor = ABI::FunctorEncoder::encode<Args...>(functionName);
          Bytes fullData;
          Utils::appendBytes(fullData, Utils::uint32ToBytes(functor.value));
          Utils::appendBytes(fullData, ABI::Encoder::encodeData<Args...>(args...));
          msg.input_data = fullData.data();
          msg.input_size = fullData.size();
          msg.value = Utils::uint256ToEvmcUint256(value);
          msg.create2_salt = {};
          msg.code_address = targetAddr.toEvmcAddress();
          evmc::Result result (evmc_execute(this->vm_, &this->get_interface(), this->to_context(),
          evmc_revision::EVMC_LATEST_STABLE_REVISION, &msg, recipientAcc.code.data(), recipientAcc.code.size()));
          this->leftoverGas_ = result.gas_left;
          if (result.status_code) {
            auto hexResult = Hex::fromBytes(BytesArrView(result.output_data, result.output_data + result.output_size));
            throw DynamicException("ContractHost::callContractFunction: EVMC call failed - Type: "
              + Utils::getRealTypeName<C>() + " at address: " + targetAddr.hex().get() + " - Result: " + hexResult.get()
            );
          }
          return std::get<0>(ABI::Decoder::decodeData<R>(BytesArrView(result.output_data, result.output_data + result.output_size)));
        } break;
        case ContractType::CPP : {
          this->deduceGas(1000);
          C* contract = this->getContract<C>(targetAddr);
          this->setContractVars(contract, caller->getContractAddress(), value);
          try {
            return contract->callContractFunction(this, func, args...);
          } catch (const std::exception& e) {
            throw DynamicException(e.what() + std::string(" - Type: ")
              + Utils::getRealTypeName<C>() + " at address: " + targetAddr.hex().get()
            );
          }
        }
        default : {
          throw DynamicException("PANIC! ContractHost::callContractFunction: Unknown contract type");
        }
      }
    }

EVM -> Another Contract Calls

When an EVM contract needs to invoke another contract, the ContractHost::call() function manages the interaction. This overridden function is designed to handle both CPP and EVM contract calls as shown below:

evmc::Result ContractHost::call(const evmc_message& msg) noexcept {
  Address recipient(msg.recipient);
  auto &recipientAccount = *accounts_[recipient]; // We need to take a reference to the account, not a reference to the pointer.
  this->leftoverGas_ = msg.gas;
  /// evmc::Result constructor is: _status_code + _gas_left + _gas_refund + _output_data + _output_size
  if (recipientAccount.contractType == CPP) {
    // The recipient is a CPP contract: call its evmEthCall function and wrap the result in an evmc::Result
    try {
      this->deduceGas(1000); // CPP contract call is 1000 gas
      auto& contract = contracts_[recipient];
      if (contract == nullptr) {
        throw DynamicException("ContractHost call: contract not found");
      }
      this->setContractVars(contract.get(), Address(msg.sender), Utils::evmcUint256ToUint256(msg.value));
      Bytes ret = contract->evmEthCall(msg, this);
      return evmc::Result(EVMC_SUCCESS, this->leftoverGas_, 0, ret.data(), ret.size());
    } catch (std::exception& e) {
      this->evmcThrows_.emplace_back(e.what());
      this->evmcThrow_ = true;
      return evmc::Result(EVMC_PRECOMPILE_FAILURE, this->leftoverGas_, 0, nullptr, 0);
    }
  }
  evmc::Result result (evmc_execute(this->vm_, &this->get_interface(), this->to_context(),
           evmc_revision::EVMC_LATEST_STABLE_REVISION, &msg,
           recipientAccount.code.data(), recipientAccount.code.size()));
  this->leftoverGas_ = result.gas_left; // gas_left is not linked with leftoverGas_, we need to link it.
  this->deduceGas(5000); // EVM contract call is 5000 gas
  result.gas_left = this->leftoverGas_; // We need to set the gas left to the leftoverGas_
  return result;
}
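
Two aspects of the function above are worth isolating: the flat gas charge per contract type, and the fact that a `noexcept` entry point has to turn exceptions into a status value instead of letting them propagate. The sketch below shows both under stated assumptions: `GasMeter`, `CallResult` and `dispatch` are hypothetical, the 1000 and 5000 charges are the ones used in `call()`, and throwing when gas runs out is an assumption about how a `deduceGas`-style helper behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

enum class ContractType { NotContract, EVM, CPP };

struct CallResult {
  bool ok;
  std::int64_t gasLeft;
};

class GasMeter {
  std::int64_t leftover_;

public:
  explicit GasMeter(std::int64_t gas) : leftover_(gas) {}

  // Assumption: running out of gas is reported by throwing.
  void deduce(std::int64_t amount) {
    if (amount > leftover_) throw std::runtime_error("out of gas");
    leftover_ -= amount;
  }
  std::int64_t left() const { return leftover_; }
};

// noexcept boundary: failures are reported through the result, never thrown.
CallResult dispatch(ContractType type, std::int64_t gas) noexcept {
  GasMeter meter(gas);
  try {
    switch (type) {
      case ContractType::CPP: meter.deduce(1000); break;  // flat CPP call charge
      case ContractType::EVM: meter.deduce(5000); break;  // flat EVM call charge
      default: throw std::runtime_error("recipient is not a contract");
    }
    return {true, meter.left()};
  } catch (const std::exception&) {
    return {false, meter.left()};
  }
}
```

With 3000 gas available, a CPP call succeeds with 2000 left, while an EVM call fails and reports the untouched 3000, since the charge is rejected before anything is deducted.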

Streamlining Calls with evmc_message

We have deprecated EthCallInfo and EthCallInfoAllocated in favor of using the evmc_message struct, aligning the call structures between CPP and EVM environments. This uniformity simplifies the interaction framework and reduces the potential for errors and data mismanagement:

struct evmc_message
{
    enum evmc_call_kind kind; // The kind of the call. 
    uint32_t flags;
    int32_t depth;
    int64_t gas;
    evmc_address recipient;
    evmc_address sender;
    const uint8_t* input_data;
    size_t input_size;
    evmc_uint256be value;
    evmc_bytes32 create2_salt;
    evmc_address code_address;
};

Rationale Behind Deprecating ContractManager in Favor of ContractHost/ContractStack

The transition from ContractManager to the ContractHost/ContractStack model was driven by a need for more robust execution safety and context management:

This restructuring not only streamlines the execution flow but also significantly enhances the safety and reliability of contract executions within our platform.

See the ContractHost destructor for more information:

TODO: Sorry, but I'm still editing the ContractHost class, and the destructor will change lol.

Calling EVM Contracts from C++

To invoke EVM contract functions from C++, we leverage a templated approach that mimics the contract's functions in a C++ class. This method provides a type-safe way to interact with contracts written in Solidity or other EVM-compatible languages.

Implementation Steps:

// C++ class mirroring the functions of the EVM contract.
class SolMyContract {
public:
    // Stub with the same signature as the Solidity function; it exists so the
    // member pointer can be registered and used for ABI encoding.
    uint256_t myFunction(const uint256_t& arg1, const uint256_t& arg2) const { return {}; }
    static void registerContract() {
        ContractReflectionInterface::registerContractMethods<SolMyContract>(
            std::vector<std::string>{},  // List of dependencies or related artifacts if any
            std::make_tuple("myFunction", &SolMyContract::myFunction, FunctionTypes::View, std::vector<std::string>{"arg1", "arg2"})
        );
    }
};

// Calling the EVM contract from another C++ contract.
uint256_t AnotherContract::callMyFunction(const Address& targetAddr, const uint256_t& arg1, const uint256_t& arg2) const {
    SolMyContract::registerContract();  // Ensure the EVM contract's methods are registered (can be done a single time in the constructor)
    return this->callContractViewFunction<SolMyContract>(this, targetAddr, &SolMyContract::myFunction, arg1, arg2);
}

Calling C++ Contracts from EVM

Calling a C++ contract from an EVM contract uses a standard Solidity interface to abstract the C++ implementation. This approach ensures that calls from EVM to C++ are as straightforward as EVM-to-EVM calls.

Implementation Steps:

interface MyContract {
    function myFunction(uint256 arg1, uint256 arg2) external view returns (uint256);
}
contract AnotherContract {
    function callMyFunction(address cppAddr, uint256 arg1, uint256 arg2) public view returns (uint256) {
        return MyContract(cppAddr).myFunction(arg1, arg2);
    }
}

Future Enhancements:

Conclusion

This pull request represents a significant milestone in the evolution of our platform, marking the successful integration of CPP and EVM contract functionalities within the OrbiterSDK environment. By standardizing the encoding and decoding mechanisms and streamlining the interaction between different contract types, we have laid a robust foundation for seamless and efficient contract execution.

The introduction of the ContractHost class as the central executor for both CPP and EVM contracts facilitates a uniform handling of contract calls, enhancing our system's scalability and flexibility. The adoption of evmc_message struct across the platform simplifies our architecture, reduces overhead, and minimizes the potential for errors during contract execution.

Key Achievements: