ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.3k stars 5.77k forks source link

Feature Request: User-provided ABI for --standard-json (Yul only) #14392

Open chjj opened 1 year ago

chjj commented 1 year ago

Currently, the "abi" output selector is a no-op when language=Yul.

This is a problem for yul source verification: while blockchain explorers like etherscan and blockscout support standard-json input for verification, they choke if there is no ABI present in the output (though, as of blockscout#6444, it looks like blockscout does support yul verification, despite the lack of abi).

Since contracts written in plain Yul have no way to specify an ABI in any manner recognizable to the compiler, it would be nice if solc accepted abi as a standard-json input field and simply spit it out on the other side when compiling with language=Yul (maybe also validate the json to ensure it matches the schema of a json abi object).

By doing this there's some implications (potential attacks) for explorers that allow source verification via standard-json: e.g. an attacker compiles their solidity contract to yul and source-verifies the very hard-to-read IR instead of the actual solidity code. That said, a similar obfuscation technique is already possible by writing a solidity contract and wrapping the body of every external function inside an assembly block (albeit a bit more work).

I might be barking up the wrong tree with this, as it would be "cleaner" to simply beseech the aforementioned explorers to update their API and allow for some kind of abi parameter, but adding this very small feature to the compiler itself would transparently fix the problems with all of the explorer sites currently deployed, allowing for Yul source verification.

I suppose this also brings up an ethical concern: this may qualify as a kind of surreptitious soft-fork for explorers, which perhaps adds an undesired feature without their knowledge. Though, they can always explicitly blacklist language=Yul if they want to.

In any case, another major benefit of this is that newer languages which compile to Yul do not need to wait for explorer integration. This includes projects like fe, which uses solc as a backend for the final yul->bytecode codegen step.

chjj commented 1 year ago

Closed this for a while because I had to think about it more.

Perhaps when an ABI is provided to the compiler for yul compilation, the compiler appends a keccak256 hash of the serialized JSON object to the deployed bytecode (preceding the ".metadata" if there is any). This prevents an attacker from providing a false ABI to blockchain explorers.

chjj commented 1 year ago

Example standard-json input:

{
  language: 'Yul',
  sources: {
    'MyContract.yul': {
      urls: ['/path/to/MyContract.yul']
    }
  },
  settings: {
    outputSelection: {
      '*': {
        '*': ['evm.bytecode', 'evm.deployedBytecode', 'abi']
      }
    },
    optimizer: { enabled: true, runs: 200, details: { yul: true } },
    abi: {
      'MyContract': [
        {
          type: 'function',
          name: 'createSubContract',
          stateMutability: 'nonpayable',
          inputs: [],
          outputs: [{ internalType: 'address', type: 'address', name: '' }]
        },
        {
          type: 'function',
          name: 'addTwoNumbers',
          stateMutability: 'view',
          inputs: [
            { internalType: 'uint32', type: 'uint32', name: 'x' },
            { internalType: 'uint32', type: 'uint32', name: 'y' }
          ],
          outputs: [{ internalType: 'uint32', type: 'uint32', name: '' }]
        }
      ],
      'MyContract.Runtime.SubContract': [
        {
          type: 'function',
          name: 'addThreeNumbers',
          stateMutability: 'view',
          inputs: [
            { internalType: 'uint32', type: 'uint32', name: 'x' },
            { internalType: 'uint32', type: 'uint32', name: 'y' },
            { internalType: 'uint32', type: 'uint32', name: 'z' }
          ],
          outputs: [{ internalType: 'uint32', type: 'uint32', name: '' }]
        }
      ]
    }
  }
}

For our contract:

object "MyContract" {
  code {
    datacopy(0, dataoffset("Runtime"), datasize("Runtime"))
    return(0, datasize("Runtime"))
  }
  object "Runtime" {
    code {
      switch shr(224, calldataload(0))
      // function createSubContract()
      //   external returns (address);
      case 0x6161f5c2 {
        datacopy(0, dataoffset("SubContract"), datasize("SubContract"))
        mstore(0, create(0, 0, datasize("SubContract")))
        return(0, 32)
      }
      // function addTwoNumbers(uint32 x, uint32 y)
      //   external view returns (uint32);
      case 0xe6dc9682 {
        let x := calldataload(4)
        let y := calldataload(36)
        mstore(0, add(x, y))
        return(0, 32)
      }
      default {
        revert(0, 0)
      }
    }
    object "SubContract" {
      code {
        datacopy(0, dataoffset("SubRuntime"), datasize("SubRuntime"))
        return(0, datasize("SubRuntime"))
      }
      object "SubRuntime" {
        code {
          switch shr(224, calldataload(0))
          // function addThreeNumbers(uint32 x, uint32 y, uint32 z)
          //   external view returns (uint32);
          case 0x80b34c45 {
            let x := calldataload(4)
            let y := calldataload(36)
            let z := calldataload(68)
            mstore(0, add(add(x, y), z))
            return(0, 32)
          }
          default {
            revert(0, 0)
          }
        }
        // data ".abi" "0000000..." (added after compilation)
        data ".metadata" hex"c0ffee"
      }
    }
    // data ".abi" "0000000..." (added after compilation)
    data ".metadata" hex"deadbeef"
  }
}

Where the data ".abi" section is (conceptually) added post-compilation.

0xm4ud commented 1 year ago

I would like to see that implemented, there is no reason for the output selector currently being no-op for Yul, very good example above.