ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.03k stars 5.7k forks source link

Requesting bytecode/IR for one contract results in it being generated for all selected contracts #15373

Open klkvr opened 2 weeks ago

klkvr commented 2 weeks ago

Description

I am attaching 2 standard JSON input files. Both of them include two sources. One of them (src/contract.sol) is a Uniswap V3 position manager (can be any contract that takes at least some time to compile). Second is a contract containing a single import directive, importing the heavy contract (import "src/contract.sol";)

In first input, bytecode output selection is requested for both contracts, and in second input - only for the second one.

input1.json input2.json

Expected behavior

I would expect solc to skip compiling contracts which are excluded from output selection unless they appear as a bytecode dependency (new keyword or .creationCode) in some of the included contracts.

Actual behavior

It seems that both inputs take a similar pretty long time to compile event though second input basically just compiles an empty source, and outputs nothing. I assume it spends time compiling imported heavy contract

$ time cat input1.json | solc --standard-json
solc --standard-json  3.49s user 0.09s system 96% cpu 3.707 total
$ time cat input2.json | solc --standard-json
solc --standard-json  3.49s user 0.07s system 99% cpu 3.597 total

Fixing this would mean a potentially very high compile time optimization. Right now, even if tooling resolves contract dependecies, and excludes all of them from output selection, solc still spends time compiling and optimizing unused dependencies, even though the resulted bytecode is not outputted at all.

Environment

cameel commented 2 weeks ago

I can confirm this, but it does not really seem to be a bug, it's more of a limitation of the current design. Not all outputs are generated on-demand - some of them, especially the heavier ones like IR or bytecode, require some upfront work and that work is done in bulk for all contracts even if the output is requested only for one of them.

https://github.com/ethereum/solidity/blob/40e005beed91fceaa72afcba0b063462b139270d/libsolidity/interface/StandardCompiler.cpp#L283-L299

So this has nothing to do with the import in your example, but rather with your output selection:

input1.json:

"outputSelection": {
    "src/importer.sol": {
        "*": [
            "abi",
            "evm.bytecode",
            "evm.deployedBytecode",
            "evm.methodIdentifiers",
            "metadata"
        ]
    },
    "src/contract.sol": {
        "*": [
            "abi",
            "evm.bytecode",
            "evm.deployedBytecode",
            "evm.methodIdentifiers",
            "metadata"
        ]
    }
},

input2.json:

"outputSelection": {
    "src/importer.sol": {
        "*": [
            "abi",
            "evm.bytecode",
            "evm.deployedBytecode",
            "evm.methodIdentifiers",
            "metadata"
        ]
    },
    "src/contract.sol": {
        "*": []
    }
},

Both of these currently will perform the same work, the second one will just not output some of the results of that work. That's because src/importer.sol requires bytecode generation and therefore bytecode will be generated for all contracts.

I agree that this is not a great design and we should change it. It's technically possible to generate bytecode only for the contracts you request, it will just require changing the way the artifact selection is done in general, which might take some tedious refactoring.

cameel commented 2 weeks ago

Fixing this would mean a potentially very high compile time optimization. Right now, even if tooling resolves contract dependecies, and excludes all of them from output selection, solc still spends time compiling and optimizing unused dependencies, even though the resulted bytecode is not outputted at all.

By the way, do you know of any tool that is actually impacted by it in practice? AFAIK tools typically request the same artifacts for all contracts, which is probably why no one brought this issue up so far. This does have potential to speed things up a lot but only if the compiler is actually being used that way. I'd prioritize this issue higher if you have some evidence that it actually is.

klkvr commented 2 weeks ago

By the way, do you know of any tool that is actually impacted by it in practice? AFAIK tools typically request the same artifacts for all contracts, which is probably why no one brought this issue up so far. This does have potential to speed things up a lot but only if the compiler is actually being used that way. I'd prioritize this issue higher if you have some evidence that it actually is.

In Foundry, we erase output selection for contracts which are cached: https://github.com/foundry-rs/compilers/blob/c16927bf601e464aa8765d31c88bf8ffe5285be6/crates/compilers/src/cache.rs#L661 https://github.com/foundry-rs/compilers/blob/c16927bf601e464aa8765d31c88bf8ffe5285be6/crates/compilers/src/filter.rs#L145

cameel commented 2 weeks ago

Ok then, in that case it seems worth taking care of sooner rather than later.