Feature Request: allow for input of pre-compiled bytecode

SilentCicero commented 7 years ago

Right now the solc compiler can only intake the sources of solidity files/source maps. But when I'm compiling more than say, 15 contracts (which I do often, as many of us use the EVM/solc as a testing environment). The compile time takes forever, sometimes up to 10+ minutes depending on the source. Many of the contracts I'm compiling have already been compiled (and have not changed). Often times, I may only be changing one or two at most. So I'd like to request a bytecode input feature (if something like this is possible and doable).

Usage:

var solc = require('solc');
var input = {
    'lib.sol': 'library L { function f() returns (uint) { return 7; } }',
    'cont.sol': 'import "lib.sol"; contract x { function g() { L.f(); } }',
};
var bytecode = {
    'cont.sol': '0x00..',
};
var output = solc.compile({ sources: input, bytecode: bytecode }, 1);

This way, if the bytecode is presented for a specific contract, then it won't recompile the source, instead it will just use the source provided.

Optimizations like this allow us to have a faster environment to develop large Solidity applications. This way, the build environment can optimize compile time by presenting bytecode that often times does not change.

Thoughts?

One con/Fix:

what if the bytecode is old and the contract has actually changed, but the build environment improperly handled the change, and did not recompile code properly. Potential Fix: instead of the bytecode map being "contract_name": "bytecode", it could be: "keccak256(contract_code)": "bytecode". This way, the map will only map to that contracts bytecode and nothing else. This would avoid a situation where bytecode is presented for an old contract code.

Note, I also posted this issue in solc-js repo, not sure which was more appropriate.

Georgi87 commented 7 years ago

I agree that compile time becomes an issue with multiple contracts/tests. All Gnosis tests run in total for almost 20 minutes. I would suggest adding a cache to the compiler, which could be enabled with a flag. This cache would map a hashed version of the source code to the bytecode:

hash(source code) => compiler version => bytecode

This would allow to compile only when there are changes and would not require any additional setup by the programmer.

Maybe this should be done in a preprocessor.

pipermerriam commented 7 years ago

This feature will also be very valuable for some of the upcoming ethereum package management stuff that's being worked on over at https://github.com/ethpm/epm-spec

In situations where a dependency is trusted it would be convenient to be able to just inject the pre-compiled asset that was shipped with the package into the compiler rather than forcing it to be recompiled.
In situations where a dependency is written in a different EVM language this would allow it to still be used by a solidity project.
In situations where a project is using a shiny new version of solidity this would allow it to use a dependency which requires compilation with an older incompatible version of solidity.

SilentCicero commented 7 years ago

@Georgi87 I really like mixing the compiler version as well. very nice!

@pipermerriam yes, this would speed up compile time for epm stuff for sure.

chriseth commented 7 years ago

There is currently no way to request only a specific output, the compiler will always generate the bytecode for every single contract it finds in the source. We are changing that when we switch to the new json-based input specification. If that does not help, supplying already compiled bytecode sounds like a useful though potentially dangerous feature.

SilentCicero commented 7 years ago

@chriseth proper hashing/mapping of the bytecode and contracts should help us prevent problems. As Stefan noted, we can tightly hash compiler version, contract data, etc to link and compare bytecode. Infact, we could hash (bytecode, source code, compiler version) etc, to make sure the link is air tight.

We are all having problems with the compile time of Solidity. It's a very cumbersome language to work with on actual platform like apps right now (i.e 15+ contracts dApps).

pipermerriam commented 7 years ago

With packaging right around the corner this feature would be extremely valuable as it fixes one fundamental problem that is going to show up soon. Currently I know of no viable solution for situations where there are conflicting compiler requirements across dependencies.

Ideally with this feature in place package managers can do the following.

Individually compile each dependency using the specified compiler information for that package.
Compile the project, using the assets produced from the previous step.

One thing that I think would need to be added with this feature is the ability to specify the ABI for the bytecode so that solidity can still do nice "type" checking.

chriseth commented 7 years ago

Note that just providing the bytecode for external files will not enable compilation. The compiler needs type information and that goes far beyond the ABI, think of internal library functions and structs being used from other contracts. If you want to separate compilation units across packages, I think the only viable solution now is to specify abstract interfaces - which should be a good software development practice anyway.

axic commented 7 years ago

@SilentCicero @pipermerriam assuming the new compiler interface will allow to specify what outputs are required from which inputs would that solve this issue? (See #1387.)

E.g. you'd still need to supply all sources required to compile a given output, but the compiler would only generate the bytecode for that single contract if that is what you instruct it to do.

VoR0220 commented 7 years ago

You wouldnt necessarily need all the source. You just need certain symbols and to craft interfaces that can be accessed from a certain address.

VoR0220 commented 7 years ago

Monax currently does this with native contracts that are specified at the address "permissions_contract".

chriseth commented 7 years ago

This whole issue assumes that there are inefficiencies at certain parts in the compiler. Let's first implement the selective compilation and then see if they are actually still there.

onbjerg commented 2 years ago

Hey, just putting my vote on this one.

Sometimes compiling tests in Forge can take forever if the optimizer is turned on, so we'd like the ability to split our compilation pipeline: first, compile non-test contracts with the optimizer on, and then compile the test contracts with the optimizer off, passing in the optimized bytecode for the non-test contracts.

We have a gas report feature that estimates the gas usage of different function calls using tracing, so most users probably want to compile their non-test contracts with the optimizer on, to get more realistic estimates.

The other benefit is that we can improve compilation speed in Forge as well.

Ref https://github.com/foundry-rs/foundry/issues/317

Would this be a hard feature to implement? If not, we'd be happy to try and take a look at some point, if this is something you want to add but do not have the bandwidth to.

Edit: I see there was some talk about the ABI not being enough for type information - this is true, but given the compiler can now be stopped at the parser stage(?) this might not be an issue anymore?

cameel commented 2 years ago

@onbjerg You might be interested in #11787. It's currently being implemented and will allow reimporting the EVM assembly output that the compiler can produce. Just to make sure - @aarlt - it will be possible to import unoptimized asm with --optimize so that it gets optimized, right?

Also, in case of the viaIR pipeline it's a bit more complicated because there are two optimization stages - Yul optimizer and then the EVM assembly optimization. You can already use the --ir compiler output to get the unoptimized Yul and later continue the compilation for each file by passing it to solc --strict-assembly --optimize.

aarlt commented 2 years ago

@cameel thx for linking #11787 here. @onbjerg Yes, it will be possible to import unoptimised asm with --optimize so that it gets optimized.

onbjerg commented 2 years ago

And I assume it would also be possible to include optimized bytecode in non-optimized bytecode that is not run through the optimizer?

aarlt commented 2 years ago

@onbjerg Yes, that should work. For your use-case the built-in function verbatim could be a interesting - see https://docs.soliditylang.org/en/v0.8.13/yul.html#verbatim.

github-actions[bot] commented 1 year ago

This issue has been marked as stale due to inactivity for the last 90 days. It will be automatically closed in 7 days.

github-actions[bot] commented 1 year ago

Hi everyone! This issue has been automatically closed due to inactivity. If you think this issue is still relevant in the latest Solidity version and you have something to contribute, feel free to reopen. However, unless the issue is a concrete proposal that can be implemented, we recommend starting a language discussion on the forum instead.

ethereum / solidity

Feature Request: allow for input of pre-compiled bytecode #1540