Decide what to do about swarm metadata

JakeOShannessy commented 5 years ago

Currently this issue is ignored by the kernel validator. The kernel validator will often find non-permitted opcodes in the swarm metadata at the end of a contract, even though those bytes are not meant to be used as opcodes. They are not supposed to be executed and should sit in an unreachable part of the code, but in order to allow them we need to prove that they are unreachable.

Currently the beaker-preprocessor parses this swarm metadata and throws it away. This is causing issues in the beakeros repository because we are testing the validator on contracts compiled directly from Solidity, which include the swarm metadata. Demonstrating on-chain that it is unreachable might be difficult, so I think that for now we should not accept swarm metadata in procedures.

This means we need to remove the swarm metadata from the beakeros tests, which requires the beaker-preprocessor, so the beakeros repo might need to depend on it.
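
For illustration, stripping the metadata could look roughly like the sketch below. It is not the beaker-preprocessor's actual implementation, and `strip_swarm_metadata` is a hypothetical name. It assumes the trailer layout emitted by solc at the time of this discussion (`0xa1 0x65 "bzzr0" 0x58 0x20`, a 32-byte swarm hash, then `0x00 0x29`); later compiler versions use a different encoding, so this would need adjusting for them.

```python
# Sketch only: assumes the solc 0.4.x-era trailer layout described above.
METADATA_PREFIX = bytes.fromhex("a165627a7a72305820")  # CBOR map header, key "bzzr0", 32-byte string tag
HASH_LENGTH = 32                                       # length of the swarm hash
METADATA_SUFFIX = bytes.fromhex("0029")                # big-endian length (41) of the CBOR payload
TRAILER_LENGTH = len(METADATA_PREFIX) + HASH_LENGTH + len(METADATA_SUFFIX)  # 43 bytes


def strip_swarm_metadata(code: bytes) -> bytes:
    """Return `code` without a trailing swarm metadata block, if one is present."""
    if len(code) < TRAILER_LENGTH:
        return code
    trailer = code[-TRAILER_LENGTH:]
    if trailer.startswith(METADATA_PREFIX) and trailer.endswith(METADATA_SUFFIX):
        return code[:-TRAILER_LENGTH]
    # No recognisable trailer: leave the bytecode untouched.
    return code


# Usage with made-up bytecode and a made-up hash.
body = bytes.fromhex("6080604052")
compiled = body + METADATA_PREFIX + bytes(range(32)) + METADATA_SUFFIX
stripped = strip_swarm_metadata(compiled)
```

Note that this only matches on the shape of the trailer. It does not show that the trailer is unreachable or that the hash is genuine.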

Latrasis commented 5 years ago

Are there any options for turning off metadata when compiling with solc?

JakeOShannessy commented 5 years ago

I don't think so, there doesn't seem to be any easily exposed option.

Latrasis commented 5 years ago

Request for feature: https://github.com/ethereum/solidity/issues/4853

chriseth commented 5 years ago

I'm not sure about the context of this issue but note that the metadata is not the only data that is stored in the code. Due to the presence of the codecopy opcode, we assume that we can use code as data.

JakeOShannessy commented 5 years ago

Yes, correct. Fortunately it's only the reverse case we are concerned about, where data is used as code. Since the swarm metadata is in the code, it could theoretically be executed.

As the swarm metadata consists of a hash, it's possible that it randomly contains dangerous code, which we would like to reject. In general, the only way to prove swarm data hasn't been maliciously altered is to remove it.

It is possible to verify that the swarm metadata corresponds to the current contract, but that is a much more complex procedure.
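
To make the concern concrete, here is a minimal sketch of a linear opcode scan that has no control flow information. It is not the kernel validator's code: `find_forbidden` is a hypothetical name, and the `FORBIDDEN` set is an assumption chosen for illustration, since the validator's actual list is not given in this thread. The trailer layout is the solc 0.4.x-era one (`0xa1 0x65 "bzzr0" 0x58 0x20`, hash, `0x00 0x29`).

```python
# Hypothetical blacklist for illustration only; not the validator's real list.
FORBIDDEN = {0x55: "SSTORE", 0xF4: "DELEGATECALL", 0xFF: "SELFDESTRUCT"}

PUSH1, PUSH32 = 0x60, 0x7F


def find_forbidden(code: bytes):
    """Linear sweep: return (offset, name) for each forbidden byte in opcode position."""
    hits = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op in FORBIDDEN:
            hits.append((pc, FORBIDDEN[op]))
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip the PUSH instruction's immediate bytes
        pc += 1
    return hits


# A metadata trailer whose (made-up) hash happens to start with 0xff.
fake_hash = bytes([0xFF] + [0x00] * 31)
trailer = bytes.fromhex("a165627a7a72305820") + fake_hash + bytes.fromhex("0029")
hits = find_forbidden(trailer)
```

Read as code, the fixed prefix decodes as `LOG1`, then `PUSH6` (which swallows the next six bytes), then `SHA3`, after which every byte of the hash sits in opcode position. A scan like this cannot tell such a trailer apart from real code without a separate reachability argument.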

chriseth commented 5 years ago

I'm not really sure if it is better to discuss this here or in the solidity repo.

Ok, this means everything is fine as long as you can do a full control flow analysis, is that correct? This should not be too hard unless you store internal function pointers in storage.

Given that removing metadata only removes one kind of data, and that metadata is a security feature, I don't think an option to remove metadata really helps in this case, IMHO.

JakeOShannessy commented 5 years ago

I guess this pertains to exactly what we're doing so I'll respond here.

One of our goals is to avoid exactly that full control flow analysis, particularly as it would require preventing things like function pointers from storage or message data. We want to be able to demonstrate certain properties trivially.

It does only remove one kind of data, but that data is random (in the hash sense) and executable. Control flow analysis could demonstrate it's unreachable, but that is a more complicated proof, and requires excluding otherwise valid contracts. For contracts compiled from Solidity it shouldn't be reachable anyway, but we're looking at bytecode and don't have that information.

Daohub-io / beaker-cli

Decide what to do about swarm metadata #20