Fix BytecodeObject validity requirements

gnidan commented 6 years ago

Noticed a couple issues with BytecodeObject's definition, or inconsistencies with the natural language spec:

BytecodeObject is defined in the JSON-Schema to require either bytecode or link_dependencies, but not both
The natural language spec indicates that bytecode is always required, and link_dependencies is never required.

Presumably, link_dependencies is indicated as required in the JSON-Schema because it is necessary for ContractInstanceObject definitions. This means that the spec expects to see link_dependencies: [] when no link references are present. Similarly for contract instances, bytecode can be omitted, inheriting from the contract type.

I propose that we change these definitions a bit, splitting BytecodeObject into the non-polymorphic:

UnlinkedBytecodeObject
LinkedBytecodeObject

UnlinkedBytecodeObject would require both bytecode and link_references (empty array for no links). LinkedBytecodeObject would formally require only link_dependencies, but semantically require bytecode and link_references if and only if they differ from inherited values.

We would then update ContractType and ContractInstance to use unlinked / linked forms, respectively.

Thoughts? I opened a PR with formal schema changes, linked below.

gnidan commented 6 years ago

Oh, another possibly open question:

Does a contract type that is an alias for another type need to include bytecode?

pipermerriam commented 6 years ago

Does a contract type that is an alias for another type

It's not clear to me what you mean by this.

gnidan commented 6 years ago

@pipermerriam Maybe I don't understand the use case for Contract Aliases. Presumably the purpose is to define multiple contract types for various "singleton" roles? So one contract type would be canonical, the others being aliases. In such a situation, bytecode can potentially be omitted for alias types.

njgheorghita commented 6 years ago

Yeah ok, I think I like this proposal, but i'm not yet sold that creating more explicit bytecode classes is a useful distinction, or an unnecessary complication. Do the benefits outweigh just correcting/updating the natural language spec?

This means that the spec expects to see link_dependencies: [] when no link references are present.

But wouldn't there be an accompanying bytecode so the spec doesn't expect to see (aka require) link_dependencies at all?

but semantically require bytecode and link_references if and only if they differ from inherited values.

What do you mean by semantically? It's just up to the manifest creator to include these fields if they differ?

gnidan commented 6 years ago

Yeah ok, I think I like this proposal, but i'm not yet sold that creating more explicit bytecode classes is a useful distinction, or an unnecessary complication. Do the benefits outweigh just correcting/updating the natural language spec?

I'm proposing explicit bytecode classes because conflating linked and unlinked doesn't really provide any benefit:

In all usages of BytecodeObject, it's clear that the bytecode is only going to be of one form or another... i.e., ContractTypes will always specify unlinked bytecodes, and ContractInstances will always specify linked.
On implementation side, reading the schema, having them as one object means additional runtime checks.

We'll still need to update the natural language spec, but it's a simpler explanation if we just define two separate types of bytecode objects, rather than use prose to explain the "type" vs. "instance" matrix of when bytecode is necessary vs. when link_dependencies is necessary.

What do you mean by semantically? It's just up to the manifest creator to include these fields if they differ?

I mean that JSON-Schema isn't expressive enough to explain the requirement that "bytecode is required if and only if it differs from the corresponding contract_type". But English clearly is that expressive. :)

(There are a couple places where the natural language spec expands on the definition of "valid" provided by the formal spec—I'm just proposing to make this a part of the standard the best way we can.)

njgheorghita commented 6 years ago

@gnidan Sold

pipermerriam commented 6 years ago

@gnidan nick and I talked about this a little this morning. In general I'm :+1: but I'm having trouble evaluating this in the context of the overall spec and the JSON-schema PR doesn't provide much high level clarity either. What do you think about going ahead and generating a PR to update the natural language spec? Since that is the canonical source of truth, I think it would be easier for us to give a final :+1: :-1: from there and then the JSON-schema update naturally follows.

gnidan commented 6 years ago

@pipermerriam Certainly! I've been approaching this from the perspective of making machine-consumption of the JSON-Schema easier, but I'll get the corresponding natural language spec updated so you can make more sense of it. Thanks!

ethpm / ethpm-spec

Fix BytecodeObject validity requirements #123