Update bytecode object in docs

njgheorghita commented 6 years ago

Natural language spec update for proposed #123

Hope I'm understanding this correctly, very open to all suggested changes/feedback to improve understanding/clarity.

pipermerriam commented 6 years ago

this is great, because now I can see what I think is a problem... As it stands:

contract instances: linked bytecode
contract type: unlinked bytecode

I think this ?unintentionally? removes the option to ship contract types with their bytecode pre-linked. It may not be explicitely stated, and it probably should be explicitely stated but I think this is an important use case to support. Without this, it becomes impossible to ship ready-to-use contracts which make use of libraries.

Am I missing something?

gnidan commented 6 years ago

I think this ?unintentionally? removes the option to ship contract types with their bytecode pre-linked. It may not be explicitely stated, and it probably should be explicitely stated but I think this is an important use case to support. Without this, it becomes impossible to ship ready-to-use contracts which make use of libraries.

I'd say that if a package doesn't want to expose already-linked types, it can just say link_references: []. This doesn't change the underlying unlinkedness of the object, it just means that a particular piece of bytecode doesn't have any public link references.

I agree though that this should be explicitly stated: "if you want to link your contracts and not tell EthPM about it, just say there are no link references." I don't think this requires any special consideration, in the form of a polymorphic bytecode object.

njgheorghita commented 6 years ago

I'd say that if a package doesn't want to expose already-linked types, it can just say link_references: [].

I might be missing something, but does this cover if a contract type does want to expose an already-linked library (maybe for validation purposes), yet still have unlinked bytecode/link references?

gnidan commented 6 years ago

I might be missing something, but does this cover if a contract type does want to expose an already-linked library (maybe for validation purposes), yet still have unlinked bytecode/link references?

Hm, for validation purposes? This is probably only covered via the use of the compiler settings property, to indicate that specific link values were provided to the compiler ahead of time.

njgheorghita commented 6 years ago

@pipermerriam Added link_dependencies option to UnlinkedBytecodeObject. @gnidan I'd appreciate a final look over to make sure everything looks good

gnidan commented 6 years ago

So for contract types that have pre-linked value, does that bytecode still have zeroes?

If that's true, then it requires additional processing pre-publish, to convert them to zeroes.

If it's not true, well, that's another required spec update.

gnidan commented 6 years ago

Another thing: what if a contract only wants to pre-link some of its link refs? So then link_dependencies would have a subset of values and not have the same array length as link_references.

njgheorghita commented 6 years ago

Feels weird to convert the linked bytecode value to 0s.

Can we just create a new "type" of link value?

njgheorghita commented 6 years ago

Is it a requirement that link_dependencies and link_references are the same length? As long as all link_dependencies reference a link_reference, it should be fine to have link_references that don't reference a link_dependencies? 🤷‍♂️

gnidan commented 6 years ago

And we can change the zeroes requirement from must to should, that way we can try to get the tools to zero out pre-fill values, but it's not entirely necessary, really.

But now I'm unsure of the distinction that types are unlinked and instances are linked. Is there even a useful distinction there anymore? All we know is that instances must have values for all link references.

njgheorghita commented 6 years ago

It also might be somewhat confusing to have an UnlinkedBytecodeObject that can actually be (partially) linked in certain cases. A better name could help though

cbdotguru commented 6 years ago

Edited for updates brought up in gitter:

So after reading through this and the spec and my code over and over, here's what seems to make sense in my brain. We go back to a single bytecode object (this is where offsets live anyway) and require all links be listed in a mapping with labels. Then these labels could be referred to in an offset mapping. These links can then be one of three types: linked, literal, or reference. Then from these types we can determine the rest of the information as follows:

bytecode_object: {
  bytecode: [string always required],
  links: {
    [int defining offset]: [string as a link label]
  },
  link_objects: {
     [link label string]: {
        type: ['linked','literal','reference'],
        // if 'linked' then value is a name, if 'literal' then value is the literal string, if 'reference' then value is the reference name
        value: [required string], 
        length: [required int if type is 'linked'],
      } 
    }
  }
}

Then, in the spec, we could make a should recommendation that if one is linked then all should be linked.. OR we could just make it a must. So we have a single polymorphic object, and we reduce redundancy and complexity I think, then our core tool could enforce that if you're providing a linked bytecode, then we're enforcing it to be complete.

We could additionally require that a bytecode object within a contract_instance have only linked bytecode. Then contract_type’s could be either/or.

Also, this would mean that both linked and reference use the name to a contract object, it’s just that if type is linked the bytecode has an embedded link and if it’s reference the bytecode has 0’s.

ethpm / ethpm-spec

Update bytecode object in docs #126