ethereum / beacon-APIs

Collection of RESTful APIs provided by Ethereum Beacon nodes
https://ethereum.github.io/beacon-APIs/
Creative Commons Zero v1.0 Universal
328 stars 167 forks source link

clarify the consensus version headers #470

Closed mehdi-aouadi closed 1 day ago

mehdi-aouadi commented 2 weeks ago

The Eth-Consensus-Version header usage is ambiguous and could be misinterpreted. It shouldn't be tied up to the data schema but rather to the consensus version/fork the data belongs to. For example since EIP7549, there are 2 possible attestations schemas only: Attestation (phase0) and Electra.Attestation (electra) but that doesn't mean that the Eth-Consensus-Version can only be phase0 or electra but rather any value according to when the attestation is submitted or retrieved (Example: a deneb attestation will have an Attestation schema and deneb as value in the Eth-Consensus-Version). When it come to the block related APIs, the problem is less concerning because there is schema per fork and hence any value could be used.

tbenr commented 2 weeks ago

To me this rephrasing is still ambiguous. For example The consensus version of the attestations can be still interpreted as phase0 all the way to electra when it changes. Not sure how to change that to be more clear, but, to me, this header is more like "consensus version (aka fork) the data refers to", in other words "the active consensus fork at the time associated with the data".

maybe we can have a more extensive description in beacon-node-oapi.yaml (and even remove the description on each field).

if we give up with local description we could avoid duplication and reference the header like

parameters:
  - $ref: '../../../beacon-node-oapi.yaml#/components/headers/Eth-Consensus-Version'

WDYT?

mcdee commented 2 weeks ago

It seems that the issue is that the value in the header isn't the enum as specified by ConsensusVersion, but instead is one of a set of values that is object-dependent. As such, it would make sense to re-specify these explicitly in each endpoint e.g.

        headers:
          Eth-Consensus-Version:
            type: string
            enum: [phase0, electra]
            example: phase0
            description: "The consensus version of the attestations being submitted"

This also means that if in fulu we have further changes to some, but not all, of these objects then we can continue to provide the correct version information for each independently in the API.

mehdi-aouadi commented 2 weeks ago

It seems that the issue is that the value in the header isn't the enum as specified by ConsensusVersion, but instead is one of a set of values that is object-dependent. As such, it would make sense to re-specify these explicitly in each endpoint e.g.

        headers:
          Eth-Consensus-Version:
            type: string
            enum: [phase0, electra]
            example: phase0
            description: "The consensus version of the attestations being submitted"

This also means that if in fulu we have further changes to some, but not all, of these objects then we can continue to provide the correct version information for each independently in the API.

I like the approach of explicitly specifying the possibles values. But your example challenges to current interpretation: Starting from Electra we will deal with phase0 and electra attestations only (two schemas only) but the Eth-Consensus-Version header could have any value based on the fork the attestation relates to. Example: A deneb attestation will have a phase0 schema Attestation and the value deneb in the Eth-Consensus-Version header. Based on your example, the Eth-Consensus-Version header should have only possible values instead: phase0 or electra

tbenr commented 2 weeks ago

@mcdee your answer is the proof of that there is still confusion here your object-dependent interpretation of the header was mine initial one as well. But seems like clients are handling it like "the active fork\version at the time referenced by the object".

Do we want to "leak" the underlying object versioning in the api or do we want to uniformly deal with the consensus fork and let clients map fork->object-schema?

this applies on both requests and responses.

ensi321 commented 2 weeks ago

Agree that there's still some ambiguity in the changes proposed by this PR.

The table in getLightClientUpdatesByRange is easy to understand. I think we need something like this for consensus version header. Ideally more systematic and less mouthy way

nflaig commented 2 weeks ago

Do we want to "leak" the underlying object versioning in the api or do we want to uniformly deal with the consensus fork and let clients map fork->object-schema?

How do clients determine the what consensus version to set for block and state apis currently? As for Lodestar, we look at the slot of the data, calculate the epoch (slot / SLOTS_PER_EPOCH) and return the fork name to which the epoch belongs. This means we will set the fork the data belongs to irrespective of the object / container, it just happens to be the case that state and block objects changed each fork.

If we wanna go with container version instead I think we should rename the header to reflect that.

tbenr commented 2 weeks ago

If we wanna go with container version instead I think we should rename the header to reflect that.

I thought about that, but I think we can still make the case that data structure versioning is based on consensus versioning (we only change those on new forks) so we could leave the same name.

rolfyone commented 4 days ago

I'm not sure as at today we've got any object that returns a consensus version other than the current milestone that the slot of the object...

Eg. if i select a block from capella, the eth-consensus-version is saying 'capella'.

I'm not sure it's a great idea to have the concept that we return phase0 for attestations now in deneb for an attestation created on a deneb slot... I don't see the value? Every single client needs to be able to interpret this consistently and i think making a complicated decoding like this is going to not make anyones life any easier at all... Do we do this anywhere currently?

The whole point of eth-consensus-version was to give a starting point if we're encoding with SSZ to let the decoder understand the state of play for the milestone to decode the SSZ for...

tbenr commented 3 days ago

I'm ok with going in the direction of having eth-consensus-version as the active fork at the time (slot) referred by the object. And then let the client resolve the fork->object_fork. Looks like a simplification on the header handling.

But we need to be clear in the descriptions otherwise a guy knowing anything about what we are discussing, looking at:

image

he may say: why I see 6 options in the version and only 2 in anyOf?

tbenr commented 3 days ago

So IMO we should:

  1. make super clear that eth-consensus-version header and version in json are the same thing.
  2. they are not version of the object but instead the consensus version that has generated the object (ie active fork). So client has to remap it to the actual object version when deserializing (json or ssz)
mehdi-aouadi commented 3 days ago
  1. make super clear that eth-consensus-version header and version in json are the same thing.

This is already clear since the Eth-Consensus-Version schema points to ConsensusVersion which could have any fork value

  1. they are not version of the object but instead the consensus version that has generated the object (ie active fork). So client has to remap it to the actual object version when deserializing (json or ssz)

I pushed a commit with some rephrasing: The active consensus to which ... belongs

rolfyone commented 3 days ago

I just really think we're overthinking all of this. but as long as we're not suggesting that an attestation produced in a deneb slot comes out with Eth-consensus-version of phase0 as earlier suggested, then im not completely against changing it.

Just be aware that with so many different languages and backgrounds, unless we start using formal notation this will remain ambiguous to some people.

rolfyone commented 3 days ago

@tbenr @mcdee @nflaig interested if you believe the current changes by mehdi address concerns raised here, or if there is more to be done. I'm happy to merge as is, but will leave it open and unmerged if there's more to be done.

mcdee commented 2 days ago

Its one of those things that is obvious for everyone here, but can be seen as confusing for someone coming into this fresh. I think that we're okay with the wording as proposed in this PR...

...except a couple of the headers have ", if using SSZ encoding." in their description. I believe that this is being sent, or at least should be sent, for the response regardless of the encoding. If this is the case, we should remove this bit as we're making a change anyway.

tbenr commented 2 days ago

Current rephrasing looks good! Just added another comment.