Open mcdee opened 1 year ago
Why not fetch the entire state via the debug endpoint as SSZ? Most of its SSZ serialized size is the validators list, and then its simple to produce a ValidatorResponse from the full state. Note that SSZ does not support strings so we would need to encode ValidatorStatus in some non-obvious way.
Given the current number of beacon nodes implementations I would prefer to offload data transformations to downstream tooling, and limit the beacon node's duty to expose all data necessary for such tools.
Yes that's an option, and one that I have played with (the size of state is broadly equivalent to the size of the validators response), but for the general-purpose consumer they won't know to do this. Unless we deprecate the /valdiators
endpoint entirely and point people to the /state
endpoint, but if we were to do that we'd probably want to move the /state
endpoint out of the debug
namespace.
Agree that we'd need to define the validator status, but figure that would be part of the endpoint definition.
Given the current number of beacon nodes implementations I would prefer to offload data transformations to downstream tooling, and limit the beacon node's duty to expose all data necessary for such tools.
This was certainly the initial intent (the /validators
endpoint was a bit contentious because it didn't expose a simple spec object) but seems to have been diluted over time with the addition of various other endpoints such as rewards. I don't have a problem with us still using that as a guiding principle for new endpoints, though.
A way I look at the debug / not distinction is:
So in that sense I think the /validators endpoint should be moved to debug since it can trigger a similar amount of load than the debug states. Also dumping the entire validators array falls into the "debugger / explorer / researcher" category. I'm not sure when that endpoint is of use for stakers
We have lots of endpoints that aren't required for validating, they sit in the "validator required" group. But we don't define their path based on that criterion.
I'm pretty sure that most endpoints could create a lot of load on beacon nodes if desired, including many that are required for validator operation. As such, I don't think that a rework of the hierarchy based on "potential load" would be a useful way of going. Also worth pointing out that the /validators
endpoint can take a list of validators for which to provide information, so it isn't always only used for dumping all validators but can be a subset of them.
I'm pretty sure that most endpoints could create a lot of load on beacon nodes if desired, including many that are required for validator operation. As such, I don't think that a rework of the hierarchy based on "potential load" would be a useful way of going
That's my main consideration regarding this issue. But I understand your points regarding usefulness. I don't want to speak for other implementations so I would like to know their take.
I think it wouldn't be too hard to support, we'd just need a canonical encoding for the validator status? Maybe UTF-8 in a List[Uint8; 32]
List[Byte; 32]
?
This reminds me of https://github.com/ethereum/consensus-specs/pull/2983 which I still intend to get back to at some point - basically, that PR is held up on deciding whether a ParticipationFlag
is a number / integer (it is not per its semantics, but the beacon api unfortunately defines it as such in its json encoding which is the sole exception we have so far).
Regarding List[Uint8; 32]
- there's risk for similar ambiguity here that should also be resolved: the beacon api assigns more or less meaning to byte vs uint8 (number) depending on whether you consider alias
to mean "introduces a new distinct type and encoding" vs "introduces a new name for the same type and encoding" - my interpretation is that alias
means what the dictionary says, ie new name only, but there is disagreement here. Long story short, uint8
would mean that the validator status is a number which is not quite right (you can't add two statuses): List[byte...]
on the other hand would mean "byte" which should correspond to a hex encoding on the json side.
+1 for byte, my Rust-brain sees them as identical, but I agree we should avoid any implication that it's a number
So for validator state we can define as follows:
Value | State |
---|---|
0x00 | Unknown |
0x01 | Pending initialized |
0x02 | Pending queued |
0x03 | Active ongoing |
0x04 | Active exiting |
0x05 | Active slashed |
0x06 | Exited unslashed |
0x07 | Exited slashed |
0x08 | Withdrawal possible |
0x09 | Withdrawal done |
Is this suitable?
So for validator state we can define as follows:
Value State 0x00 Unknown 0x01 Pending initialized 0x02 Pending queued 0x03 Active ongoing 0x04 Active exiting 0x05 Active slashed 0x06 Exited unslashed 0x07 Exited slashed 0x08 Withdrawal possible 0x09 Withdrawal done Is this suitable?
as long as we don't want to leave gaps or use byte type filters...
If we wanted to we could use bytes to our advantage...
I haven't given this a ton of thought but something like
0x0?
is not yet active
0x1?
is active in some capacity
0x2?
is exited onwards...
Thoughts? would allow checking 0x1?
as anything active, which is attractive but not sure if thats too old school...
Yes, I did consider this and it does have some advantages but I ended up discarding it. One possibility:
Bit | Meaning |
---|---|
0 | Is slashed |
1 | Has non-zero balance |
2 | Is pending |
3 | Is active |
4 | Is exiting |
5 | Is exited |
6 | Is withdrawable |
which I think would cover all the situations that we need, but still feels like it's a list of statuses. An alternative:
Bit | Meaning |
---|---|
0 | Is slashed |
1 | Has non-zero balance |
2 | Activation eligibility epoch <= request epoch |
3 | Activation epoch <= request epoch |
4 | Exit epoch <= request epoch |
5 | Withdrawable epoch <= request epoch |
but then perhaps we're going too far down the road of just replicating what we already have in the Validator
structure.
Ultimately, I think that with the SSZ implementation we shouldn't be attempting to redefine the status value, just provide a suitable representation. So I'm inclined to stick with the simple list rather than the (admittedly more fun) bit twiddling approach.
Ok fair, and we don't want gaps for unthought of statuses?
Ok fair, and we don't want gaps for unthought of statuses?
I think we can just add to the list if required, they aren't in strict timeline order anyway given the absence/presence of the slashed flag in the validator itself.
The other thing i guess is the validators list is already actually defined as an ssz object, and we'd be doing something else... so it cant just be read in as a list of validators... We'd likely want a new endpoint to give this new object type back....
The existing endpoint i'd be happy to entertain sending back SSZ, but not if the object definition is inconsistent, just my 2c
Well the JSON is not a spec object, but still a well-defined output (as are various other entities within the API). I don't think that we're doing anything structurally different between returning JSON and SSZ, just defining the SSZ encoding for one particular enum.
We should be able to return an SSZ equivalent for any JSON, and that shouldn't cause us to have to bump or change the endpoint (unless we really mucked up the JSON definition).
The validator
attribute is defined in SSZ, the rest is just context harvested from state i guess...
https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#validator
Ok as long as the validator object is consistent, i'm relatively happy...
Yeah it's the additional items on top (index/balance/state) that make it a custom object. This was somewhat contentious at the time, as it was the only output in the first cut of the API that was an "augmented" spec object rather than an actual spec object, and we could argue looking back now that perhaps we shouldn't have added these fields and just returned the spec struct, but that ship has sailed.
But yes, we'll definitely be expecting the validator
struct to be encoded as the spec object.
We should be able to return an SSZ equivalent for any JSON, and that shouldn't cause us to have to bump or change the endpoint (unless we really mucked up the JSON definition).
The only reasonable way to reach this point is to create a canonical JSON mapping for SSZ and limit the amount of JSON-specific types we have floating around - ie avoiding JSON:isms that are hard to represent in SSZ.
avoiding JSON:isms that are hard to represent in SSZ.
We already have one of those here, in the form of the validator status. It isn't hard to represent per se, it just requires a definition hence the work above.
Are there any objections to me creating a PR based on the information in this issue?
Are there any objections to me creating a PR based on the information in this issue?
It's a good idea, we can decide whether it should be required or optional in that PR...
The
/eth/v1/beacon/states/<state_id>/validators
endpoint returns a lot of data. At the time of writing, this endpoint returns around 370MB of JSON. This requires a fair amount of CPU and memory resources on both the server, which has to convert from its internal data structures to JSON, and the client, which has to do the opposite.SSZ is already used in a number of other areas where there is large amounts of data, such as the beacon state, and would reduce the burden on both server and client. The SSZ representation of the validators at time of writing is around 112MB, less than 1/3 of the size. Additionally, the CPU and memory required for encoding and decoding should be reduced due to the similarity of the native and SSZ formats.
The reason that this is an issue rather than a PR is that it would require work from client teams, most notably they would need to support that beacon node API validator object, which contains the spec validator object plus some additional fields. As such, I would like to hear from client teams as to if this is something that they would or would not be interested in supporting.