Support SSZ for validators endpoint

mcdee commented 1 year ago

The /eth/v1/beacon/states/<state_id>/validators endpoint returns a lot of data. At the time of writing, this endpoint returns around 370MB of JSON. This requires a fair amount of CPU and memory resources on both the server, which has to convert from its internal data structures to JSON, and the client, which has to do the opposite.

SSZ is already used in a number of other areas where there is large amounts of data, such as the beacon state, and would reduce the burden on both server and client. The SSZ representation of the validators at time of writing is around 112MB, less than 1/3 of the size. Additionally, the CPU and memory required for encoding and decoding should be reduced due to the similarity of the native and SSZ formats.

The reason that this is an issue rather than a PR is that it would require work from client teams, most notably they would need to support that beacon node API validator object, which contains the spec validator object plus some additional fields. As such, I would like to hear from client teams as to if this is something that they would or would not be interested in supporting.

dapplion commented 1 year ago

Why not fetch the entire state via the debug endpoint as SSZ? Most of its SSZ serialized size is the validators list, and then its simple to produce a ValidatorResponse from the full state. Note that SSZ does not support strings so we would need to encode ValidatorStatus in some non-obvious way.

Given the current number of beacon nodes implementations I would prefer to offload data transformations to downstream tooling, and limit the beacon node's duty to expose all data necessary for such tools.

mcdee commented 1 year ago

Yes that's an option, and one that I have played with (the size of state is broadly equivalent to the size of the validators response), but for the general-purpose consumer they won't know to do this. Unless we deprecate the /valdiators endpoint entirely and point people to the /state endpoint, but if we were to do that we'd probably want to move the /state endpoint out of the debug namespace.

Agree that we'd need to define the validator status, but figure that would be part of the endpoint definition.

Given the current number of beacon nodes implementations I would prefer to offload data transformations to downstream tooling, and limit the beacon node's duty to expose all data necessary for such tools.

This was certainly the initial intent (the /validators endpoint was a bit contentious because it didn't expose a simple spec object) but seems to have been diluted over time with the addition of various other endpoints such as rewards. I don't have a problem with us still using that as a guiding principle for new endpoints, though.

dapplion commented 1 year ago

A way I look at the debug / not distinction is:

non-debug endpoints: can disrupt the node a bit but cannot kill it. These endpoints should be protected but are ok to be enabled by default on all users
debug endpoints: can kill the node if abused. Should not be enabled by default on all users, and only enabled for researchers / explorers / debuggers.

So in that sense I think the /validators endpoint should be moved to debug since it can trigger a similar amount of load than the debug states. Also dumping the entire validators array falls into the "debugger / explorer / researcher" category. I'm not sure when that endpoint is of use for stakers

mcdee commented 1 year ago

We have lots of endpoints that aren't required for validating, they sit in the "validator required" group. But we don't define their path based on that criterion.

I'm pretty sure that most endpoints could create a lot of load on beacon nodes if desired, including many that are required for validator operation. As such, I don't think that a rework of the hierarchy based on "potential load" would be a useful way of going. Also worth pointing out that the /validators endpoint can take a list of validators for which to provide information, so it isn't always only used for dumping all validators but can be a subset of them.

dapplion commented 1 year ago

I'm pretty sure that most endpoints could create a lot of load on beacon nodes if desired, including many that are required for validator operation. As such, I don't think that a rework of the hierarchy based on "potential load" would be a useful way of going

That's my main consideration regarding this issue. But I understand your points regarding usefulness. I don't want to speak for other implementations so I would like to know their take.

michaelsproul commented 1 year ago

I think it wouldn't be too hard to support, we'd just need a canonical encoding for the validator status? Maybe UTF-8 in a ~~List[Uint8; 32]~~ List[Byte; 32]?

arnetheduck commented 1 year ago

This reminds me of https://github.com/ethereum/consensus-specs/pull/2983 which I still intend to get back to at some point - basically, that PR is held up on deciding whether a ParticipationFlag is a number / integer (it is not per its semantics, but the beacon api unfortunately defines it as such in its json encoding which is the sole exception we have so far).

Regarding List[Uint8; 32] - there's risk for similar ambiguity here that should also be resolved: the beacon api assigns more or less meaning to byte vs uint8 (number) depending on whether you consider alias to mean "introduces a new distinct type and encoding" vs "introduces a new name for the same type and encoding" - my interpretation is that alias means what the dictionary says, ie new name only, but there is disagreement here. Long story short, uint8 would mean that the validator status is a number which is not quite right (you can't add two statuses): List[byte...] on the other hand would mean "byte" which should correspond to a hex encoding on the json side.

michaelsproul commented 1 year ago

+1 for byte, my Rust-brain sees them as identical, but I agree we should avoid any implication that it's a number

mcdee commented 1 year ago

So for validator state we can define as follows:

Value	State
0x00	Unknown
0x01	Pending initialized
0x02	Pending queued
0x03	Active ongoing
0x04	Active exiting
0x05	Active slashed
0x06	Exited unslashed
0x07	Exited slashed
0x08	Withdrawal possible
0x09	Withdrawal done

Is this suitable?

rolfyone commented 1 year ago

So for validator state we can define as follows:

Value State 0x00 Unknown 0x01 Pending initialized 0x02 Pending queued 0x03 Active ongoing 0x04 Active exiting 0x05 Active slashed 0x06 Exited unslashed 0x07 Exited slashed 0x08 Withdrawal possible 0x09 Withdrawal done Is this suitable?

as long as we don't want to leave gaps or use byte type filters...

If we wanted to we could use bytes to our advantage...

I haven't given this a ton of thought but something like 0x0? is not yet active 0x1? is active in some capacity 0x2? is exited onwards...

Thoughts? would allow checking 0x1? as anything active, which is attractive but not sure if thats too old school...

mcdee commented 1 year ago

Yes, I did consider this and it does have some advantages but I ended up discarding it. One possibility:

Bit	Meaning
0	Is slashed
1	Has non-zero balance
2	Is pending
3	Is active
4	Is exiting
5	Is exited
6	Is withdrawable

which I think would cover all the situations that we need, but still feels like it's a list of statuses. An alternative:

Bit	Meaning
0	Is slashed
1	Has non-zero balance
2	Activation eligibility epoch <= request epoch
3	Activation epoch <= request epoch
4	Exit epoch <= request epoch
5	Withdrawable epoch <= request epoch

but then perhaps we're going too far down the road of just replicating what we already have in the Validator structure.

Ultimately, I think that with the SSZ implementation we shouldn't be attempting to redefine the status value, just provide a suitable representation. So I'm inclined to stick with the simple list rather than the (admittedly more fun) bit twiddling approach.

rolfyone commented 1 year ago

Ok fair, and we don't want gaps for unthought of statuses?

mcdee commented 1 year ago

Ok fair, and we don't want gaps for unthought of statuses?

I think we can just add to the list if required, they aren't in strict timeline order anyway given the absence/presence of the slashed flag in the validator itself.

rolfyone commented 1 year ago

The other thing i guess is the validators list is already actually defined as an ssz object, and we'd be doing something else... so it cant just be read in as a list of validators... We'd likely want a new endpoint to give this new object type back....

The existing endpoint i'd be happy to entertain sending back SSZ, but not if the object definition is inconsistent, just my 2c

mcdee commented 1 year ago

Well the JSON is not a spec object, but still a well-defined output (as are various other entities within the API). I don't think that we're doing anything structurally different between returning JSON and SSZ, just defining the SSZ encoding for one particular enum.

We should be able to return an SSZ equivalent for any JSON, and that shouldn't cause us to have to bump or change the endpoint (unless we really mucked up the JSON definition).

rolfyone commented 1 year ago

The validator attribute is defined in SSZ, the rest is just context harvested from state i guess...

https://github.com/ethereum/consensus-specs/blob/dev/specs/phase0/beacon-chain.md#validator

Ok as long as the validator object is consistent, i'm relatively happy...

mcdee commented 1 year ago

Yeah it's the additional items on top (index/balance/state) that make it a custom object. This was somewhat contentious at the time, as it was the only output in the first cut of the API that was an "augmented" spec object rather than an actual spec object, and we could argue looking back now that perhaps we shouldn't have added these fields and just returned the spec struct, but that ship has sailed.

But yes, we'll definitely be expecting the validator struct to be encoded as the spec object.

arnetheduck commented 1 year ago

We should be able to return an SSZ equivalent for any JSON, and that shouldn't cause us to have to bump or change the endpoint (unless we really mucked up the JSON definition).

The only reasonable way to reach this point is to create a canonical JSON mapping for SSZ and limit the amount of JSON-specific types we have floating around - ie avoiding JSON:isms that are hard to represent in SSZ.

mcdee commented 1 year ago

avoiding JSON:isms that are hard to represent in SSZ.

We already have one of those here, in the form of the validator status. It isn't hard to represent per se, it just requires a definition hence the work above.

mcdee commented 1 year ago

Are there any objections to me creating a PR based on the information in this issue?

rolfyone commented 1 year ago

Are there any objections to me creating a PR based on the information in this issue?

It's a good idea, we can decide whether it should be required or optional in that PR...

ethereum / beacon-APIs

Support SSZ for validators endpoint #333