ethereum / beacon-APIs

Collection of RESTful APIs provided by Ethereum Beacon nodes
https://ethereum.github.io/beacon-APIs/
Creative Commons Zero v1.0 Universal
335 stars 169 forks source link

Validator api - spec 0.9.2 #12

Closed mpetrunic closed 4 years ago

mpetrunic commented 4 years ago

I would propose new validator API as current API is causing troubles in this aggregation strategy. Problems are:

I would propose following:

Other apis will remain the same (attestation type needs change).

After we have agreed api, I'll open PR on repo.

mkalinin commented 4 years ago

I would suggest following schema:

We could keep an old one as well if it would exist:

mpetrunic commented 4 years ago

GET /eth/validator/[wire_]attestations/{epoch}?committee_index=idx

How about GET `/eth/validator/wire_attestations/{epoch}/{commiteeIndex} ?

We could keep an old one as well if it would exist:

  • GET /eth/validator/duties/{epoch} But in current schema epoch parameter goes in the query instead of a path.

Is there a reason you would like to keep that one? Is it suppose to subscribe to committee attestation topics?

Seems like it's way too early to keep backward compatibility :smile:

mkalinin commented 4 years ago

Seems like it's way too early to keep backward compatibility 😄

Totally agreed! Hence, no need to keep this old fashioned duties request.

How about GET /eth/validator/wire_attestations/{epoch}/{commiteeIndex} ?

I think it's better to move them both into a query string:

mpetrunic commented 4 years ago

@mkalinin about that wire attestations, what was your intention with epoch param?

Wouldn't it make more sense to replace epoch with attestationDataRoot(hashTreeRoot(attestation.data)?

mkalinin commented 4 years ago

The intention was not to mix attestations produced by committees with same index in a different epochs. Validators of the same committee may produce attestations with different data, e.g. different view to a fork, and aggregator have to produce an aggregate for each of them.

djrtwo commented 4 years ago

The attestationsubnets have a validation condition that only unaggregated attestations are sent. Similarly, when doing aggregation by the v-client, I would expect `/eth/validator/[wire]attestations/{epoch}?committee_index=idx` to strictly only return valid unaggregated attesations (that were disseminated on the subet). Might be worth specifying that these are expected to be fully unaggregated (1 participant per).

djrtwo commented 4 years ago

otherwise, these look like good/valid additions to the api

paulhauner commented 4 years ago

GET /eth/validator/duties/attester/{epoch}?publicKeys=public1, public2 - returns [{publicKey, slot, committeeIndex}] for given publicKeys and subscribes to topics related to returned commiteeIndices

We've had problems here where we quickly run out of space in the URL when there are many pubkeys (definitely fails when there's 100 keys). To resolve this, we made a POST method that accepts the keys in the body of the request.


GET /eth/validator/duties/attester/{epoch}?publicKeys=public1, public2 - returns [{publicKey, slot, committeeIndex}] for given publicKeys and subscribes to topics related to returned commiteeIndices

I'm a little concerned a mixing the concerns of "I want to know duties" with "I want to aggregate attestations". For example, I suspect that block explorers and other UIs might want to read this data without causing an affect in the node. Additionally, this is a GET request that causes an action on the server which is the basis of a CSRF vulnerability.

I'm also unsure of how the beacon node knows if it needs to subscribe to the topic, because it doesn't have the slot_signature and can't run is_aggregator. Perhaps I'm missing something?

I would suggest the following endpoint:

This would cause the beacon node to start listening on the aggregation pubsub topics:


GET /eth/validator/attestations/{commiteeIndex} - returns all attestations collected from that committee index topic

I would have thought we can keep more logic inside the beacon node (and out of the validator client) with the following endpoint (it's basically your /eth/validator/aggregate endpoint):

This way the beacon node can perform the aggregation itself and then publish the proof.

Note: the beacon node should already know the selection_proof if it has been configured to aggregate on that topic (via /eth/validator/slot_signature). If it hasn't been configured to aggregate on this topic then this request should probably fail?

mpetrunic commented 4 years ago

We've had problems here where we quickly run out of space in the URL when there are many pubkeys (definitely fails when there's 100 keys). To resolve this, we made a POST method that accepts the keys in the body of the request.

I was thinking about putting some sane limit here like 10 pubkeys, I don't think arbitrary number of pubkeys should be allowed (allows you to circumvent api rate limiting etc)

I'm a little concerned a mixing the concerns of "I want to know duties" with "I want to aggregate attestations". For example, I suspect that block explorers and other UIs might want to read this data without causing an affect in the node. Additionally, this is a GET request that causes an action on the server which is the basis of a CSRF vulnerability.

I'm also unsure of how the beacon node knows if it needs to subscribe to the topic, because it doesn't have the slot_signature and can't run is_aggregator. Perhaps I'm missing something?

That's a good point, I agree.

I would suggest the following endpoint:

  • POST /eth/validator/slot_signature: accepts slot_signature and aggregator_index.

This would cause the beacon node to start listening on the aggregation pubsub topics:

You would need slot, to verify signature?

Maybe something like: POST /eth/validator/subscribe/{commiteeIndex} // to be more clearer about expected action Body:

{
   "slot": 12 //you are suppose to call this epoch earlier to allow Bn to find peers, and with slot , BN can decide not to store attestations before that slot
   "aggregator_pubkey": "0x123..." //as we identify validator by publickey in api
   "slot_signature": "0x123..." // so you can discard request from "non-validators
}

I would have thought we can keep more logic inside the beacon node (and out of the validator client) with the following endpoint (it's basically your /eth/validator/aggregate endpoint):

  • POST /eth/validator/broadcast_aggregate: accepts {aggregator_index, attestation_data_root}

This way the beacon node can perform the aggregation itself and then publish the proof.

Note: the beacon node should already know the selection_proof if it has been configured to aggregate on that topic (via /eth/validator/slot_signature). If it hasn't been configured to aggregate on this topic then this request should probably fail?

That was my initial idea, but under assumptions that validator can subscribe to more than one BN, validator won't be able to aggregate all attestations from all BNs unless first BN aggregated like first N/2 and second BN aggregated last N/2 attestations.

@paulhauner ^^

mkalinin commented 4 years ago

We've had problems here where we quickly run out of space in the URL when there are many pubkeys (definitely fails when there's 100 keys). To resolve this, we made a POST method that accepts the keys in the body of the request.

A good explanation on URL length limit is given on Stackoverflow: https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers

Browsers does not support URLs longer than ~2000 characters in their address bar. While servers are not restricted in supporting URLs longer than that, this parameter should be configurable. Bypassing this limitation by using POST request instead of GET will make you to set Access-Control-Allow-Origin: * on a server side if you want this response accepted by browsers for some reason. Sending this header in the response in its turn pulls off CSRF protection. A rule of thumb is to use GET method for read-only requests and POST method for modifications.

For example, I suspect that block explorers and other UIs might want to read this data without causing an affect in the node. Additionally, this is a GET request that causes an action on the server which is the basis of a CSRF vulnerability.

Agreed. GET requests should not modify a state of the node. We need a series of requests to manage aggregation if we want beacon node to be responsible for the most part of that job (which sounds reasonable), something like:

Passing epoch in the subscription gives node a notion of epoch to collect attestations in as it could be the case when node is subscribed to committees in a span of epochs.

IMO, subscription should be automatically expired by the node at the end of the slot related to the committee with given index and epoch. And all resources should be released on the expiration procedure.

In more sophisticated setups when a VC is connected to several BNs, how this aggregate command will look like? I assume that these BNs have strong connections with each other and quickly shares all the bits of information they have with their neighbors, thus there will be no difference from VC point of view which node is used as a source of an aggregate. However, VC will have to send subscription requests to all of them.

paulhauner commented 4 years ago

I don't think arbitrary number of pubkeys should be allowed (allows you to circumvent api rate limiting etc)

Seems reasonable, but I feel this should be done on a client-by-client basis. I'd be tempted to wait until it's clear that too many pubkeys in one request is a problem for public APIs before we start applying restrictions.


Bypassing this limitation by using POST request instead of GET will make you to set Access-Control-Allow-Origin: * on a server side if you want this response accepted by browsers for some reason.

This is a good point @mkalinin. Perhaps we should keep the GET request around for usability and I can just implement a POST request to suit our needs whilst we're running massively centralized VCs in testnets.


Passing epoch in the subscription gives node a notion of epoch to collect attestations in as it could be the case when node is subscribed to committees in a span of epochs.

Agreed, I think @mpetrunic's suggestion to add the slot would also cover this.


IMO, subscription should be automatically expired by the node at the end of the slot related to the committee with given index and epoch. And all resources should be released on the expiration procedure.

I totally agree.


POST /eth/validator/beacon_committee_subscription with {epoch: 0, committee_index: 0} payload POST /eth/validator/aggregate with {epoch: 0, committee_index: 0} payload and Attestation message (an aggregate) in the response POST /eth/validator/aggregate_and_proof with AggregateAndProof as a payload

It seems to me that if we include slot_signature in beacon_committee_subscription there's no need for the /eth/validator/aggregate_and_proof endpoint and therefore we remove one round-trip.


In more sophisticated setups when a VC is connected to several BNs, how this aggregate command will look like? I assume that these BNs have strong connections with each other and quickly shares all the bits of information they have with their neighbors, thus there will be no difference from VC point of view which node is used as a source of an aggregate. However, VC will have to send subscription requests to all of them.

My thoughts were the the VC should only really try to elect one BN as the aggregator and try to use that BN as the publisher.

If it does happen to trigger a couple of BNs (perhaps the first one goes offline for a little bit), then it's just a little more traffic on the pubsub network with no significant cost.

mpetrunic commented 4 years ago

I would still opt to allow VC to collect attestations from multiple BN and publish aggregate on whichever node he wan't. It would allow us to implement more robust and performant VC.

We could have optional endpoint for lightweight VC, where BN would aggregate and publish on behalf of validator.

mkalinin commented 4 years ago

It seems to me that if we include slot_signature in beacon_committee_subscription there's no need for the /eth/validator/aggregate_and_proof endpoint and therefore we remove one round-trip.

Agree, this would work if a beacon node is responsible for aggregation and further propagation of computed aggregate.

I would still opt to allow VC to collect attestations from multiple BN and publish aggregate on whichever node he wan't. It would allow us to implement more robust and performant VC.

Don't you think it's an overhead for VC instance?

mpetrunic commented 4 years ago

I would still opt to allow VC to collect attestations from multiple BN and publish aggregate on whichever node he wan't. It would allow us to implement more robust and performant VC.

Don't you think it's an overhead for VC instance?

That's why I'm proposing two endpoints, one for more robust and resistant VC and other for people who would like to run VC on RPI or whatever. There is also a issue of evil BN because you are giving it signature so it can publish invalid aggregation without your knowledge.

There won't be thousands of attestations to aggregate and you aren't doing it often so I don't think this would be issue even for weak VC.

paulhauner commented 4 years ago

I would still opt to allow VC to collect attestations from multiple BN and publish aggregate on whichever node he wan't. It would allow us to implement more robust and performant VC.

Presently we do not store un-aggregated attestations at all; we aggregate and then drop the original. In order to implement this API we'd need to make data-structure changes for the sole purpose of supporting this API that we do not have an immediate need for.

I think this scenario is valid but too perhaps too advanced to be placed in the API at this stage. I think we should be targeting "all clients have implemented a minimal API" instead of "all clients have implemented different parts of a wide-reaching API".

If several clients do intend to implement the multi-BN aggregation scheme immediately then I'm happy to be wrong here :)