Open bhartnett opened 1 month ago
cc @KolbyML @kdeme
Gossip's main usecase is gossiping to other nodes on the network. Normally this is done via bridges. I am not sure if this is an endpoint high level users would actual use.
Because gossip is normal used by bridges which don't want to store that content I am not sure if I see the utility of that.
Also validation would be limited via this API call as for example if we are gossiping block bodies we can't validate canonicalness without a network look on the bridge side, which I don't think we should do that as bandwidth is very limited.
Anyways I am very open to what others think. I think portal_GetContent
really makes sense for users, but I don't see how that is the same with gossip
Gossip's main usecase is gossiping to other nodes on the network. Normally this is done via bridges. I am not sure if this is an endpoint high level users would actual use.
Because gossip is normal used by bridges which don't want to store that content I am not sure if I see the utility of that.
Also validation would be limited via this API call as for example if we are gossiping block bodies we can't validate canonicalness without a network look on the bridge side, which I don't think we should do that as bandwidth is very limited.
Anyways I am very open to what others think. I think
portal_GetContent
really makes sense for users, but I don't see how that is the same with gossip
Yes, it is used by bridges and in that case the bridge is usually connected to a local portal client node which is also participating in the network so it makes sense to store the content locally if it falls within the local nodes radius so that the content can be shared with other nodes on request. If disabling local storage is a requirement then clients can set their storage capacity to zero to prevent the local storage.
For the validation, I agree we shouldn't do network calls so we can't validate canonicalness but other local validations should still be done.
The purpose for the rename was to bring some symmetry and a counterpart to the portal_GetContent
but not necessarily that it would be used by end users.
Because gossip is normal used by bridges which don't want to store that content I am not sure if I see the utility of that.
In my opinion, this is not good argument (of course, storing it locally would only happen if it falls within radius):
With this in mind, I don't see why we shouldn't add this to the spec.
Regarding validation, I don't have strong opinion. It can be either fully validated (as if content came from other node on the network), it can be validated as much as portal_*Store
requests, or it can be configurable with a flag.
@bhartnett
Anyways I will probably get used to the name portal_*PutContent
over time.
I think we should be more specific what we mean by validation in the RPC's description. With the examples given I understand where you guys are coming from for the storage example now. Also we wouldn't want to set a bridges radius to 0 or it would be easily detectable which is bad. So storing it first isn't a bad thing thinking more about it.
Other then that the PR looks good
Regarding validation, I don't have strong opinion. It can be either fully validated (as if content came from other node on the network)
That would add a lot of overhead in terms of my block body
or block receipt
example. Also if a node offers you not provable data maybe that is something peerscore should handle. It could be dangerous though if an dos attack was to occur on the network. So I think a flag or no network-call are the only 2 feasible options. Maybe the flag approach could prevent naive users from doing weird stuff if it is enabled by default.
That would add a lot of overhead in terms of my
block body
orblock receipt
example. Also if a node offers you not provable data maybe that is something peerscore should handle. It could be dangerous though if an dos attack was to occur on the network. So I think a flag or no network-call are the only 2 feasible options. Maybe the flag approach could prevent naive users from doing weird stuff if it is enabled by default.
@KolbyML I agree, full validation would slow down the bridge significantly and if we did end up deciding to do full validation then yes it should be configurable and perhaps enabled by default to prevent surprises.
Having said that, I think this PR shouldn't try to cover categorizing the types of validation because that needs to be done for all the endpoints and it probably should be handled as a separate PR. It would be good to have the types/categories of validation clearly defined.
Regarding the validation wording, sure I'm happy to update to something more clear. My suggestion is that for this PR, the validation should be the same type as done in portal_*GetContent.
Regarding the validation wording, sure I'm happy to update to something more clear. My suggestion is that for this PR, the validation should be the same type as done in portal_*GetContent.
maybe for
portal_*GetContent
we can call it "Full Validation"portal_*PutContent
we can call it "Local Validation"in the descriptions for the respective endpoints
I think that is a simple enough scheme which gets the point across
Regarding the validation wording, sure I'm happy to update to something more clear. My suggestion is that for this PR, the validation should be the same type as done in portal_*GetContent.
maybe for
portal_*GetContent
we can call it "Full Validation"portal_*PutContent
we can call it "Local Validation"in the descriptions for the respective endpoints
I think that is a simple enough scheme which gets the point across
Yes something like this could work well enough for now I guess but one issue I see is that 'full validation' should probably include checking if the data is canonical which would include a network call. For example if you look up a trie node from an uncle block which shouldn't exist in the portal network a malicious node could return it to you if you don't check it in some way. This is very much an edge case and I'm not suggesting we actually need to do this kind of validation but my point is that 'full validation' should probably include canonical checks which we shouldn't do in this case. So probably a different type of validation is needed here.
In my last update I'm describing the validation in portal_*GetContent
as 'validation for correctness' and the validation in portal_*PutContent
as 'validate that the content is well formed'. Hopefully this is ok for now.
In my last update I'm describing the validation in
portal_*GetContent
as 'validation for correctness' and the validation inportal_*PutContent
as 'validate that the content is well formed'. Hopefully this is ok for now.
Both examples sound like the same thing but worded differently. I don't think it clear enough. If we are specifying validation here, we should specify it. If we want to properly write the description for validation for portal_*PutContent
in a future PR, which should add the word validation to the portal_*PutContent
in that follow up PR.
Yes something like this could work well enough for now I guess but one issue I see is that 'full validation' should probably include checking if the data is canonical which would include a network call. For example if you look up a trie node from an uncle block which shouldn't exist in the portal network a malicious node could return it to you if you don't check it in some way. This is very much an edge case and I'm not suggesting we actually need to do this kind of validation but my point is that 'full validation' should probably include canonical checks which we shouldn't do in this case. So probably a different type of validation is needed here.
I think Full Validation
would mean the fullest validation possible. portal_*GetContent
is unlikely to be used on the State network other than maybe the wallet use case. Where I assume eth_* calls would be used to reduce code duplication and complexity.
If the validation part isn't relevant to this PR we shouldn't add it to the description and properly do it in a follow up PR. If the validation description is an important part and wants to be kept in this PR people should be able to understand what it means.
In my last update I'm describing the validation in
portal_*GetContent
as 'validation for correctness' and the validation inportal_*PutContent
as 'validate that the content is well formed'. Hopefully this is ok for now.Both examples sound like the same thing but worded differently. I don't think it clear enough. If we are specifying validation here, we should specify it. If we want to properly write the description for validation for
portal_*PutContent
in a future PR, which should add the word validation to theportal_*PutContent
in that follow up PR.Yes something like this could work well enough for now I guess but one issue I see is that 'full validation' should probably include checking if the data is canonical which would include a network call. For example if you look up a trie node from an uncle block which shouldn't exist in the portal network a malicious node could return it to you if you don't check it in some way. This is very much an edge case and I'm not suggesting we actually need to do this kind of validation but my point is that 'full validation' should probably include canonical checks which we shouldn't do in this case. So probably a different type of validation is needed here.
I think
Full Validation
would mean the fullest validation possible.portal_*GetContent
is unlikely to be used on the State network other than maybe the wallet use case. Where I assume eth_* calls would be used to reduce code duplication and complexity.If the validation part isn't relevant to this PR we shouldn't add it to the description and properly do it in a follow up PR. If the validation description is an important part and wants to be kept in this PR people should be able to understand what it means.
I understand where you are coming from and agree with your points above in general but I would argue the the terms 'Full validation' and 'Local validation' are also not completely clear until we define these terms/categories in another location in the spec. I agree that should be done in a separate PR.
In that case, I'm happy to remove the references to validation if that is preferred. Note that portal_*GetContent
already mentions validation and I didn't introduce that in this PR (previous PR) so one could argue that mentioning validation in portal_*PutContent
at the same level of detail does no harm as we know it will require some form of validation.
In that case, I'm happy to remove the references to validation if that is preferred. Note that
portal_*GetContent
already mentions validation and I didn't introduce that in this PR (previous PR) so one could argue that mentioning validation inportal_*PutContent
at the same level of detail does no harm as we know it will require some form of validation.
I think the distinguishing factor here is now validation
means 2 different things for 2 different endpoints. If its meaning was consistent I would be more lenient with it. I was lenient with it in the portal_*GetContent
call because at the time it was the only usage of the word "validation" in our json-rpc schema.
Well, of course, it wouldn't be too hard for us to figure out what that means; I think the specs should be written explicitly so even an outsider could understand what is implied.
In short I think using the same word to define 2 different expectations is misleading and does harm to people trying to use our specs.
If this is a high level call I don't think we should say the call does validation as that isn't really relevant to the person making the call it is more of an implementation detail in that case I think it would make more sense to define it https://github.com/ethereum/portal-network-specs/pull/345 here. And instead try to focus on higher level description in our JSON-RPC schema
then leave implementation details like how validation is handle in the Specification.md
per method.
If that is the case though in our json-rpc schema
all of the endpoints would contain the word validation
in their description to the point it is implicit and since validation
would mean a different thing per endpoint I don't think any endpoint should specify it in their description as it would be happening for every JSON-RPC method, but validation would mean different things for most of them or their would be groups of endpoints which do the same form of validation but you get the point.
In that case, I'm happy to remove the references to validation if that is preferred. Note that
portal_*GetContent
already mentions validation and I didn't introduce that in this PR (previous PR) so one could argue that mentioning validation inportal_*PutContent
at the same level of detail does no harm as we know it will require some form of validation.I think the distinguishing factor here is now
validation
means 2 different things for 2 different endpoints. If its meaning was consistent I would be more lenient with it. I was lenient with it in theportal_*GetContent
call because at the time it was the only usage of the word "validation" in our json-rpc schema.Well, of course, it wouldn't be too hard for us to figure out what that means; I think the specs should be written explicitly so even an outsider could understand what is implied.
In short I think using the same word to define 2 different expectations is misleading and does harm to people trying to use our specs.
If this is a high level call I don't think we should say the call does validation as that isn't really relevant to the person making the call it is more of an implementation detail in that case I think it would make more sense to define it #345 here. And instead try to focus on higher level description in our
JSON-RPC schema
then leave implementation details like how validation is handle in theSpecification.md
per method.If that is the case though in our
json-rpc schema
all of the endpoints would contain the wordvalidation
in their description to the point it is implicit and sincevalidation
would mean a different thing per endpoint I don't think any endpoint should specify it in their description as it would be happening for every JSON-RPC method, but validation would mean different things for most of them or their would be groups of endpoints which do the same form of validation but you get the point.
This sounds reasonable to me. I've just pushed another change which removes the references to validation from the method descriptions for both rpc methods portal_*GetContent
and portal_*PutContent
.
I suggest that we rename the
portal_*Gossip
JSON-RPC endpoint toportal_*PutContent
and also update the method definition to include storing the content in the local database after performing some basic validations.Changes in this PR:
portal_*Gossip
JSON-RPC endpoint toportal_*PutContent
This is a similar update to https://github.com/ethereum/portal-network-specs/pull/344. My reasoning for this change is that I think we should have a high level endpoint counterpart to
portal_*GetContent
with similar naming and implementation storing in the local database when it makes sense to do so.