`target_hash` RPC Level validation

wilwade commented 5 months ago

As someone transacting using stateful storage, I want to have my transaction stopped at the node level if I submit something with a bad target_hash so that I don't waste capacity on a transaction that will fail.

Scenario

User has two providers, A and B
User is actively using both A and B
User makes a change on A and a change on B that both effect the same stateful store
Provider A makes the change
Provider B uses the cached value from before the change by A
The changes are in a race condition
Provider B would like to safely use the cached value without fear of an invalid hash using capacity

Checklist

[ ] Make sure the pallet benchmarks are correct
[ ] Make sure the overhead benchmarks are correct

Questions

Should this validation run on pre_dispatch as well as validate?
- A: We want to do both
- If pre_dispatch and validate, we should remove the check in the extrinsic
- If just validate, we cannot remove the extrinsic check, but also no weights etc...
Should other checks also be moved into the signed extrinsics?
- Just the one check for the hash

Draft PR: https://github.com/LibertyDSNP/frequency/pull/1288

JoeCap08055 commented 5 months ago

I think one question to answer is: is it possible to read only the first N bytes of a page? I hate to have to read the entire page in the signed extension when we only care about the page hash in the first few bytes of the page storage for each page.

saraswatpuneet commented 5 months ago

Few things we can check if possible

Hash , if hash being sent is stale it would be good to know before submitting the tx and take appropriate actions to recover
MSAid/delegation ( not important as we check in services side of user is delegated or not but nice to have )
Check for correctness of provided SchemaId

enddynayn commented 5 months ago

Please remind me how someone might gat a bad_hash. Also, Can you please elaborate on why is this a problem?

saraswatpuneet commented 5 months ago

Should we provide an RPC to validate a hash ? That way we can check for correctness of hash at the time of computing it via graphsdk

Advantage: we can proactively make sure only valid hashes are being sent to chain than wait for chain to tell us it's bad ?

aramikm commented 5 months ago

Our RPC returns current hash once a page is retrieved. Not sure if we need an additional RPC.

saraswatpuneet commented 5 months ago

Is there a way to know with the above info if a given hash is stale or not ?

aramikm commented 5 months ago

I think one question to answer is: is it possible to read only the first N bytes of a page? I hate to have to read the entire page in the signed extension when we only care about the page hash in the first few bytes of the page storage for each page.

Not possible. Accessing the node is the heavier side of the operation compared to deserialization to the type.

Is there a way to know with the above info if a given hash is stale or not ?

RPC only returns the data already stored on blockchain (so it might not be finalized but we have this problem in everywhere and also it does not consider anything in the transaction pool). So from RPC standpoint it only return the valid hash at the exact time asked but then it might not be accurate for T + 1 but we can't do much about it.

My recommendation is that we do not change the extrinsic implementations and only add following checks in ~pre_dispatch~( as discussed pre_dispatch is would not help with eliminating the transaction before propagation) validate.

Check hash staleness
Check Schema validity

wilwade commented 5 months ago

My recommendation is that we do not change the extrinsic implementations and only add following checks in pre_dispatch to provide fail fast feature without adding any overhead on the whole network.
* Check hash staleness
* Check Schema validity

I like this. Simple and provides the number one thing we are missing: RPC safety.

enddynayn commented 5 months ago

only add following checks in pre_dispatch

@wilwade

It is not good practice to not call your validate function in pre-dispatch. Your pre-dispatch must call your validation function. Additionally, we would need to run the overhead benchmarks for this. This implies that all our transactions include this additional weight.

@aramikm If you intend to add it to pre-dispatch will this be included in the block. Thus, not much advantage over keeping it in the extrinsic. However, you might be saying to add it to both validate and pre-dispatch.

I think these should be uniquely booted at the validation level. We have an example in one of our extrinsic. Overall, it is unclear to me how much of a problem this is. If it does become one we might need to revisit or storage solution.

JoeCap08055 commented 5 months ago

My recommendation is that we do not change the extrinsic implementations and only add following checks in pre_dispatch to provide fail fast feature without adding any overhead on the whole network.

Check hash staleness

Check Schema validity

I disagree with "Check schema validity". We have plenty of extrinsics that check schema validity; what's the argument for checking only this one in pre_dispatch? Content hash is a special case unique to stateful storage, and the potential for a rejected call is outside the control of the caller. A bad schema ID is entirely the fault of the caller and within their control to prevent; no need to add a schema and/or delegation read to the pre_dispatch as well.

wilwade commented 5 months ago

It is not good practice to not call your validate function in pre-dispatch. Your pre-dispatch must call your validation function. Additionally, we would need to run the overhead benchmarks for this. This implies that all our transactions include this additional weight.

@enddynayn I agree that this is uncommon, but I think this is a great example of a place to do it. We have a validation in the extrinsic that is (from my understanding) expensive to extract to pre_dispatch. Assuming that is true, this is a great use case of just blocking honest RPCs from causing unneeded bad transactions to be gossiped.

The danger would be if someone thinks it is enough to just do the validate. However, in this case the setup would be clear that this is just a helper for honest actors.

That said @aramikm, if the worry is that we have to add a database read, I don't think this is so. The caching layer would cache the first read in validate and then it would have a cache hit on the second read in the extrinsic. Or perhaps I am missing a piece?

enddynayn commented 5 months ago

@wilwade I thought we wanted to prevent this from getting in the block and using capacity. I do not see why we would only add this to pre_dispatch. I guess I am unclear about the problem we are trying to solve.

@wilwade I also don't think this is going to prevent nodes from gossiping about these type of transactions.

Can you please update the description of what problems we are trying to solve?

wilwade commented 4 months ago

I believe all the questions have been resolved and the details are now in the issue description. If this is incorrect, please post the list of open questions.

JoeCap08055 commented 4 months ago

Questions

Should this validation run on pre_dispatch as well as validate?

A: We want to do both

If pre_dispatch and validate, we should remove the check in the extrinsic

I don't think it would cost anything to keep the target_hash check in the extrinsic as well, as long as we're reading the storage anyway. At that point it only costs a string comparison.

frequency-chain / frequency