helium / sibyl

Apache License 2.0

Remove latency between reactivation check and validator heartbeat #57

Open BigEnigma opened 2 years ago

BigEnigma commented 2 years ago

Can the 360-block reactivation scan of gateways (and probably the check made on subscription to the POC stream) use the current max age (hip17_interactivity_blocks) minus the heartbeat interval (validator_liveness_interval), perhaps with a little extra as a safety margin? That would let a reactivation be picked up pre-emptively and distributed in the heartbeat before the gateway actually becomes inactive.
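To make the proposal concrete, here is a minimal sketch of the pre-emptive threshold. It is not code from sibyl (which is not written in Python); the function names are hypothetical and the numbers are illustrative only. In particular the max age of 3600 blocks and the margin of 10 blocks are assumptions, and the heartbeat interval of 100 blocks is taken from the "up to 100 blocks" figure used in this thread.

```python
def preemptive_threshold(max_age_blocks, heartbeat_interval_blocks, safety_margin_blocks=0):
    """Age (in blocks) at which a gateway should be flagged for reactivation.

    max_age_blocks stands in for hip17_interactivity_blocks and
    heartbeat_interval_blocks for validator_liveness_interval.
    """
    threshold = max_age_blocks - heartbeat_interval_blocks - safety_margin_blocks
    # Never return a negative age, even with odd parameter choices.
    return max(0, threshold)


def needs_reactivation(last_activity_height, current_height, threshold_blocks):
    """True once the gateway's age has reached the pre-emptive threshold."""
    age = current_height - last_activity_height
    return age >= threshold_blocks


# Illustrative values only (assumed, not read from the chain).
threshold = preemptive_threshold(3600, 100, safety_margin_blocks=10)
flag_early = needs_reactivation(last_activity_height=1000, current_height=4490,
                                threshold_blocks=threshold)
flag_too_soon = needs_reactivation(last_activity_height=1000, current_height=4400,
                                   threshold_blocks=threshold)
```

With these numbers the gateway is flagged at an age of 3490 blocks, which leaves a full heartbeat interval plus the margin before it would reach the max age.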

Under the current scheme the scan runs roughly every 6 hours, and a gateway that has reached the max age is marked for reactivation at that point. The mark is only picked up by the next heartbeat transaction, though, so none of the other validators will reactivate the gateway for up to 100 blocks after it has already reached the max age. During that interval every other validator handling a POC proposal will skip the gateway until the heartbeat comes around and reactivates it. I don't think even the validator setting the reactivation will apply it until it reads back its own transaction (I assume a validator always absorbs its own txn, but I couldn't be sure; otherwise it would never actually reactivate the gateways it picked for reactivation).

BigEnigma commented 2 years ago

Actually, for the on-connection case (POC subscription) the threshold could be the max age minus the heartbeat interval, because the reactivation would still get out in time. For the 6-hour scan it should be the max age minus the combined heartbeat interval and scan interval; otherwise a gateway could expire right after a scan, wait an additional 360 blocks to be detected as needing reactivation, and possibly wait another 100 blocks for the heartbeat to inform everyone.
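The arithmetic behind the two thresholds can be checked with a small worst-case model. This is a sketch under stated assumptions, not sibyl's logic: it assumes a flag is raised at most one detection delay after the gateway's age crosses the threshold, and reaches the other validators at most one heartbeat interval after that. The function name is hypothetical, the max age of 3600 blocks is an assumed illustrative value, and the 360 and 100 block intervals are the figures quoted in this thread.

```python
def worst_case_inactive_blocks(threshold, max_age, detection_delay, heartbeat_interval):
    """Upper bound on how many blocks a still-connected gateway stays inactive.

    threshold:          age at which the gateway becomes eligible for flagging
    detection_delay:    longest wait before a check notices it (scan interval,
                        or 0 for the on-subscription check)
    heartbeat_interval: longest wait before the flag is distributed
    """
    latest_reactivation_age = threshold + detection_delay + heartbeat_interval
    return max(0, latest_reactivation_age - max_age)


MAX_AGE = 3600   # assumed value for hip17_interactivity_blocks
SCAN = 360       # scan interval in blocks, as quoted in the thread
HEARTBEAT = 100  # heartbeat interval in blocks, as quoted in the thread

# Current scheme: flag only once the max age has been reached.
current = worst_case_inactive_blocks(MAX_AGE, MAX_AGE, SCAN, HEARTBEAT)

# Scan that subtracts only the heartbeat interval: still exposed to the scan delay.
scan_heartbeat_only = worst_case_inactive_blocks(MAX_AGE - HEARTBEAT, MAX_AGE, SCAN, HEARTBEAT)

# Scan that subtracts heartbeat interval plus scan interval.
scan_combined = worst_case_inactive_blocks(MAX_AGE - HEARTBEAT - SCAN, MAX_AGE, SCAN, HEARTBEAT)

# On-subscription check: no scan delay, so subtracting the heartbeat interval suffices.
on_subscription = worst_case_inactive_blocks(MAX_AGE - HEARTBEAT, MAX_AGE, 0, HEARTBEAT)
```

Under this model the current scheme can leave a gateway inactive for up to 460 blocks, a scan that subtracts only the heartbeat interval still leaves up to 360, and the two proposed thresholds bring the worst case down to 0. The result does not depend on the assumed max age, only on the two intervals.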

BigEnigma commented 2 years ago

It seems the balance here is between when a hotspot ought to have broadcast based on the PoC rate and the chance that it has not done so and therefore needs to be kept alive, BUT without doing that in a way which leaves it inactive when we can tell it is still connected, AND without overloading the network by continually having to heartbeat that reactivation information.

All of that is then balanced against how to know if and when a hotspot has really gone offline, in which case we don't want to be including it.

The bigger issue is still the latency between detecting that a hotspot is inactive and reactivating it, compared with all the other elements, if we consider hotspots going offline to be a far less prevalent event. And once they are inactive they are out of the way; for this, only reducing the overall inactive period would help.