spacemeshos / pm

Project management. Meta-tasks related to research, dev, and specs for the Spacemesh protocol and infrastructure.
http://spacemesh.io/
Creative Commons Zero v1.0 Universal

Phased PoET servers #262

Open lrettig opened 7 months ago

lrettig commented 7 months ago

Rather than having all PoET servers run in an identical phase, as has been discussed many times, it would be nice to have many PoETs running with different phases so that, e.g., missing a PoET registration window wouldn't be so dire: you could just register with the subsequent PoET. This will also require changes in node logic, since the PoET registration/proof retrieval logic is currently pretty simplistic.

Related: #215, #257
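
A rough sketch of why different phases make a missed registration window less costly. It assumes each poet starts a round every `epoch`, shifted by its own `phase`, and that a smesher who misses one round start simply waits for the next one on any poet. The function names, the 14-day epoch, and the 7-day phase shift are illustrative assumptions, not taken from the node or poet code.

```go
package main

import (
	"fmt"
	"time"
)

const (
	hour = time.Hour
	day  = 24 * time.Hour
)

// nextRoundStart returns the first round start at or after now for a poet
// whose rounds begin at phase, phase+epoch, phase+2*epoch, ...
// All values are offsets from a common reference point (hypothetical model).
func nextRoundStart(epoch, phase, now time.Duration) time.Duration {
	if now <= phase {
		return phase
	}
	rounds := int64((now - phase + epoch - 1) / epoch) // ceiling division
	return phase + time.Duration(rounds)*epoch
}

// shortestWait returns how long a smesher arriving at now has to wait for
// the nearest round start across all given phases.
// It returns -1 if no phases are given.
func shortestWait(epoch, now time.Duration, phases ...time.Duration) time.Duration {
	best := time.Duration(-1)
	for _, ph := range phases {
		wait := nextRoundStart(epoch, ph, now) - now
		if best < 0 || wait < best {
			best = wait
		}
	}
	return best
}

func main() {
	epoch := 14 * day // assumed epoch length
	now := 1 * hour   // the smesher just missed the round that started at offset 0

	fmt.Println("single phase:", shortestWait(epoch, now, 0))
	fmt.Println("two phases:  ", shortestWait(epoch, now, 0, 7*day))
}
```

With a single phase the smesher waits almost a full epoch; adding a second poet shifted by half an epoch roughly halves the worst case in this model.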

dshulyak commented 7 months ago

it was raised that phased poets are likely a waste of time at this point. for the most part i agree, as originally they were meant to distribute atx gossip over the epoch, and so far this hasn't been an issue.

however, they are important if we want to extend the effective time for post generation. for example, with early exits a smesher can spend 36h on post proving, and then join an earlier poet by exiting early from a late poet.

poszu commented 7 months ago

> they are important if we want to extend effective time for post generation. for example with https://github.com/spacemeshos/pm/issues/268 smesher can spend 36h for post proving, and then join earlier poet by exiting early from late poet.

What would be the logic to pick a poet exit (early/normal)? I think that the reasonable and safe choice would be to always pick the early exit and move toward the earlier phase to have more time in case of failures or POST taking longer than expected etc.

With that in mind, I have a different idea to solve this problem in an easier way that requires few changes and doesn't require early exits. The idea is to have many poets with increasing phases and cycle gaps. They would all end at precisely the same moment and start one after another (see the image below). Smeshers whose POST generation takes longer would register with a later poet (for example, a smesher doing POST for 30h registers with poet C in the picture).

:bulb: Implementing this: https://github.com/spacemeshos/poet/issues/351 would make the poet selection logic in node super simple (there would be no logic required - just submit to all).

We could even have one (could be slower) poet with a big phase shift to make network entry smoother (currently unlucky miners need to wait up to 13.5 days).

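[image: diagram of poets with increasing phases and cycle gaps, starting one after another and ending at the same moment]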
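
A minimal sketch of the selection rule described above, assuming that a smesher can register with a poet only if its POST finishes within that poet's cycle gap, and that a smaller cycle gap means a longer round (so more ticks). The `Poet` type, `pickPoet`, and the 12h/24h/36h gaps are hypothetical and chosen only so that a 30h POST maps to poet C, as in the example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

const hour = time.Hour

// Poet is a hypothetical description of one poet in the staggered scheme.
// All poets end their rounds at the same moment; a larger CycleGap means
// the next round starts later.
type Poet struct {
	Name     string
	CycleGap time.Duration // time between the shared round end and this poet's next round start
}

// pickPoet returns the poet with the smallest cycle gap that still leaves
// enough time to finish POST. It returns false when POST takes longer than
// every available cycle gap.
func pickPoet(poets []Poet, postDuration time.Duration) (Poet, bool) {
	sorted := append([]Poet(nil), poets...) // copy so the caller's slice is not reordered
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].CycleGap < sorted[j].CycleGap })
	for _, p := range sorted {
		if postDuration <= p.CycleGap {
			return p, true
		}
	}
	return Poet{}, false
}

func main() {
	// Assumed cycle gaps, for illustration only.
	poets := []Poet{
		{"A", 12 * hour},
		{"B", 24 * hour},
		{"C", 36 * hour},
	}
	if p, ok := pickPoet(poets, 30*hour); ok {
		fmt.Println("register with poet", p.Name)
	}
}
```

If the "submit to all" behaviour from poet issue 351 were available, this selection step would presumably not be needed in the node at all, as the comment suggests.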

dshulyak commented 7 months ago

> What would be the logic to pick a poet exit (early/normal)? I think that the reasonable and safe choice would be to always pick the early exit and move toward the earlier phase to have more time in case of failures or POST taking longer than expected etc.

i had the same idea. a smesher picks a cluster of poets with cycle gaps one after the other, and then tries to stick to the earliest poet. in case of slippage, it exits early from the later poet and loses some ticks. if we limit the purpose of phased poets to solving only this specific issue, then your solution is sufficient and simpler.

originally phased poets were also meant to be used for spreading atx load over the whole epoch. it is not clear yet if that's a significant concern, and maybe we won't need phased poets for that particular problem. when we have data that indicates bottlenecks in that area, we can also consider alternative solutions, such as more restricted queues.

poszu commented 7 months ago

> originally phased poets were also meant to be used for spreading atxs load over the whole epoch.

This would probably require some incentives for the smeshers to naturally drift into "the less occupied phases" and would conflict with the idea of having a Post Service working for particular IDs at particular periods for efficient usage (unless users would hard-pick (ID; phase) pairs, but IDK how that would help to spread ATXes... ).

Should we proceed with this approach for now?

dshulyak commented 7 months ago

> This would probably require some incentives for the smeshers to naturally drift into "the less occupied phases"

there was an incentive plan as well. if we went that route, my preferred approach would be to pick the phase randomly and make it work for the end user. sure, some will want to micro-optimize to gain a few more ticks once, but most won't care. and if my naive plan doesn't work, then we can add an incentive.
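
A small simulation of the "pick the phase randomly" idea, as a sketch only. It assumes every smesher independently picks one of a fixed number of phases with equal probability; `spreadLoad`, `newRNG`, and the numbers used are hypothetical and only show how evenly the ATX load would then fall across phases.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG returns a seeded generator so that runs are reproducible.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// spreadLoad assigns each smesher to one of the phases uniformly at random
// and returns how many smeshers landed in each phase.
// phases must be greater than zero.
func spreadLoad(rng *rand.Rand, smeshers, phases int) []int {
	counts := make([]int, phases)
	for i := 0; i < smeshers; i++ {
		counts[rng.Intn(phases)]++
	}
	return counts
}

func main() {
	// 10000 smeshers and 4 phases are assumed values for illustration.
	counts := spreadLoad(newRNG(1), 10000, 4)
	fmt.Println(counts) // each phase should receive roughly a quarter of the smeshers
}
```

Under this model no incentive is needed for an even spread; an incentive would only matter if many smeshers override the random choice.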

> Should we proceed with this approach for now?

your approach sounds good to me