rphmeier opened this issue 2 years ago
This issue has been mentioned on Polkadot Forum. There might be relevant details there:
https://forum.polkadot.network/t/parachain-scaling-by-parablock-splitting/341/5
Summary of our discussion yesterday:

Claims will include a `CollatorId` and then only allow that collator to connect - good DoS protection for parathreads. There are two possibilities:

Option 1 has a longer delay between claiming and actually using the slot, but is potentially easier to implement.

Option 2 is actually quite intriguing at first sight: It would behave rather similar to how normal transactions are processed: You send it to a validator, who puts it in a mempool and then picks what to validate, based on price. But:

- DoS protection becomes harder, no simple checking of collator id would be possible, except if we registered `CollatorId`s together with the PVF.
- Also the collator protocol would need to change significantly and would differ for parathreads and parachains, while with option 1, changes to the collator protocol will be fairly minimal.

Hence from this high level perspective it seems that variant 1 will be way simpler to implement.
For option 1 above, we would provide some transaction the relay chain accepts to claim a slot. We should also provide an interface for submitting said transaction from Cumulus. Then, based on that, we will need to cooperate with the SDK node team on implementing actual strategies for using it, e.g. claim once per day, claim once the mempool is filled up to xx%, etc. - something ready to use for parathread developers.
For option 2 above, the actual pricing would also need to be determined on the Cumulus side, e.g.: no rush in authoring a block, start with a low fee, and ramp it up slowly on each validator group rotation until the block gets in. By monitoring the relay chain, the collator might also get an idea of the current pricing.
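To make the Cumulus-side strategy idea a bit more concrete, here is a minimal Rust sketch of what such a pluggable ordering policy could look like. All names and numbers (`OrderStrategy`, `should_order`, the pool-fill threshold) are invented for illustration; they are not an existing Cumulus API.

```rust
/// Illustrative order-placement policy for a parathread collator (invented
/// names, not an existing Cumulus API): place an order once the transaction
/// pool is sufficiently full or too much time has passed, and cap the price
/// we are willing to pay.
struct OrderStrategy {
    /// e.g. 0.8 = order once the tx pool is 80% full.
    pool_fill_threshold: f32,
    /// Order at least this often, even with an empty pool.
    max_blocks_without_order: u32,
    /// Never bid more than this.
    max_price: u128,
}

impl OrderStrategy {
    fn should_order(&self, pool_fill: f32, blocks_since_last_order: u32) -> bool {
        pool_fill >= self.pool_fill_threshold
            || blocks_since_last_order >= self.max_blocks_without_order
    }

    /// Bid the current spot price as long as it is below our cap; a real
    /// implementation would track pricing by monitoring the relay chain.
    fn bid(&self, current_spot_price: u128) -> Option<u128> {
        (current_spot_price <= self.max_price).then_some(current_spot_price)
    }
}

fn main() {
    let strategy = OrderStrategy {
        pool_fill_threshold: 0.8,
        max_blocks_without_order: 14_400, // roughly once per day at 6s relay chain blocks
        max_price: 1_000_000_000,
    };
    assert!(strategy.should_order(0.85, 10)); // pool is full enough
    assert!(strategy.should_order(0.10, 14_400)); // a day has passed
    assert_eq!(strategy.bid(500), Some(500));
    assert_eq!(strategy.bid(2_000_000_000), None); // too expensive right now
}
```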
- For registering the needed PVF, a deposit has to be provided. The opportunity costs pay for storage and messaging.
We already require a deposit or you want to increase the deposit?
Option 1 has a longer delay between claiming and actually using the slot, but is potentially easier to implement.
From what you have written, I also like this solution. We could integrate some sort of "min block on when to start bidding for a slot". After winning a slot, your slot would be X blocks in the future. I don't think that we need such a tight time between bidding for a slot and then actually building on this slot. If you have some application that needs to hit a very special slot, you could either bid a lot of dots or you could just run a parachain and then be fully in control of your block production.
We already require a deposit or you want to increase the deposit?
No, just listing requirements.
For option 1, yep, but no matter how much you bid, you will have to wait at least a slot's worth of time. We are not really concerned about this at this point; it is just a matter of fact and something to consider when picking options. The favorite so far is clearly option 1. A parathread should be fine to wait a bit for its slot.
Some more thoughts on asynchronous backing:
As already mentioned by @rphmeier, we will need a runtime API that allows us to see into the future which parathreads will be scheduled on a core in the next blocks ahead. This is because parathreads should also be backable asynchronously, which means they need to be able to provide a collation ahead of time.
This implies that the collator needs to know in advance that a core for its parathread is upcoming and the validators need to know, so they will be willing to accept the collation.
Flow will be something like this:
- Parathread `A` sees that it is scheduled on some core in X blocks ahead, where X is within the constraints of the maximum allowed depth. It produces a collation with the currently available relay parent.
- Parathread `B` sees that it is scheduled on the same core in Y blocks ahead (Y constrained equally as X above), produces a collation.
- ... possibly more parathreads.
- Now we have both those parathread candidates available in prospective parachains.
- The next relay chain block is produced. The block producer puts statements from prospective parachains on chain for `A`, as currently only the oldest scheduled assignment is considered an accepted assignment. -> Probably makes sense to unify this with the lookahead; nothing is really special about the "current" core assignment.

Something to understand here, in general with asynchronous backing but here again relevant: there is the relay parent as specified by a candidate, and there is the relay chain block whose child (the next block) will have the candidate backed on chain. Those are different in asynchronous backing, which means:
(2) explained in more detail: when a block producer wants to build a new relay chain block on top of some block `Z`, then `Z` is the block that is relevant for determining which statements of which parathread/parachain to put into the new block. The relay parent of the candidates is irrelevant here.
What makes this even more intertwined is that the backing group to connect to is based on the core. So we are connecting to a particular backing group because, as of our relay parent, our ParaId is currently assigned to that core, even though it is possible that by the time our candidate actually gets backed on chain on that very core, some other group might already be assigned to it. The reason is simple: we cannot reliably know when the candidate will actually get backed, so we have to use the core assignment based on the relay parent, as that is fixed.
So the correspondence CoreId -> ParaId and CoreId -> Backing Group is evaluated in the context of different relay chain blocks for a single candidate.
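To illustrate the point with made-up data (none of the types or values below are actual runtime or node code): the `CoreIndex -> GroupIndex` lookup happens in the context of the candidate's relay parent, while the `CoreIndex -> ParaId` lookup happens in the context of the block the author is building on, which may already see a different group on that core.

```rust
use std::collections::HashMap;

type Hash = &'static str;
type CoreIndex = u32;
type GroupIndex = u32;
type ParaId = u32;

// Toy per-block view of the relay chain state (illustrative only).
struct BlockState {
    group_for_core: HashMap<CoreIndex, GroupIndex>,
    para_for_core: HashMap<CoreIndex, ParaId>,
}

fn main() {
    let mut states: HashMap<Hash, BlockState> = HashMap::new();
    // As of the candidate's relay parent, group 7 backs core 0.
    states.insert(
        "relay_parent",
        BlockState {
            group_for_core: HashMap::from([(0, 7)]),
            para_for_core: HashMap::from([(0, 42)]),
        },
    );
    // By the time the candidate can land on chain, the rotation has moved on.
    states.insert(
        "block_built_on",
        BlockState {
            group_for_core: HashMap::from([(0, 8)]),
            para_for_core: HashMap::from([(0, 42)]),
        },
    );

    // CoreIndex -> GroupIndex: evaluated at the relay parent, since that is
    // the only block the collator can rely on. So it connects to group 7 ...
    let group = states["relay_parent"].group_for_core[&0];
    // ... while CoreIndex -> ParaId is evaluated by the block author in the
    // context of the block it is building on.
    let para = states["block_built_on"].para_for_core[&0];
    println!("connect to group {group}, author accepts statements for para {para}");
}
```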
This lookahead is mandatory for parathreads, but should be a general method even for parachains. This allows for greater flexibility for parachains as well and given that the core->paraid assignment is per block, it would also be more correct.
The lookahead should likely be used in the runtime itself as well: basically any parathread in the current lookahead view will be accepted (not only one). This would allow for better utilization: if a validator wants to produce a block, but it has not yet received enough statements for parathread `A`, it can just put statements for another parathread `B` in. So we would basically have a lookahead queue, where we prefer candidates at the top, but fall back to later candidates if available. If we fall back, we could leave the skipped candidate in the queue, but keep a record that it was skipped. If it was skipped/not provided `n` times*) we remove it and just charge some fee for the failed attempt (see below).
So in a nutshell, instead of having a `ParaId` assigned to a core as of a given relay chain state, we have a queue of `ParaId`s and any of them is acceptable, but validators should prefer "older" entries. This way we don't waste block space just because a parathread was a bit slow in providing a candidate. For a parachain, that lookahead queue would be all the same `ParaId` - so nothing changes for parachains.
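A minimal sketch of that fallback selection, assuming plain integers for `ParaId` and a simple set of "backable" paras (illustrative only, not actual node-side types):

```rust
use std::collections::HashSet;

type ParaId = u32;

/// Prefer the oldest entry in the lookahead queue, but fall back to later
/// entries so the core is not wasted if the front para was slow.
fn pick_para_to_back(lookahead_queue: &[ParaId], backable: &HashSet<ParaId>) -> Option<ParaId> {
    lookahead_queue.iter().copied().find(|para| backable.contains(para))
}

fn main() {
    let queue = [1, 2, 3]; // paras in scheduling order
    // No statements for para 1 yet, but enough for 2 and 3.
    let backable: HashSet<ParaId> = [2, 3].into_iter().collect();
    // We skip para 1 (and would record that it was skipped) and back para 2.
    assert_eq!(pick_para_to_back(&queue, &backable), Some(2));
}
```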
Having validators be allowed to fall back to other candidates would allow them to pick favorites. This could be disincentivized by rewarding candidates provided from the top of the queue more than ones further down. I don't see an actual security threat, as eventually an honest block producer will pick up the candidate (at least after some retrying). Also, without fallbacks, a block producer can always decide to just not put statements in, so from a censorship perspective nothing changes.
With the fallback mechanism, the success rate of parathread backing should increase and block space is utilized in any case, but at least eventually we have to remove the `ParaId` from the queue.
Reasons for a candidate not making it in:
Both backing group and block producers are punished by losing out on rewards. We should also charge the parathread a fee for a missed scheduling attempt, likely less than for a successful one, but enough to disincentivize on purpose spamming.
This is not really "fair" as always two parties get punished for the misbehavior of a third, but with a low enough "punishment", this is just bad luck and will even itself out over time.
In any case if a parathread block does not make it through in time, the parathread can just issue another transaction and try again, just as for the first attempt. It will likely get assigned a different backing group and different block producers.
*) We could also just punt on the parathread: you missed it - out, try again. In the collator protocol we should be able to prioritize fetching for candidates that are earlier in the queue, so indeed that complication is likely not necessary and we should just remove a `ParaId` from the queue if no candidate was provided.
Yes, as you point out, groups can be "assigned" to multiple cores per relay-parent once exotic scheduling is live. We will need to handle this in a few places, but with the right runtime API the code in the collator-protocol and statement-distribution should Just Work.
Ordering of candidates in Asynchronous Backing is an open problem. There are other issues where the relay-chain block author can ignore a long prospective chain and include an alternative short chain in the relay-chain block, leading to wasted work. The cost is non-zero as disregarding work means doing more work in the future and potentially missing out on era points.
Most of the changes described here, with respect to the choice of which candidates to second, would live in the backing subsystem.
Both backing group and block producers are punished by losing out on rewards. We should also charge the parathread a fee for a missed scheduling attempt, likely less than for a successful one, but enough to disincentivize on purpose spamming.
This is not really "fair" as always two parties get punished for the misbehavior of a third, but with a low enough "punishment", this is just bad luck and will even itself out over time.
We might approach this from the other angle: the user has paid for blockspace whether or not they utilize it, so we could simply give validators rewards when no candidate is backed as well as when a candidate is backed, although much smaller ones. I think this is equivalent, but it is also reasonable to provide a deposit which is refunded if a block actually gets backed (although, critically, not gated on the parablock becoming available)
Yes, as you point out, groups can be "assigned" to multiple cores per relay-parent once exotic scheduling is live. We will need to handle this in a few places, but with the right runtime API the code in the collator-protocol and statement-distribution should Just Work.
Just read it again and realized I don't really understand this paragraph. Minor thing: I think you meant async backing and not exotic scheduling. And the other thing: groups can be assigned to multiple cores at a single point in time, but still only one per relay parent - it is just that more relay parents are considered/valid at any given point in time.
Re-iterating for me and others, with an ASCII graphic. We will have a claim queue per core with `ParaId` entries:
Some claim queue as of relay chain block B1:
| Para1 | Para2 | Para3 |
Collators providing a collation with relay parent B1, will be allowed to do so for Para1, Para2 and Para3: The assigned backing group is expected to accept collations for all those based on B1. It will also have other relay chain blocks in its implicit view, e.g. the ancestor of B1, B0. Assuming that B0 is also within the rotation boundaries, the backing group will also accept collations for the claim queue of B0, if those collations have B0 as their relay parent.
When authoring a block, backing any para in the claim queue will be accepted, but earlier entries are preferred. Assuming Para1 gets backed and included, the scheduler will update the claim queue: Para1 gets dropped and some Para4 gets pushed onto the back. New claim queue:
| Para2 | Para3 | Para4 |
The size of the claim queue should be configured in relation to the maximum depth of a candidate's relay parent. E.g. if we only allow a depth of 2 (maximum parent of current leaf), a claim queue size larger than 2 makes little sense, even if we assume 6s block times, as the candidate would no longer be valid by the time it could get backed on chain.
The para ids in a claim queue don't necessarily have to differ, e.g.:
| Para1 | Para1 | Para1 |
would also be a perfectly valid claim queue, in fact for parachains they will look exactly like this. Result is normal asynchronous backing behaviour: A single parachain can prepare multiple candidates ahead of time.
Claim queues of adjacent relay chain blocks will normally have an overlap. Especially if backing groups accept collations for older (up to max depth) relay parents, they should keep track of the most current claim queue and in general should consider information about already received candidates and chain state - e.g. candidates that are pending availability to avoid wasting work on candidates that cannot possibly make it.
For considering chain state on backed candidates, the claim queue will likely also have a state for each queued item like "scheduled", "occupied", ... similar to the availability cores we have now. The old availability cores mechanism is basically a claim queue of size 1.
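As a toy illustration of the acceptance rule described above (assumed shapes only; string block hashes and integer `ParaId`s are stand-ins, not actual node code): a collation is acceptable if its relay parent is still in the validator's implicit view and its para sits in that relay parent's claim queue for the assigned core.

```rust
use std::collections::{HashMap, VecDeque};

type ParaId = u32;
type Hash = &'static str;

/// A collation is acceptable if its relay parent is in our implicit view and
/// the para sits in that relay parent's claim queue for our assigned core.
fn collation_acceptable(
    // relay parent -> claim queue of our core as of that block
    claim_queues: &HashMap<Hash, VecDeque<ParaId>>,
    relay_parent: Hash,
    para: ParaId,
) -> bool {
    claim_queues
        .get(relay_parent)
        .map_or(false, |queue| queue.contains(&para))
}

fn main() {
    let mut view = HashMap::new();
    // Claim queue as of B1: | Para1 | Para2 | Para3 |
    view.insert("B1", VecDeque::from([1, 2, 3]));
    // Ancestor B0 is still in the implicit view with its (older) claim queue.
    view.insert("B0", VecDeque::from([9, 1, 2]));

    assert!(collation_acceptable(&view, "B1", 3)); // Para3 with relay parent B1
    assert!(collation_acceptable(&view, "B0", 9)); // Para9 with relay parent B0
    assert!(!collation_acceptable(&view, "B1", 4)); // Para4 not (yet) claimed at B1
}
```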
The size of the claim queue should be configured in relation to the maximum depth of a candidate's relay parent. E.g. if we only allow a depth of 2 (maximum parent of current leaf), a claim queue size larger than 2 makes little sense, even if we assume 6s block times, as the candidate would no longer be valid by the time it could get backed on chain.
Even if we had a maximum depth of 1 (like we currently have), the claim queue should be bigger IMO. I could imagine collators may use this time before building to fetch relevant transactions or already prepare certain computations. There is no hard connection between the relay parent and the transactions a parachain can include. They could maybe optimize by first applying all the transactions that don't need any information about the relay chain and then, at the "last second", push the relay chain block (they are building on) to the runtime, followed by the transactions that require this information.
That maybe justifies another state: "Upcoming". Those collations would not yet be accepted, but the collator could start preparation work.
Maybe I misunderstood you, I thought that the "claim queue" is the order in which a core is given out to certain collator/parathread combinations based on them winning the slots in this queue?
More or less. It is the upcoming core assignments (ParaIds) for a core. It is relevant to collators, so they know when they are supposed to produce a block and relevant to validators, so they know what collations to accept. Indeed the claim queue for the parathread cores is the result of successful orders.
`Upcoming` would then be a special state that is just a heads-up for collators that they are coming up, but validators would not yet accept such collations. But honestly, I don't think we will really need this with async backing, as then the claim queue will be larger than 1, hence you know what is upcoming anyway.
`Upcoming` would then be a special state that is just a heads-up for collators that they are coming up
Can they not just find this out by inspecting at which point in the queue they are? Why do we require some special state for this?
Minor thing, I think you meant async backing and not exotic scheduling and the other: groups can be assigned to multiple cores at a single point in time, but still only one per relay parent - it is just that more relay parents are considered/valid at any given point in time.
Looking back over it, I think you are right, although, I also think I was alluding to the fact that we can have the same group assigned to multiple parachains at the same time, even on the same core.
Collators providing a collation with relay parent B1, will be allowed to do so for Para1, Para2 and Para3: The assigned backing group is expected to accept collations for all those based on B1. It will also have other relay chain blocks in its implicit view, e.g. the ancestor of B1, B0. Assuming that B0 is also within the rotation boundaries, the backing group will also accept collations for the claim queue of B0, if those collations have B0 as their relay parent.
Do you mean that backing validators at B1 should accept candidates for all parathreads in the claims queue for their core with relay-parent B1? I'll assume so going forward, as that's my best reading of the text here.
The size of the claim queue should be configured in relation to the maximum depth of a candidate's relay parent. E.g. if we only allow a depth of 2 (maximum parent of current leaf), a claim queue size larger than 2 makes little sense, even if we assume 6s block times, as the candidate would no longer be valid by the time it could get backed on chain.
I think this is accurate w.r.t. the part of the queue that could be considered by backing groups, but I don't understand why we would constrain the size of the entire claim queue by this, as opposed to just constraining the size of the prefix of the claim queue that backing validators should consider at any point.
Q: why shouldn't the queue have 100 items, if backing validators only have to deal with max-depth at most? A: if parathread claims commit to specific candidate hashes, this is a hard requirement, as the candidate hash commits to the relay-parent. Otherwise, there is no technical reason.
Q: why should parathread claims commit to specific candidate hashes? A: ?
The para ids in a claim queue don't necessarily have to differ, e.g.:
| Para1 | Para1 | Para1 |
would also be a perfectly valid claim queue, in fact for parachains they will look exactly like this. Result is normal asynchronous backing behaviour: A single parachain can prepare multiple candidates ahead of time.
Is the plan to refactor parachains to use claim queues like this as well? or does "exactly like this" mean something different?
The para ids in a claim queue don't necessarily have to differ,
Agreed, however, we should clarify that a para ID may only exist within a single claim queue at a time. At least until we have sequence numbers in candidate receipts, which will then let us store `(ParaId, SequenceNumber)` in claim queues.
Do you mean that backing validators at B1 should accept candidates for all parathreads in the claims queue for their core with relay-parent B1? I'll assume so going forward, as that's my best reading of the text here.
Yes, exactly.
I think this is accurate w.r.t. the part of the queue that could be considered by backing groups, but I don't understand why we would constrain the size of the entire claim queue by this, as opposed to just constraining the size of the prefix of the claim queue that backing validators should consider at any point.
This is kind of equivalent to the suggested third state "Upcoming". It is possible to expose more via either mechanism and we can if there is a need. I am not sure there is, though. On the flip side, the claim queue is a contract on what is coming up and it is not supposed to change, apart from revealing more entries at the end and popping the front on timeout/availability. The longer the exposed queue (with this guarantee), the less flexibility, e.g. for pushing back a `ParaId` when it timed out on availability.
We could relax the contract for entries after the prefix, saying "we think they will come up in this order, but this might change" ... but that seems a bit moot. On the other hand, the queue size also influences assignment providers, which might not even be able to reveal the next 100 items already (because they are still being discovered, orders are still coming in, ...).
Q: why should parathread claims commit to specific candidate hashes?
They don't, this would severely complicate the implementation, I think - and I don't see why we would want that.
Is the plan to refactor parachains to use claim queues like this as well? or does "exactly like this" mean something different?
Yes. The core abstraction gains a dimension, no longer of dimension 0, but dimension 1 - called claim queue now. My current thinking is that after the assignment provider there is no difference between parathreads and parachains. Validator node side and even collators don't care (apart from the separate part where they send bid extrinsics): They see their assignment coming up and produce a block, they don't care whether this happens all the time (parachain) or only once in a while (parathread).
Can they not just find this out by inspecting at which point in the queue they are? Why do we require some special state for this?
We would like to meaningfully limit the number of collations a validator has to assume valid and needs to accept at a given point in time. If a validator accepted collations that come up later in the queue, it would be wasting resources, as such a collation, if provided, has no chance of making it into a block before it becomes invalid - hence there is no point in accepting it in the first place. If exposed at all, this could be done either via another state which just shows "coming up, get ready" - or, as Rob suggested, by specifying some prefix length, with the same effect.
But again, I don't see why we would need either (once async backing landed).
We seem to have a communication issue. What we agree on is
The miscommunication is that I am only saying that the claim queue exposed via the Runtime API should encompass (3), but that this does not need to limit the general size of the claim queue. That's all; I'm not suggesting that far-in-the-future stuff be exposed to or acted on by validators, just that the claims may be stored with further lookahead on the relay chain. Maybe a moot point.
Full agreement on point 1 and 2, for point 3: How I envision it to work as of now, is the claim queue is managed by the scheduler and includes as many elements per core as we actually need. If the first entry becomes available/candidate gets included, we remove it from the front, pop from the assignment provider to fill it up again (push back onto the claim queue). The assignment provider, in case of parathreads will have its own order queue which it uses to provide those assignments when "popped".
Is this maybe part of the communication issue? We do have another queue - on the side of the parathread assignment provider. We might want to expose that one as well, if that is useful.
We could also have a longer claim queue within the relay chain runtime than what we actually expose to validators for backing; this might be useful or even necessary. But I am not sure this is what you meant.
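A rough sketch of how the two queues could fit together (all names here are invented, not the actual scheduler types): the assignment provider keeps its own order queue, the scheduler pops from it to keep the per-core claim queue filled, and only a bounded prefix of the claim queue would be handed to validators via the runtime API.

```rust
use std::collections::VecDeque;

type ParaId = u32;

/// Sketch of the scheduler-side claim queue for one core (invented shapes,
/// not the real runtime types).
struct CoreClaims {
    claim_queue: VecDeque<ParaId>,
    /// Only this many entries from the front are exposed to validators.
    exposed_len: usize,
}

impl CoreClaims {
    /// Front entry got included (or timed out): drop it and refill from the
    /// assignment provider's own order queue.
    fn advance(&mut self, provider_queue: &mut VecDeque<ParaId>, target_len: usize) {
        self.claim_queue.pop_front();
        while self.claim_queue.len() < target_len {
            match provider_queue.pop_front() {
                Some(para) => self.claim_queue.push_back(para),
                None => break, // no pending orders right now
            }
        }
    }

    /// What a runtime API could hand to backing validators: just the prefix
    /// they are actually allowed to accept collations for.
    fn exposed_prefix(&self) -> Vec<ParaId> {
        self.claim_queue.iter().copied().take(self.exposed_len).collect()
    }
}

fn main() {
    // Pending parathread orders, owned by the assignment provider.
    let mut provider_queue: VecDeque<ParaId> = VecDeque::from([4, 5, 6]);
    let mut core = CoreClaims { claim_queue: VecDeque::from([1, 2, 3]), exposed_len: 2 };

    // Para1's candidate became available -> Para4 moves into the claim queue.
    core.advance(&mut provider_queue, 3);
    assert_eq!(core.claim_queue, VecDeque::from([2, 3, 4]));
    // Validators only get to see (and accept collations for) the first two.
    assert_eq!(core.exposed_prefix(), vec![2, 3]);
}
```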
The assignment provider, in case of parathreads will have its own order queue which it uses to provide those assignments when "popped".
This is what I was missing. This all makes sense to me now. Thanks for elaborating
This issue has been mentioned on Polkadot Forum. There might be relevant details there:
https://forum.polkadot.network/t/on-demand-parachains/2208/1
Wasm Execution Risks: related to https://github.com/paritytech/polkadot-sdk/issues/990 ; if there are underlying issues with PVF execution that we aren't aware of, it'll be far cheaper and easier to exploit them on parathreads than it is on parachains.
This is a separate issue from potential sources of non-determinism and is more of an implementation issue than a design issue, but considering today's security vulnerability in wasmtime, where a really nasty vulnerability was found (which can lead to remote code execution!), I think more security hardening (https://github.com/paritytech/polkadot-sdk/issues/882) should be a hard blocker before we actually enable parathreads on any value-bearing chain. I'd like to have at least a full seccomp jail for the PVF host process.
Option 2 is actually quite intriguing at first sight: It would behave rather similar to how normal transactions are processed: You send it to a validator, who puts it in a mempool and then picks what to validate, based on price. But:
I'd like to better understand the issue here.
DoS protection becomes harder, no simple checking of collator id would be possible, except if we registered CollatorIds together with the PVF.
We've parathread state on the relay chain. It'd contain collator ids, like any other parachain presumably does, no?
Also the collator protocol would need to change significantly and would differ for parathreads and parachains, while with option 1, changes to the collator protocol will be fairly minimal.
It's true this approach opens up the collator protocol, not merely tacks something onto the front end, yes. I guess this is the concern.
Validator groups rotate, hence if your advertisement is not processed within the rotation time, you would need to resend it to the next group. Probably not a big deal though.
Identical issue with other parachains, no?
Actually needed price is not known in advance, more complication for the Cumulus side - see below.
Yes, but we decided we'd have the price known in advance anyways, no?
The "included" fee, would also need to go to the relay chain in some way. Most likely as a separate transaction that needs to be sent, similar to variant 1 or we would adjust backing statements to include it. The separate transaction is risky for the validator and racy. If the transaction is not recorded before the backing statement, the chain would have to treat the backing statements as invalid. So it kind of has to go into the backing statement, which is another protocol change.
Ain't too hard to have prepayments, ala https://github.com/paritytech/cumulus/issues/2154#issuecomment-1687854740
We've parathread state on the relay chain. It'd contain collator ids, like any other parachain presumably does, no?
No, we don't do that.
Identical issue with other parachains, no?
No, it is quite different. In the described model there is no core scheduled for the on-demand chain yet, hence we don't know whether there are resources available right now. Normal parachains, by contrast, have a scheduled core: they can realistically assume their collation will be processed by the backing group they advertised it to.
With the probabilistic scheduling @rphmeier is pivoting to now, things are changing though and it might make sense at some point to re-open decisions made on how we want to do on-demand.
Yes, but we decided we'd have the price known in advance anyways, no?
Not really, no. Price is adjusting all the time, based on demand.
Ain't too hard to have prepayments, ala https://github.com/paritytech/cumulus/issues/2154#issuecomment-1687854740
It would be pretty hard, requiring substantial changes ... but they go in a similar direction to where Rob is moving anyway. The main benefit would be that the parachain itself could order its cores, instead of having to refund collators - right? The downside is more work for the relay chain, as we have to keep stuff around for potentially a very long time; also, Gav was concerned about people buying cores/resources when they are cheap and then hoarding them to use later.
As I said elsewhere, I'm happy with a KISS approach here: We could deploy one simple scheme, but tell those parathreads they should upgrade to being full parachains if they want more flexibility. We could later build a cleanly abstracted scheme if & when we want parathreads to have more flexibility. Avoid too much investment in parathreads before we learn more about their usage, handling, etc.
Based on the original work done by @gavofyork in 2019: #341
Board: https://github.com/orgs/paritytech/projects/67
Feature branch: https://github.com/paritytech/polkadot/pull/6969
Background and Motivation
Polkadot currently only supports parachains, running on leases of between 6 and 24 months. Kusama leases run from 6 to 48 weeks. If a chain has a slot and loses it, it simply stops running. If a project can't afford a slot due to limited supply or intense competition, it simply doesn't get off the ground.
Parathreads are pay-as-you-go parachains: they pay for security on a block-by-block basis as opposed to leasing an entire core for a prolonged period of time. Parathreads only differ from parachains in terms of how they are scheduled, collated, and backed. Availability, approval-checking, and disputes all function the exact same way.
Parathreads provide an on-ramp and off-ramp for projects, as well as the option for chains which produce blocks only occasionally. They provide another class of offering in the market for shared security which will lead to better pricing of security for longer leases as well.
The idea of parathreads is to allocate a number of parachain availability cores specifically to parathreads, and these parathread cores multiplex blocks from a backing queue of 'claims', which represent upcoming parathread blocks. The queue of claims is populated by an auction process which runs in every or almost every relay chain block, allowing collators to bid in the relay chain's native token in order to earn a claim. Claims are processed in order of submission, or close to it. Each claim is associated with the specific collator.
The parachain scheduler should attempt to schedule pending parathread claims onto available parathread cores. Once scheduled, parachain backers should connect to the specific collator mentioned in the claim and get the candidate and PoV from them.
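As a rough, purely illustrative sketch of that pipeline (the exact auction mechanism is an open design question in this issue; `resolve_auction`, the flat spam fee and all numbers are made up): each block, the highest bids become claims bound to the bidding collator, and losing bids still pay a small fee so that spamming bids is not free.

```rust
type ParaId = u32;
type CollatorId = &'static str;
type Balance = u128;

/// A bid submitted by a collator for an upcoming parathread block.
struct Bid {
    para: ParaId,
    collator: CollatorId,
    price: Balance,
}

/// A won claim: only this collator may provide the collation to the backers.
struct Claim {
    para: ParaId,
    collator: CollatorId,
}

/// Per-block auction resolution: the highest bids win the free claim slots,
/// losing bids are charged a small fee so spam is not free.
fn resolve_auction(mut bids: Vec<Bid>, free_slots: usize, spam_fee: Balance) -> (Vec<Claim>, Balance) {
    bids.sort_by(|a, b| b.price.cmp(&a.price));
    let mut charged = 0;
    let mut claims = Vec::new();
    for (i, bid) in bids.into_iter().enumerate() {
        if i < free_slots {
            charged += bid.price;
            claims.push(Claim { para: bid.para, collator: bid.collator });
        } else {
            charged += spam_fee;
        }
    }
    (claims, charged)
}

fn main() {
    let bids = vec![
        Bid { para: 1, collator: "collator-a", price: 100 },
        Bid { para: 2, collator: "collator-b", price: 90 },
        Bid { para: 3, collator: "collator-c", price: 10 },
    ];
    let (claims, charged) = resolve_auction(bids, 2, 1);
    assert_eq!(claims.len(), 2);
    // The claim is bound to the winning collator - only it may connect.
    assert_eq!(claims[0].para, 1);
    assert_eq!(claims[0].collator, "collator-a");
    assert_eq!(charged, 100 + 90 + 1); // the losing bid still paid the spam fee
}
```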
Design Considerations
Spam-resistant Auctions: Bids from collators who don't win auctions should appear on-chain in some form so that useless bids still have some cost associated with them, meaning that spam isn't free.
Stacked Claims: It should be possible for multiple claims from a single parachain to be present, and for some kind of dependency relationship to be expressed between them. This will allow parathreads to have fast blocks during bursty periods if they so desire.
Data Unavailability: Polkadot only keeps data available for 24 hours. This means that it is possible that the data proving a prior block might be lost if the last block was produced more than 24 hours ago. In the past, we've discussed forcing parathreads to author blocks at least once every 24 hours. This probably isn't necessary - parathread clients should just make sure that they're fully syncing within 24 hours of the last block and gathering data from the data availability system if necessary.
Wasm Execution Risks: related to https://github.com/paritytech/polkadot-sdk/issues/990 ; if there are underlying issues with PVF execution that we aren't aware of, it'll be far cheaper and easier to exploit them on parathreads than it is on parachains.
Asynchronous Backing: We should expect parathread blocks to be built 12-30s ahead of time and accordingly for validators to know that the parathread will or might be scheduled 12-30s ahead of time. The claim should probably not commit to the candidate hash, or if it does, claims will have to be retired across sessions.
Leniency and Censorship: When a claim is scheduled, it's possible that it's either not fulfilled because of the collator not producing a block or because the backers haven't managed to back the block in time. Claims should stay scheduled for a few relay chain blocks, and potentially should be scheduled onto cores where other backing groups are assigned.
Runtime
Claims queue: Manages all pending parathread claims.
Auction Mechanism: for pushing onto the claims queue
Runtime API: for informing validators of scheduled or soon-to-be-scheduled parathreads
Node-side
Collator networking: Connecting to the specific collator and/or accepting connections from specific collators
Prospective parachains: Detecting scheduled parathreads as well as parachains
Collator-side and Parachain Side
Auction Bidder: Logic to control a hot wallet to participate in auctions under configurable conditions. We should expect some sidecar process to run alongside the node and take into account information that's not easily generalized: for example, users might create strategies incorporating the relay-chain token price, the parachain token price, unconfirmed transaction rewards, and the block reward mechanism of the parathread.
Parathread Consensus: Logic to detect when a block should be authored based on the state of the claims queue / scheduled cores
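A minimal sketch of that last piece, under the assumption that the collator can read its core's claim queue from a relay chain runtime API (`should_author` and the integer `ParaId` are illustrative, not actual Cumulus code):

```rust
type ParaId = u32;

/// Author a parathread block as soon as our ParaId shows up within the
/// allowed lookahead of our core's claim queue.
fn should_author(our_para: ParaId, claim_queue: &[ParaId], max_lookahead: usize) -> bool {
    claim_queue.iter().take(max_lookahead).any(|para| *para == our_para)
}

fn main() {
    // Claim queue of our core, as read from the relay chain (illustrative).
    let claim_queue = [7, 42, 9];
    // Our claim (42) is second in line and within the lookahead: start
    // building a collation now, anchored at the current relay parent.
    assert!(should_author(42, &claim_queue, 3));
    // With a lookahead of 1 we would wait until our claim reaches the front.
    assert!(!should_author(42, &claim_queue, 1));
}
```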