Title: PM double spend security
Author: @yondonfu

Brief Description

PM (probabilistic micropayments) is a useful scheme for many-to-many payments with a single funding deposit enabled by universal payment aggregation - as long as the selection process for winning tickets is provably fair, a recipient does not care that a ticket from any individual sender does not pay out because on average it will be paid fairly over the course of many tickets from many senders. Aside from UX concerns, the core challenge to be addressed with PM is defending against double spends which cannot be completely prevented because senders have a single deposit for multiple recipients and payments are offline (meaning that parties do not wait for transactions to confirm on-chain thereby ensuring global consensus on availability of funds).

One approach to deter double spends is to force senders to also have a non-spendable penalty escrow that backs created tickets (in addition to the sender’s funding deposit) as is suggested in [1]. If the sender is caught double spending, its penalty escrow is slashed. As a result, a sender would lack an economic incentive to double spend if the additional utility it can gain is less than the economic loss incurred by having its penalty escrow slashed. However, [1] does not specify how exactly the required penalty escrow should be set.

The problem we are interested in solving is how to set a required penalty escrow for a sender that would deter the sender from double spending. We proceed to discuss the specific requirements for a PM double spend security protocol that involves setting a required penalty escrow below.

Problem Statement

Use Cases

A PM double spend security protocol must fulfill or preserve the following use cases:

A recipient should be able to safely accept payments for transcoding work from an anonymous zero reputation sender with minimal risk
A recipient should be able to accept simultaneous payments from many senders
A sender should be able to simultaneously pay many recipients
A sender should be able to pay any recipient regardless of whether the sender has worked with the recipient in the past before
A sender should be able to stop paying a recipient and quickly transition to paying a new recipient

Requirements

Bounded additional utility from double spending: Since we acknowledge that completely preventing double spending is impossible, the security mechanism needs to be able to bound the sender’s additional utility from double spending. If the sender’s utility from double spending is infinity then there is no cost that can be imposed on the sender that would deter double spending.
Minimal restrictions on sender set for a recipient: Ideally, there is no restriction on the senders that a recipient can accept payments from. If there needs to be a restriction on the sender set, the recipient should be able to modify its sender set with minimal overhead (cost in both time and money).
Minimal restrictions on recipient set for a sender: Ideally, there is no restriction on the recipients that a sender can send payments to. If there needs to be a restriction on the recipient set, the sender should be able to modify its recipient set with minimal overhead (cost in both time and money).
Sub-linear cost of scaling sender payments to additional recipients: Any on-chain transactions required should not scale linearly with the number of desired recipients because otherwise, the sender would be better off setting up a unidirectional payment channel with each recipient (the channel set up cost for each recipient is what we are trying to avoid).
Minimal switching overhead to work with new marketplace entrant: In an extreme scenario, all the current recipients in the marketplace might be faulty or charging high prices. So, if a new marketplace entrant is either performant or charging a lower price that satisfies the sender's requirements, the sender would prefer to send payments to the new entrant. The overhead (cost in both time and money) for switching to the new entrant should be low.

Reasonable Assumptions

Rational actors: Senders and recipients are rational such that they will not take an action where the economic cost > the economic benefit.
Free entry and exit in the marketplace: Both parties are allowed to enter and exit the marketplace as they please. There should be no expectation of a fixed set of senders or a fixed set of recipients. The membership of both sets should be expected to be constantly changing over time.
Blockchain availability: For the purposes of this project, we assume that the underlying blockchain is available and that actors are able to have their transactions confirmed within fixed time windows.

Candidate Constructions

Recipient set commitments and payment rate limiting

Chiesa et. al. have presented an anonymous probabilistic payment construction informed by a formal economic analysis of PM double spending [2]. We can try to adapt their construction and leave out the technical details required for anonymity guarantees.

In the context of the Livepeer network, the sender is a broadcaster and the recipient is an orchestrator.

Orchestrators register themselves on-chain and advertise their serviceURI, rewardCut, feeShare and maxPayRate within the on-chain registry. Of particular relevance to the current discussion is maxPayRate which defines the maximum cumulative value of PM tickets (i.e. the sum of the expected value of tickets) that an orchestrator will accept during a locally defined time period payWindow denominated in blocks. We use the Ethereum blockchain as a universal clock that ticks on average every 15 seconds with each new block. An orchestrator will maintain a local counter to track the value of tickets received which is reset at the beginning of each payWindow. If the counter value is ever going to exceed maxPayRate, the orchestrator stops accepting tickets until the beginning of the next payWindow. Thus, an orchestrator rate limits payments accepted during a fixed time period. The default value for payWindow should be the expected time to detect double spends which might be a few blocks if orchestrators default to redeeming winning tickets immediately. We define the time to double spend detection to start from when a broadcaster’s deposit is insufficient to cover all of its sent out outstanding winning tickets and to end when the last orchestrator to receive one of those winning tickets detects the double spend.

When a broadcaster creates its penalty escrow, it first observes the on-chain registry and fetches the ETH addresses and maxPayRate values of the orchestrators that it would like to work with. The broadcaster can select the entire orchestrator set or it can select a subset to reduce its penalty escrow requirements. This might be preceded by requests to also fetch pricing information from the orchestrators. Then, the broadcaster generates a Merkle sum tree (the use of this data structure is similar to its usage in the proofs of liability described in this paper) with each leaf as a ETH address and maxPayRate pair. At the end of the generation process, the broadcaster should have a Merkle root and the sum total of the leaf maxPayRate values. The broadcaster should then set its penalty escrow equal to the sum total of the leaf maxPayRate values and store the Merkle root on-chain. We call this Merkle root the broadcaster’s receipientCommitment.

When a broadcaster works with an orchestrator, the broadcaster must prove to the orchestrator that it was included in the broadcaster’s recipientCommitment:

Given a Merkle authentication path (tree nodes from leaf to root), the orchestrator can confirm that its ETH address was included in the tree corresponding to a broadcaster's on-chain recipientCommitment.
A valid authentication path will also mean that the orchestrator’s maxPayRate added with the node values in the path result in the broadcaster’s current penalty escrow. We can relax this condition to instead require the broadcaster’s current penalty escrow to be greater than or equal to the resulting sum of adding the values in the authentication path. Thus, if the orchestrator decreases its maxPayRate, the broadcaster doesn’t have to update its recipientCommitment
If we assume that orchestrators will only work with a broadcaster upon receiving a valid Merkle authentication path for the broadcaster’s current recipientCommitment, then any given orchestrator knows that if n orchestrators work with a broadcaster then the broadcaster’s penalty escrow is sufficient for the maxPayRate values defined by those n orchestrators.

To summarize, by having orchestrators enforce their own payment rate limits based on their own time windows for double spend detection and having broadcasters commit to a recipient set ahead of time, this construction bounds the time to double spend detection, recipient set and recipient payment rates which results in bounded additional utility from double spending for broadcasters. The required penalty escrow for a broadcaster can then be set to this maximum additional utility from double spending.

Open Problems/Questions

Should orchestrators be able to update their maxPayRate values? If so, we might want to restrict the time period during which these values can be updated on-chain (perhaps during a round lock period) so that broadcasters have an easier time generating a recipientCommitment and adjusting their penalty escrow with fixed maxPayRate values.
Should maxPayRate be based on the expected value of all received tickets or the face value of received winning tickets (or should both be rate limited)? The latter is the safest in the worst case where a malicious sender is really lucky and creates many winning tickets simultaneously, but increases penalty escrow requirements while the former just focuses on the average case.
Is it ok for orchestrators to define their own payWindow values locally?
Are there serious vulnerabilities to collusion between malicious broadcasters and orchestrators to harm other orchestrators by deviating from the described protocol?

Objectives & Key Results

Objective 1: Design a PM double spend security protocol that fulfills the above requirements.

KR1: Produce a technical specification for the protocol
KR2: Produce a written explanation of why the protocol is secure under the above assumptions

If Objective 1 is believed to be impossible, then consider:

Objective 2: Prove that a PM double spend security protocol that fulfills the above requirements does not exist.

KR1: Produce a written explanation of why such a protocol under the above assumptions does not exist
KR2: Produce a suggestion for an alternate objective based on updated requirements and assumptions

When a "written explanation" is mentioned, a logically constructed argument will usually suffice. Ideally, a rigorous proof (i.e. mathematical) would be provided. However, since not everyone is trained in writing those types of proofs and in the interest of promoting more open participation, the former should be adequate.

Suggested Reading

[1] Pass & Shelat. “Micropayments for Decentralized Currencies”. (paper) [2] Chiesa et. al. “Decentralized Anonymous Micropayments”. (paper) [3] Micali et. al. "Micropayments Revisited". (paper) [4] Salamon et. al. "Orchid: Enabling Decentralized Network Formation and Probabilistic Micro-Payments". (paper)

If I'm understanding correctly, imagine we have 500 nodes on the network, the payWindow is 10 blocks, and average maxPayRate is $0.15 (0.4166hr $0.30 10streams). This means the escrow deposit is $75 to cover the whole network?

If I'm an O who doesn't respect maxPayRate (keep accepting PM tickets after maxPayRate is reached), does it raise the risk for everyone else?

Also, just pointing out that, if the network is under heavy utilization, it might take B a while to find a O that's not at capacity. This might be exacerbate by the fact that B has to pre-determine a subset of Os to work with. (But this is a moot point if we assume supply always outstrips demand on the network)

Still digesting this, but some quick questions:

Does maxPayRate need to be on chain? Or could this be something that's gathered off-chain during the O Discovery stage?
How should maxPayRate be set? Is there a relation to EV / faceValue?
Can you elaborate on the payWindow? Is it meant to be a signal as to the work that B may also be sending to other Os? Does this also depend on knowing the approximate EV for other Os?
Is the entire escrow slashed (burned or rewarded?) if a double spend happens? Do the Os get any part of the escrow?
How frequently would re-commitments happen?

Some other thoughts:

"Double spends" may not be intentional by the B; it could just be a broadcaster's deposit going to zero as several Os try to claim (perhaps those that ignore maxPayRate as @ericxtang suggested).
B would have to regularly post on-chain updates as the network changes (eg, Os raise their maxPay, new Os come online, Os go offline, etc). This "maintenance" increases friction in a few ways: B needs to determine which Os to work with and pay the gas to post the commitment.
Feels like Os would be conservative about setting maxPay (since they can always reduce it, but increasing it is harder.) However, the price volatility of Ethereum may make it difficult to "set and forget" a maxPay value.
Os new to the network now have the additional hurdle of being "committed" by a B. It's not clear how frequently Bs will refresh their commitments, and Os have little visibility into which commitments they hold. This is reminiscent of some of the challenges around state channels.

I wonder if we could reframe the escrow as a "reserve" that Os can draw back on. Under this scenario, as B's deposit falls below the reserve, Os are guaranteed to receive at most maxPay, perhaps by submitting a winning ticket along with B's Merkle proof. This makes it less punitive for Bs, accommodating to over-drawing (whether intentional or not), and essentially eliminates O's risk of non-payment up to maxPay. Of course, there are lots of details to consider here (eg, "double dipping" the reserve, when to reset "dippability").

If I'm understanding correctly, imagine we have 500 nodes on the network, the payWindow is 10 blocks, and average maxPayRate is $0.15 (0.4166hr $0.30 10streams). This means the escrow deposit is $75 to cover the whole network?

The steps for calculating the required escrow is right, but I think the math in this specific example is a little off:

hrs = ((10 * 15) / 60) / 60 = 0.0416 # Assuming ~15s blocks
maxPayRate = hrs * .3 * 10 = .1248
escrow = maxPayRate * 500 = 62.4

The calculation for maxPayRate could take as inputs O's max price per input segment based upon all its supported renditions (treat $.30/hr in the example as the value of a parameter) and its maximum # of concurrent streams supported for a broadcaster (treat 10 streams in the example as the value of a parameter).

If I'm an O who doesn't respect maxPayRate (keep accepting PM tickets after maxPayRate is reached), does it raise the risk for everyone else?

I think it does increase the chance of a double spend happening so all receiving orchestrators are negatively impacted. However, I also think the orchestrator that deviates from the protocol is worse off relative to the other orchestrators and does not have an incentive to deviate. Find below my current thinking.

Consider an honest O (defined as following the protocol) that declares a maxPayRate and never accepts tickets past this rate limit. Then, given a valid Merkle authentication path for a B's recipientCommitment, O knows that B's penalty escrow is at least sufficient to back the tickets O receives up to its maxPayRate. So, as long as O follows the protocol, O knows that if B double spends, it will lose maxPayRate of its escrow (an eye for an eye - B loses at least as much as O). O also knows that if n other orchestrators follow the protocol, then B's penalty escrow is at least sufficient to back the tickets of these orchestrators up to their respective maxPayRate values. However, O does not know what n is. As a result, O has no guarantee that B's additional utility from double spending is bounded - if n is less than the total # of orchestrators included in B's recipientCommitment, then it is possible for B's additional utility from double spending to exceed its penalty escrow. Thus, the only guarantee that O has without making any assumptions about the number of honest actors following the protocol is that B will lose as much as O in the event of a double spend.

Now, consider a byzantine O' (defined as deviating from the protocol) that declares a maxPayRate, but then accepts tickets from the same B as above past this rate limit. This behavior results in n (the # of orchestrators following the protocol) being less than the # of orchestrators included in B's recipientCommitment. However, this fact does not change the original guarantee that the honest O had - it is still the case that B will lose as much as O in the event of a double spend.

That being said, in this scenario, the byzantine O' would allow B's potential additional utility from double spending to exceed its penalty escrow. As a result, B could have an incentive to double spend which is a negative social outcome for all orchestrators that could be receiving tickets from B.

Given that a B with an incentive to double spend is bad for all orchestrators, why would an individual orchestrator deviate from the protocol? Perhaps the orchestrator wants to circumvent the rate limit restrictions immediately and does not want to wait until it can modify its previously declared maxPayRate.

Let's assume that if B's additional utility from double spending > its penalty escrow, then B will double spend. Let's say that the byzantine O' receives 5 winning tickets with face value V and win probability of p after breaching its maxPayRate limit. If B was not double spending and all these tickets successfully pay out, then O' has a utility of 5V - 5pV (we subtract 5pV to account for the transcoding work performed). However, based on our initial assumption, B is double spending in this case and none of these tickets paid out. As a result, O' has a utility of -5pV. Furthermore, since O' accepts tickets past its declared rate limit, O' does not know whether B's penalty escrow is at least sufficient to back to the total value of tickets O' receives until a double spend is detected. So in this case, O' also does not have the benefit of knowing that B will lose as much as O' in the event of a double spend.

Thus, I don't think an orchestrator has an incentive to deviate from the protocol.

Also, just pointing out that, if the network is under heavy utilization, it might take B a while to find a O that's not at capacity. This might be exacerbate by the fact that B has to pre-determine a subset of Os to work with.

Pre-determining a subset of orchestrators to work with is definitely a con with the candidate construction. At the moment, I think it might be unavoidable though if we want to methodically compute a required penalty escrow that can deter double spends.

Does maxPayRate need to be on chain? Or could this be something that's gathered off-chain during the O Discovery stage?

Yeah on second thought I do think maxPayRate can just be fetched off-chain.

How should maxPayRate be set? Is there a relation to EV / faceValue?

Perhaps based on the maximum # of concurrent streams for a single broadcaster, the length of the payWindow and the maximum number of supported renditions?

Brief sketch with random values:

payWindow = 2 blocks = 30 seconds = .008 hours
pricePerHour (5 renditions for a single input segment) = $.3
maxConcurrentStreams = 10
maxPayRate = .008 * .3 * 10 = $.024

Can you elaborate on the payWindow? Is it meant to be a signal as to the work that B may also be sending to other Os? Does this also depend on knowing the approximate EV for other Os?

The thought is to have payWindow be the expected time period starting from when the double spend occurs (i.e. as soon as there is > 1 ticket that occupies the remaining funds in a broadcaster's deposit) until the time at which the double spend is detected (i.e. the orchestrator realizes that the broadcaster does not have sufficient funds). If orchestrators are redeeming winning tickets immediately or relatively soon after receiving them, then this time period might be around 2-3 blocks.

Is the entire escrow slashed (burned or rewarded?) if a double spend happens? Do the Os get any part of the escrow?

The escrow is currently burned, but it could also be sent to a community pool, but without a formal governance procedure for handling those funds, burning seems simpler. Sending the escrow to orchestrators might introduce an incentive to get broadcasters to accidentally double spend which I am wary of, but perhaps worth further exploration.

How frequently would re-commitments happen?

A broadcaster could re-evaluate its recipient set once per round (could be less frequent?) and if there are any necessary additions/replacements based on changes in the registered orchestrators, it can update its recipientCommitment. This update might require an increase in the broadcaster's penalty escrow. Top offs for the penalty escrow could be instant, but a penalty escrow withdrawal should still be subject to a delay to prevent a broadcaster from racing to reducing its penalty after double spending.

"Double spends" may not be intentional by the B; it could just be a broadcaster's deposit going to zero as several Os try to claim (perhaps those that ignore maxPayRate as @ericxtang suggested).

I think the safest approach for a broadcaster is to not "reuse" deposit funds that are currently occupied by a ticket until it either expires or the orchestrator proves that the ticket did not win. Shorter ticket expiration times would likely be necessary to allow for this. Once the broadcaster knows that the ticket expired or that it did not win, the broadcaster can unblock those funds for further ticket creation.

B would have to regularly post on-chain updates as the network changes (eg, Os raise their maxPay, new Os come online, Os go offline, etc). This "maintenance" increases friction in a few ways: B needs to determine which Os to work with and pay the gas to post the commitment.

The burden on a broadcaster to update recipientCommitment could be reduced with the following update:

Let's assume that the broadcaster has fetched maxPayRate values from orchestrators. Upon receiving a valid Merkle authentication path for a broadcaster's recipientCommitment, an orchestrator will rate limit payments from the broadcaster based upon the leaf value (recall that a leaf is (ethAddr, value)) rather than its current maxPayRate value. As a result, even if the orchestrator has updated its maxPayRate value, a broadcaster is still able to transact with the orchestrator.

The broadcaster would still need to post recipientCommitment updates (and top off its penalty escrow accordingly) if it wants to increase its payment throughput with certain orchestrators (that have increased their own maxPayRate values) or add/remove orchestrators from the recipient set. This is definitely not ideal, but I haven't come up with a better solution yet.

Feels like Os would be conservative about setting maxPay (since they can always reduce it, but increasing it is harder.) However, the price volatility of Ethereum may make it difficult to "set and forget" a maxPay value.

Perhaps the price volatility issue could be alleviated if the payments were denominated in a stablecoin like Dai.

I wonder if we could reframe the escrow as a "reserve" that Os can draw back on. Under this scenario, as B's deposit falls below the reserve, Os are guaranteed to receive at most maxPay, perhaps by submitting a winning ticket along with B's Merkle proof. This makes it less punitive for Bs, accommodating to over-drawing (whether intentional or not), and essentially eliminates O's risk of non-payment up to maxPay.

This is interesting - the penalty escrow ends up being split into collateral allocations for each committed recipient. If by "double dipping" you mean after a double spend (i.e. broadcaster's deposit is 0) a recipient trying to claim its collateral portion more than once, I think the contract can just keep track of "dips". Simple solutions for resetting include having the broadcaster using a new account or keeping track of a separate penalty escrow for a account that has been slashed once.

I think the safest approach for a broadcaster is to not "reuse" deposit funds that are currently occupied by a ticket until it either expires

Since payments are probabilistic, B would have to significantly over-collaterialize its deposit. This is on top of the penalty escrow. While the actual payments themselves may be low, having to lock up an excess of funds to guarantee the payment is less attractive. This may work better for a chequebook-style system with exact accounting.

the orchestrator proves that the ticket did not win

Do we have a way to do that?

The thought is to have payWindow be the expected time period starting from when the double spend occurs (i.e. as soon as there is > 1 ticket that occupies the remaining funds in a broadcaster's deposit)

Thanks for the explanation. Still not sure that I understand the payWindow as a risk mitigation measure. From the O's perspective, its primary interest is in ensuring it gets paid. The penalty escrow gives B an incentive to closely monitor its deposit balance, but doesn't actually address O's non-payment risk. Notably, O shouldn't wait for a double spend at payWindow increments to terminate the stream, because at that point it's too late: the O is not getting paid for the work done.

VOD is an example where EV would grow instantaneously, eg, by pre-generating enough tickets to cover the job non-interactively. As long as there is a reasonable margin of safety between the outstanding EV relative to B's current deposit [1], then O should be able to process the job with some confidence that it'd get paid, rather than artificially throttling the stream by waiting for a double-spend (which only seems likely as B's deposit runs low, but O's margin-of-safety should guard against that, as will the penalty escrow).

[1] Margin of safety could look like: B.accumulatedEv < B.deposit / factor , where factor is an approximation of the other Os, say 2^n for B's Merkle proof of length n, assuming B's commitment derives from a full binary tree. This goes back to the notion of B over-collateralizing its own deposit, although 2^n is perhaps extreme. This also indicates that the margin of safety should be some multiple of the expected faceValue although I'm not sure what an ideal relationship would be.

If by "double dipping" you mean after a double spend (i.e. broadcaster's deposit is 0) a recipient trying to claim its collateral portion more than once

Yes, although in this case I wouldn't call it a double spend; it's just the normal course of drawing down the reserve. We also don't have to take all of the collateral each time O submits a ticket; the collateral could be debited based on the faceValue of winning tickets (even several times), as long as there is enough collateral remaining for that O. In that sense, the reserve is just an extension of the normal flow for scenarios where B might be running low on funds. O knows exactly how much credit it can extend towards B.

a separate penalty escrow for a account that has been slashed once

That's probably the best way to top off reserves that have been drawn down, or if the commitment changes. There probably also needs to be a "withdraw lock" period (the length of the ticket expiry) once B wants to free up those funds, whether the reserve has been drawn or not.

We'll probably need to be careful about updates to B's commitments, perhaps going as far as creating a new reserve account for each new commitment, and withdraw-locking the old one for the ticket expiration period. This is a major drawback.

The penalty escrow doesn't really need a new account for each commitment, but there are concerns with, say, committing to a zero-balance escrow and spending your own deposit to near-zero, while you have a transcoding job going on. Will probably still want some withdraw locking period around the escrow.

Note: I'll refer to whether a ticket won or not as the lottery result.

Since payments are probabilistic, B would have to significantly over-collaterialize its deposit. This is on top of the penalty escrow. While the actual payments themselves may be low, having to lock up an excess of funds to guarantee the payment is less attractive.

Agreed that the additional capital requirement that results from blocking off funds for tickets until they expire or the lottery result is known. The fund blocking duration for a broadcaster might be reduced if honest orchestrators that send notification messages of winning tickets immediately and short ticket expirations are used to defend against byzantine orchestrators that do not send notification messages.

Do we have a way to do that?

The orchestrator can send the pre-image (recipientRand) for the random number hashed commitment (recipientRandHash) of a ticket to the broadcaster who can then check if the ticket won.

Still not sure that I understand the payWindow as a risk mitigation measure...Notably, O shouldn't wait for a double spend at payWindow increments to terminate the stream, because at that point it's too late: the O is not getting paid for the work done.

O doesn't wait for a double spend at payWindow. When O receives a ticket it knows that the ticket could be a part of a double spend so it starts a payWindow. During this time period, any winning tickets received may or may not successfully pay out because O might not realize that B does not have sufficient funds until the end of payWindow. Thus, O's maxPayRate represents the maximum value that it is willing to put at risk during this period.

maxPayRate can be calculated as:

Maximum sum of winning ticket face values
Maximum sum of ticket EVs

The former is the safest in the worst case because even if a malicious B is very lucky and sends out consecutive winning tickets, B's escrow will still cover O's maxPayRate. However, this increases the escrow requirements for B. The latter reduces B's escrow requirements, but if a malicious B is very luck and sends out consecutive winning tickets, then B's escrow will not necessarily cover O's maxPayRate.

As long as there is a reasonable margin of safety between the outstanding EV relative to B's current deposit [1], then O should be able to process the job with some confidence that it'd get paid

As you mentioned, the margin of safety approach imposes additional capital requirements on B since it needs to maintain a padded deposit which is a similar result to what happens when B blocks off funds for tickets until it knows a ticket expired or the lottery result.

Another way to think about this is that if B's escrow >= the maximum additional utility it can gain from double spending (until it is caught), then O can make an assumption that B is a rational actor that will make sure it doesn't purposefully or accidentally double spend (perhaps by blocking off funds for tickets) and not watch B's deposit at all. If we reward O will its maxPayRate portion of B's escrow, then O also knows that it will also be compensated for that amount of damages if a double spend does occur.

Yes, although in this case I wouldn't call it a double spend; it's just the normal course of drawing down the reserve.

Just for clarification by double spend I refer to when B has sent out > 1 outstanding winning ticket s.t. it has insufficient deposit funds to cover all the tickets and by double dip I refer to when B has been slashed and an O tries to claim its portion of B's escrow more than once. The latter can be mitigated by having the contract keep track of claims.

We also don't have to take all of the collateral each time O submits a ticket; the collateral could be debited based on the faceValue of winning tickets (even several times), as long as there is enough collateral remaining for that O.

Yeah I think that could work - O only claims from B's escrow up to the the sum of face values of winning tickets that are not redeemable using B's deposit.

We'll probably need to be careful about updates to B's commitments, perhaps going as far as creating a new reserve account for each new commitment, and withdraw-locking the old one for the ticket expiration period. This is a major drawback.

If we do not reward Os with portions of B's escrow, the restrictions around escrows + commitments might be lessened a bit - B only needs to wait through a lock period if it wants to withdraw a portion of its escrow since its current recipient set results in a smaller escrow requirement - if B modifies its current recipient set resulting in a higher escrow requirement then it can do an instant top off.

If we do reward Os with portions of B's escrow then B might have to wait through a lock period whenever it modifies the recipient set. Otherwise there could be the following race condition:

B commits to a set of Os
B double spends
B commits to a new set of Os
The old set of Os detect the double spend
The old set of Os cannot claim B's escrow portions

withdraw-locking the old one for the ticket expiration period.

There is already a unlocking period for withdrawing B's escrow which is probably preferable to use here if tickets have short expiration times. B will be locked in to a set of Os for a longer period of time so that Os have time to claim escrow portions if a double spend does occur.

committing to a zero-balance escrow and spending your own deposit to near-zero, while you have a transcoding job going on.

Zero escrow shouldn't be too big of a concern if Os only work with Bs that prove that their escrow covers at least O's maxPayRate and if there is a withdraw delay preventing escrow from being withdrawn prematurely.

This is a major drawback.

Yeah the dynamics around committing to recipient sets are suboptimal at the moment and there is more work to be done...

Note: I'm using the term "reserve" to denote escrow that is paid out to Os if their deposit dips below their committed level. If it's burned instead, then I call it a penalty escrow or just "escrow".

The orchestrator can send the pre-image (recipientRand)

That would work after a stream ends, as a sort of "follow up" if B asks for it. This might require some more bookkeeping beyond winning tickets on both sides though.

During this time period, any winning tickets received may or may not successfully pay out because O might not realize that B does not have sufficient funds until the end of payWindow. Thus, O's maxPayRate represents the maximum value that it is willing to put at risk during this period.

O's risk doesn't actually decrease as payWindow passes; it only decreases as winning tickets are submitted and confirmed on-chain. Note that I'm going by this statement in the original writeup:

An orchestrator will maintain a local counter to track the value of tickets received which is reset at the beginning of each payWindow.

O should definitely have a "maximum float" (whether EV or actual winning tickets) that's some multiple of the faceValue, or a fraction the deposit balance. Does it make a difference whether that "maximum float" is hit instantaneously, within a very short payWindow, or over a longer period of time such as the ticket expiration period? (Distinguishing between "maximum float" and maxPayRate here, because the latter's at-risk amount is a function of payWindow while the former is an absolute at-risk amount.)

Yes, although in this case I wouldn't call it a double spend; it's just the normal course of drawing down the reserve. Just for clarification by double spend I refer to when B has sent out > 1 outstanding winning ticket s.t. it has insufficient deposit funds to cover all the tickets

Yes, just wanted to reframe things a bit in that drawing down a "reserve" could be considered as part of the normal, non-punitive flow: everyone gets exactly what they're entitled to and they go home happy. It's a more positive connotation as opposed to a double-spend triggering a penalty escrow that's burned, which leaves everyone involved unhappy.

That would work after a stream ends, as a sort of "follow up" if B asks for it. This might require some more bookkeeping beyond winning tickets on both sides though.

Why would this only work after a stream ends? An O can reveal recipientRand to B as soon as O receives a ticket.

Here is a rough quick pass sketch of the additional book keeping:

For each ticket T that B sends out, it can add H(T) and the expiration time for T to a set. For each ticket received, O sends B T and recipientRand. B checks that T corresponds to an existing entry in its set. If the entry does not exist, abort. If the entry exists, B checks that H(recipientRand) == T.recipientRandHash. If recipientRand is a not a valid pre-image, abort. If recipientRand is a valid pre-image, remove H(T) from B's set. B can run a background loop to remove any H(T) entries in its set that have expired.

O's risk doesn't actually decrease as payWindow passes; it only decreases as winning tickets are submitted and confirmed on-chain.

What if Os are redeeming winning tickets immediately upon receipt (if frequency of on-chain transactions and associated gas price exposure are a worry, parameters can be tweaked so that tickets win less often)?

Perhaps another way to frame this is:

O has a payment rate limit per second that can be derived from factors such as the max PPP it could charge (based on all its offered renditions), max number of pixels per second, max number of concurrent streams, etc. O can also have a payment rate limit per payWindow. B is not allowed to exceed any of these payment rate limits.

Let's say that O divides time into time periods with the beginning of each period denoted by t1, t2, t3, etc. O's payment rate limit per second is 1 and payment rate limit per payWindow is 50.

Suppose B double spends at t1. O will accept at most 50 from B until t2. By t2, O will either detect the double spend because multiple winning tickets were submitted by all Os (potentially including itself) draining B's deposit or O no double spend will be detected yet (because not all the outstanding winning tickets were redeemed) but O will still be compensated for its redeemed winning tickets.

Suppose B double spends between t1 and t2 and can detect the double spend between t2 and t3. O will accept at most 50 from B until t2 and O will only accept 1 per second. So, if there are 10 seconds left until t2, O will only accept <= 10 (the acceptable amount might be less because if B already paid > 40 earlier in payWindow then the actual acceptable amount will be less than 10). Similarly, if there are 10 seconds from t2 until the double spend detection time, then O will only accept 10.

O should definitely have a "maximum float" (whether EV or actual winning tickets) that's some multiple of the faceValue, or a fraction the deposit balance. Does it make a difference whether that "maximum float" is hit instantaneously, within a very short payWindow, or over a longer period of time such as the ticket expiration period?

Could you outline an example scenario with O keeping track of an absolute maximum float?

Why would this only work after a stream ends? An O can reveal recipientRand to B as soon as O receives a ticket.

Assuming the winning ticket check remains this: H(Sig_B(H(ticket)) | recipientRand) < ticket.winProb then B would be able to determine which nonces produce a winning ticket given knowledge of recipientRand. Then B could skip winning nonces, or acquire a new recipientRandHash prior to winning, or switch Os.

The networking protocol would have to move in lockstep: each time B sends a ticket, it cannot proceed further until it receives a reply from O with the recipientRand and a new recipientRandHash . We wouldn't be able to have more than one segment in-flight for non-VOD use cases. This doesn't seem good for a live-streaming use case with tight delay tolerances.

Note that if we did this, we probably could simplify the ticket construction, as a "sender nonce" would no longer be required, but O would be required to keep track of all its tickets up until expiration. recipientRand has a stricter requirement for randomness than senderNonce which only needs to be unique.

What if Os are redeeming winning tickets immediately upon receipt (if frequency of on-chain transactions and associated gas price exposure are a worry, parameters can be tweaked so that tickets win less often)?

If we stipulate that Os should redeem winning tickets immediately, then either B needs to carry a higher at-risk amount and corresponding deposit collateral (lower winProb, higher faceValue) or the faceValue alone increases to cover immediate transaction costs. Given that an orchestrator cannot predict the gas environment, the latter seems more likely. With gas being a higher percentage of overhead, ticket faceValues fluctuate more based on the gas environment and externalizes the cost of network transactions to the broadcaster.

Given that the motivation for PM was to smooth out Ethereum network unpredictability and reduce transaction overhead, this seems to be running against our initial goals with PM. Retaining the flexibility to submit tickets at will (before expiration) seems like an important property.

Could you outline an example scenario with O keeping track of an absolute maximum float?

If the faceValue is N, the maximum float might be 3N. The maximum float should be some multiple of the typical faceValue ; otherwise stream processing would deadlock every time EV > faceValue (eg, a winning ticket is a bit late coming in), or stream processing would halt each time a winning ticket is received (until confirmed on chain).

If using a reserve based payments system, then this maximum float could be the amount of the Merkle commitment from the broadcaster towards the O. At this point, with a reserve based system, not much else really needs to be done. With a maximum float of 3N, the orchestrator is guaranteed [1] a payout for at least three winning tickets of faceValue. O can continue to process a stream until receiving three winning tickets [2]. These tickets can come in as quickly or as slowly as B needs; O will simply stop processing the stream if its risk threshold has been exceeded.

If a penalty escrow system is used, then O could define another threshold, the "minimum collateral" for the entire deposit. This might be something like 10N. O stops accepting tickets from B under two conditions: if B's deposit (without escrow) dips below the minimum collateral, or B's outstanding balance exceeds the maximum float.

A form of minimum collateral could still be used by O for reserve payments if it wishes to have the maximum float be greater than its Merkle commitment as reserved by B. Eg, Merkle commitment is 3N, O wishes to float B up to 6N, then the minimum collateral (excluding the reserve) could be 10N.

This also plays better with VOD where B submits a large amount of tickets up front. O can determine in advance whether B is has enough to process the video via the maximum float. Throttling payments via windowing doesn't really fit in with the ideal VOD workflow.

The more we discuss this, the more I am leaning towards the reserve based system for a few reasons:

Less punitive; the optics seem nicer than slash-and-burn.
Less at risk with guaranteed payouts. No requirement for windowing.
Eliminates potential attack vectors / griefing around ticket withholding.
Fits the VOD workflow better.
Ticket redemption can happen at any time before the ticket expires. Makes ticket submission less urgent as long as the ticket values <= the Merkle commitment given for an O.

Even a reserve based system feels like a local optimum due to the burden of the Merkle commitment. Hope we can find something a little more flexible.

[1] We probably need to ensure that negative balances in the Merkle tree are not allowed, or have an invariant that the parent cannot be less than a leaf. Eg, the right side of a tree has a balance of 50, and the left side has a balance of 10, but the left actually has two leaves, one with a balance of 100 and another of -90.

[2] Or EV exceeds maximum float. But if that happens, there's a deadlock; either the float is too low or winProb needs to be increased (with a corresponding decrease in faceValue).

The networking protocol would have to move in lockstep: each time B sends a ticket, it cannot proceed further until it receives a reply from O with the recipientRand and a new recipientRandHash .

Ok yeah I see how revealing recipientRand after each ticket is problematic due to the burden of increased interactivity and its impact on latency sensitive workflows.

We could mitigate this issue by using a non-interactive winning ticket selection protocol based on deterministic signatures or VRFs as described here since the broadcaster would not be blocked from sending additional tickets while an orchestrator is revealing its random output. As you mention further down, this would have positive implications for a VOD use case as well. We should discuss whether it is worthwhile to pursue this area of investigation now as opposed to leaving it as a post-Streamflow concern.

livepeer / research

PM double spend security #15