livepeer / protocol

Accounting V2: A Delegator will already be eligible for rewards after staking to an orchestrator in the current round #463

Closed (kyriediculous closed this issue 3 years ago)

kyriediculous commented 3 years ago

In Streamflow, a delegator only becomes eligible to earn rewards and fees in the round after its stake was allocated to an orchestrator.

In the share-based accounting upgrade, a delegator is instantly eligible for rewards and fees after staking to an orchestrator.

This change in behaviour is due to the removal of tracking earnings pools on a round-by-round basis. This greatly reduces gas costs because no new pools need to be instantiated in storage; existing ones are just mutated. It also avoids unbounded state expansion, which is the biggest bottleneck in scaling the underlying blockchain layer and the reason there have been proposals to e.g. introduce state rent.
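To make the instant-eligibility behaviour concrete, here is a minimal Python sketch of a generic share-based pool. It is an illustration under assumptions, not the Livepeer contracts: the `SharePool` class and its method names are hypothetical, and rewards are modelled as a plain increase of the pool's stake.

```python
from fractions import Fraction


class SharePool:
    """Toy share-based delegation pool (hypothetical, not the Livepeer contracts)."""

    def __init__(self):
        self.total_stake = Fraction(0)   # LPT held by the pool
        self.total_shares = Fraction(0)  # shares issued to delegators
        self.shares = {}                 # delegator -> shares owned

    def delegate(self, delegator, amount):
        # Mint shares at the current price, so a deposit never changes the
        # value of shares that already exist.
        if self.total_shares == 0:
            minted = Fraction(amount)
        else:
            minted = Fraction(amount) * self.total_shares / self.total_stake
        self.shares[delegator] = self.shares.get(delegator, Fraction(0)) + minted
        self.total_shares += minted
        self.total_stake += amount

    def reward(self, amount):
        # A reward call only mutates the existing pool; no per-round pool is created.
        self.total_stake += amount

    def stake_of(self, delegator):
        if self.total_shares == 0:
            return Fraction(0)
        return self.shares.get(delegator, Fraction(0)) * self.total_stake / self.total_shares


pool = SharePool()
pool.delegate("alice", 100)  # staked in an earlier round
pool.delegate("bob", 100)    # staked in the current round, before the reward call
pool.reward(20)
```

In this model both delegators end up with the same stake after the reward call, even though bob only joined in the current round, because the pool keeps no record of when shares were minted.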

It is important to note that in the current implementation it is not possible to move stake from an existing delegation without first going through the unbonding period, so security concerns about moving stake around to front-run reward calls and winning ticket redemptions from different orchestrators are currently out of the question. Whether we should allow this (as was the case in Streamflow) and in what way (e.g. by introducing a cooldown) is part of a separate discussion, see https://github.com/livepeer/protocol/issues/464.

Given the above, or a potential solution for it, I think the change in mechanics is okay.

yondonfu commented 3 years ago

The main downside I currently see with instant eligibility for rewards and fees in the same round that a delegation position is created is that older, existing delegation positions prior to the current round accrue the same amount of value as delegation positions created in the current round even though the tokens for the latter are staked for a shorter period of time. The impact of this property is reduced when reward calls occur once per round since the value captured by delegation positions created in the current round is lower. However, the impact of this property can be increased if reward calls occurred once every N rounds (i.e. the reward period from LIP-56) since each reward call mints more tokens so delegation positions created in the current round can capture more value. This downside may be manageable if reward calls always occur once per round, but if reward calls ever occur less frequently (and thus mint more tokens) then I think this downside will be more problematic.

In the interest of considering whether there is a workable solution that eliminates the above downside altogether:

Are there implementation approaches where delegation positions are made eligible for rewards and fees starting from the round after they are created that don't require per round state tracking?

kyriediculous commented 3 years ago

Are there implementation approaches where delegation positions are made eligible for rewards and fees starting from the round after they are created that don't require per round state tracking?

I don't believe there are, because the underlying technical change is essentially a change in how we snapshot data (on a per-round basis vs. based on the current block), i.e. we don't store state mutations for a moment in the future but for the present.

The main downside I currently see with instant eligibility for rewards and fees in the same round that a delegation position is created is that older, existing delegation positions prior to the current round accrue the same amount of value as delegation positions created in the current round even though the tokens for the latter are staked for a shorter period of time. The impact of this property is reduced when reward calls occur once per round since the value captured by delegation positions created in the current round is lower. However, the impact of this property can be increased if reward calls occurred once every N rounds (i.e. the reward period from LIP-56) since each reward call mints more tokens so delegation positions created in the current round can capture more value. This downside may be manageable if reward calls always occur once per round, but if reward calls ever occur less frequently (and thus mint more tokens) then I think this downside will be more problematic.

Yeah, that's an entirely correct assessment that can be summarised as "the longer the reward accrual period, the greater the impact of instant reward eligibility".

There are several ways to address this and I believe the complexity of that can vary based on the impact.

It's also worth keeping in mind that the harsher the restrictions imposed (e.g. a high delegation tax), the more friction is caused in the delegation market.

kyriediculous commented 3 years ago

I think there might be an implementation-level solution, but it's not necessarily clean.

Based on the number of shares in a delegation pool, some orchestrator params and the amount of rewards that orchestrator should earn for the round (which we can calculate), we can calculate a deduction for the LPT rewards that would be earned if the orchestrator being delegated to hasn't called reward yet for the round. That amount would be deducted when the delegator updates its delegation.

Edit: but then again, if the orchestrator doesn't call reward for the round, the deduction shouldn't be upheld, and we can only process that through subsequent interactions as well, which would likely involve tracking more state than I'm currently anticipating.

kyriediculous commented 3 years ago

I spent some more time thinking about this problem today.

To begin I'd like to divide this issue up into two separate ones:

I think this might be fine and could actually reduce friction for new entrants into the delegation market, since assets are instantly productive. Since this entails a new position, there is also no possibility of earning double rewards for a round and the impact is limited to a single round.

Having no restrictions on this is clearly problematic because a delegator could do this multiple times in a round and front-run every reward call being made to earn rewards from all orchestrators every round. As discussed we should have restrictions around this.

One suggestion made was a cooldown period. However, with this mechanic a delegator can still earn double rewards for a single round every time the cooldown period ends, so this might not be a suitable solution.

Another solution, offered by The Graph protocol, is a delegation tax that is burnt (0.50% on the principal + rewards when moving and 0.50% on a new allocation). This is rather harsh, especially since it's also imposed on new allocations. Furthermore, it creates friction in the delegation market, which we wanted to avoid.

However, the idea of a burn mechanism is quite interesting given that the protocol economics right now are purely inflationary, without any deflationary mechanic apart from governance polls being created. So what if a middle ground is possible?

We know what the harmful case is. Translated into more technical terms, it is orchestratorOld.lastRewardRound == currentRound && orchestratorNew.lastRewardRound < currentRound. In this case there is the potential for double rewards to be earned.
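The condition above can be written as a small predicate. This is only a Python sketch for illustration: the `Orchestrator` dataclass and the function name `is_double_reward_risk` are hypothetical, and only the `lastRewardRound` comparison comes from the comment itself.

```python
from dataclasses import dataclass


@dataclass
class Orchestrator:
    last_reward_round: int  # mirrors lastRewardRound in the condition above


def is_double_reward_risk(old: Orchestrator, new: Orchestrator, current_round: int) -> bool:
    """True when stake leaves an orchestrator that already called reward this
    round and joins one that has not yet called reward this round."""
    return (old.last_reward_round == current_round
            and new.last_reward_round < current_round)


risky = is_double_reward_risk(Orchestrator(10), Orchestrator(9), current_round=10)
safe = is_double_reward_risk(Orchestrator(9), Orchestrator(9), current_round=10)
```

Only the first call describes the harmful case: the old orchestrator has already rewarded in round 10 while the new one has not.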

Now there's a couple of things we can do given that this assertion could be easily made in the code:

yondonfu commented 3 years ago

Note: I'll revisit the ideas presented in the previous comment separately, but I wanted to first dump some ideas I've been thinking about.

I've been thinking about whether incorporating time as an input into reward per staked token calculation could help address the double reward problem. Based on the latest discussion, the double reward problem occurs when the same stake earns rewards as a part of multiple delegations to multiple orchestrators in a single round. What if the rewards from each of the delegations depend on how long the stake was a part of the delegation? Then, if stake is moved between different delegations in a single round even though the stake would earn rewards for multiple delegations, the reward amount for each delegation would be significantly reduced if the stake was a part of the delegation for only a short period of time.

This time based reward per staked token calculation only reduces the value that can be captured by a delegator in a double reward scenario, but the ability for a delegator to rack up delegations with the same stake in a single round is still an issue. To address this issue, an instant re-delegation cooldown period could be used to restrict a delegator to a single instant re-delegation every N rounds (N = 1 or N = 2 may be reasonable).

As a result, a delegator can re-allocate stake from its existing delegation to another delegation and both delegations will be able to earn rewards, but only proportional to the time that the stake is allocated to each of the delegations. If the stake is allocated to one delegation for 30% of the round and allocated to the second delegation for 70% of the round, the first delegation would earn 30% of the eligible rewards for its orchestrator and the second delegation would earn 70% of the eligible rewards for its orchestrator.
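The 30% / 70% example can be worked through with a few lines of Python. This is a sketch under assumptions: the round length of 1000 blocks and the full-round reward amounts (10 and 20 LPT) are made-up numbers, and `time_weighted_reward` is a hypothetical helper name.

```python
from fractions import Fraction


def time_weighted_reward(full_round_reward, blocks_delegated, round_length):
    """Scale the reward a delegation would earn for a whole round by the
    fraction of the round during which the stake was actually delegated."""
    if not 0 <= blocks_delegated <= round_length:
        raise ValueError("blocks_delegated must lie within the round")
    return Fraction(full_round_reward) * blocks_delegated / round_length


ROUND_LENGTH = 1000  # hypothetical round length in blocks
first = time_weighted_reward(10, 300, ROUND_LENGTH)   # 30% of the round with the first orchestrator
second = time_weighted_reward(20, 700, ROUND_LENGTH)  # 70% of the round with the second orchestrator
```

With these numbers the stake earns 3 LPT from the first delegation and 14 LPT from the second, rather than the full 10 + 20 LPT it could capture without time weighting.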

Implementation Thoughts

Can the time based reward per staked token calculation be implemented in a gas efficient way? I'm not 100% certain yet, but I think it is possible.

I've been looking at the accumulator algorithm that is used in various contracts forked/based off of an early SNX liquidity mining contract. The mathematical form of the accumulator algorithm is described in this blog post and implementations of the algorithm can be found in a few contracts including Synthetix's StakingRewards and Uniswap V2's StakingRewards.

The formulas in the linked blog post are for a single time period that rewards should be distributed over to LPs based on a) time that liquidity is provided for and b) the LP's liquidity relative to the total liquidity in the contract. In a single time period the reward amount is assumed to be constant. This description actually also applies to Livepeer staking in a single round for a single orchestrator! The reward amount for the orchestrator is constant during the round. At the moment, the rewards are only distributed to delegators based on the delegator's stake relative to the total delegated stake, but could also be based on the time that stake is delegated for [1]. The linked accumulator algorithm implementations can also handle new rewards being added to the contract by re-calculating the reward rate for the new reward period. In the case of Livepeer staking, the logic of the notifyRewardAmount function would be executed whenever an orchestrator calls reward.

[1] May want to have a single time slice be a block as opposed to a second as is the case with liquidity mining algorithms.
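A rough Python sketch of the accumulator pattern described above, using one block as the time slice as suggested in [1]. It follows the general shape of the StakingRewards-style contracts (a cumulative reward-per-token value plus a per-account checkpoint), but the class, method and field names here are hypothetical and the example numbers are invented.

```python
from fractions import Fraction


class RewardAccumulator:
    """Reward-per-staked-token accumulator with one time slice per block."""

    def __init__(self):
        self.reward_rate = Fraction(0)       # reward released per block
        self.period_finish = 0               # block at which the current period ends
        self.last_update = 0                 # last block the accumulator was updated
        self.reward_per_token = Fraction(0)  # cumulative reward per unit of stake
        self.total_stake = Fraction(0)
        self.stake = {}                      # delegator -> stake
        self.paid = {}                       # delegator -> accumulator value at last checkpoint
        self.earned = {}                     # delegator -> rewards accrued so far

    def _update(self, block, delegator=None):
        applicable = min(block, self.period_finish)
        if self.total_stake > 0 and applicable > self.last_update:
            elapsed = applicable - self.last_update
            self.reward_per_token += self.reward_rate * elapsed / self.total_stake
        self.last_update = max(self.last_update, applicable)
        if delegator is not None:
            owed = self.stake.get(delegator, Fraction(0)) * (
                self.reward_per_token - self.paid.get(delegator, Fraction(0)))
            self.earned[delegator] = self.earned.get(delegator, Fraction(0)) + owed
            self.paid[delegator] = self.reward_per_token

    def notify_reward_amount(self, block, amount, duration):
        # Would run when an orchestrator calls reward: spread `amount`, plus any
        # undistributed remainder of the previous period, over `duration` blocks.
        self._update(block)
        leftover = max(self.period_finish - block, 0) * self.reward_rate
        self.reward_rate = (Fraction(amount) + leftover) / duration
        self.last_update = block
        self.period_finish = block + duration

    def set_stake(self, block, delegator, new_stake):
        self._update(block, delegator)
        self.total_stake += Fraction(new_stake) - self.stake.get(delegator, Fraction(0))
        self.stake[delegator] = Fraction(new_stake)

    def earned_at(self, block, delegator):
        self._update(block, delegator)
        return self.earned[delegator]


acc = RewardAccumulator()
acc.set_stake(0, "alice", 50)
acc.notify_reward_amount(0, 100, 100)  # 100 LPT over a 100-block round
acc.set_stake(50, "bob", 50)           # bob joins halfway through the round
alice_total = acc.earned_at(100, "alice")
bob_total = acc.earned_at(100, "bob")
```

In this run alice earns 75 LPT and bob 25 LPT: alice collects the whole first half of the round alone and then splits the second half with bob. Each stake change touches a constant number of values, so no per-round storage is needed in this model.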

0xVires commented 3 years ago

Are there implementation approaches where delegation positions are made eligible for rewards and fees starting from the round after they are created that don't require per round state tracking?

Not sure about the state tracking, but Tokemak is currently using cycle based farming: https://medium.com/tokemak/toke-rewards-cycles-reminder-6c00c0380297

This would solve the problem regarding earning double rewards - but I'm not technical enough to assert if it's applicable here or how gas efficient it is :)

Github: https://github.com/Tokemak/tokemak-smart-contracts-public
Rewards Contract: https://etherscan.io/address/0x79dd22579112d8a5f7347c5ed7e609e60da713c5#code

yondonfu commented 3 years ago

Thanks for sharing @0xVires! Will take a look.

yondonfu commented 3 years ago

A quick refresher - the double reward problem can be broken down into two sub-problems:

  1. A delegation is eligible for rewards immediately in the round that it is created
  2. Stake can be re-delegated to multiple orchestrators in the same round and earn rewards from each orchestrator

These two sub-problems combined result in the double reward problem. In the interest of keeping this comment from getting too long, I'll comment on the first sub-problem here and the second sub-problem in https://github.com/livepeer/protocol/issues/464.

Suppose the second sub-problem is solved. Would it still be a problem if a delegation is eligible for rewards in the same round that it is created? I think so.

At the moment, the reward amount for an orchestrator is calculated based on the orchestrator's stake relative to the total stake of all orchestrators as of the beginning of the current round. If a delegation is created in the current round, the reward amount for the orchestrator would already be locked in, so every other delegation that existed at the start of the round would end up earning fewer rewards. For example, suppose a reward of 100 LPT is to be split between two equal-stake delegations. If another 50 LPT delegation is created during the round, the previous delegations would each earn about 33 LPT instead of 50 (taking the existing delegations to hold 50 LPT each). This example is not possible with the mainnet contracts because they do not allow delegations to be eligible for rewards in the round they are created.
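The dilution in this example can be checked with a short Python sketch. It assumes the two existing delegations hold 50 LPT each (the comment only says they are equal), and `split_reward` is a hypothetical helper that divides a fixed reward pro rata.

```python
from fractions import Fraction


def split_reward(reward, stakes):
    """Split a fixed reward pro rata across delegations."""
    total = sum(stakes.values())
    return {name: Fraction(reward) * stake / total for name, stake in stakes.items()}


# Assumption: the two existing delegations hold 50 LPT each.
at_round_start = split_reward(100, {"a": 50, "b": 50})
after_new_delegation = split_reward(100, {"a": 50, "b": 50, "c": 50})
```

Each existing delegation drops from 50 LPT to 100/3 (about 33.3) LPT once the third delegation shares the locked-in reward.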

Calculate reward amount based on stake at time of reward call

If the reward amount for an orchestrator is calculated based on stake at the time of a reward call, then the example mentioned above would no longer be possible even if delegations are eligible for rewards in the round they are created. This is because the delegation would increase the stake of the orchestrator and thus the reward amount that the orchestrator mints. So, instead of the new delegation taking rewards from existing delegations, it actually expands the available rewards first and then earns a share of the larger amount.

This could be implemented by moving the logic for calculating the inflation rate and the mintable tokens into the reward call - this logic currently lives in round initialization. When the Minter is invoked to create the reward it could execute this logic:

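contract Minter {
    // Sketch only: setInflation(), inflationRate and livepeerToken() are assumed to be defined elsewhere in the Minter
    function createReward(uint256 _stake, uint256 _totalStake) external returns (uint256) {
        // Update the inflation rate at reward time instead of at round initialization
        setInflation();

        // Tokens mintable at the current rate (if inflationRate is stored as a scaled percentage, divide by its precision)
        uint256 mintable = inflationRate * livepeerToken().totalSupply();
        // The orchestrator's portion is proportional to its stake at the time of the reward call
        uint256 rewardAmount = mintable * _stake / _totalStake;

        return rewardAmount;
    }
}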

The side effect of this change would be that inflation rate and mintable tokens would not be fixed for a round since any staking activity in the round could change both of these values.
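To compare the two options numerically, here is a small Python model. It is a sketch with invented numbers, and it holds the mintable amount fixed for simplicity even though, as noted above, staking activity could change it. The function names are hypothetical; only the proportion mintable * stake / totalStake comes from the createReward sketch.

```python
from fractions import Fraction


def orchestrator_reward(mintable, stake, total_stake):
    """Same proportion as the createReward sketch: mintable * stake / totalStake."""
    return Fraction(mintable) * stake / total_stake


def existing_delegators_reward(mintable, old_stake, new_stake, other_stake, at_reward_time):
    """Reward left for the delegations that existed at the start of the round.

    at_reward_time=False locks the orchestrator's reward in at round start;
    at_reward_time=True recomputes it with the new delegation included.
    """
    if at_reward_time:
        minted = orchestrator_reward(mintable, old_stake + new_stake,
                                     old_stake + new_stake + other_stake)
    else:
        minted = orchestrator_reward(mintable, old_stake, old_stake + other_stake)
    # The minted amount is shared pro rata between old and new delegations.
    return minted * old_stake / (old_stake + new_stake)


# Hypothetical numbers: 1000 LPT mintable, an orchestrator with 100 LPT,
# 900 LPT staked elsewhere, and a new 50 LPT delegation created mid-round.
locked_in = existing_delegators_reward(1000, 100, 50, 900, at_reward_time=False)
recomputed = existing_delegators_reward(1000, 100, 50, 900, at_reward_time=True)
```

With the reward locked in at round start, the existing delegations fall from 100 LPT to about 66.7 LPT. With the reward computed at the time of the reward call they receive about 95.2 LPT, so in this model the remaining dilution is small and is spread across the whole network rather than landing only on this orchestrator's delegators.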

kautukkundan commented 3 years ago

I thought of another approach for calculating and distributing rewards. This approach changes the mechanism by which rewards are called. Two new variables are introduced in each delegation pool, prevRoundRewards and claimedPrevRoundRewards, plus orchestratorCutForTheRound (which can be read from the staking manager).

[⛔️ After writing this I realized that I completely missed the need to track whether each individual delegator has claimed its reward. Still keeping this comment here for reference.]

step 1 - Instead of each orchestrator calling reward individually, rewards for all the orchestrators will be called at the same time when initializeRound is called [1] (end of round; see this comment: https://github.com/livepeer/protocol/issues/463#issuecomment-922581787).

step 2 - Instead of calculating cuts for the orchestrator and delegators individually, the total value of the reward is simply added to the respective pool, prevRoundRewards is overwritten with that value and claimedPrevRoundRewards is set to 0 [2]. Before the overwrite, the unclaimed rewards from the previous round are calculated as prevRoundRewards - claimedPrevRoundRewards, which can be used as per [3].

step 3 - Each delegator and orchestrator can individually call a claimReward function, which uses the current shares, prevRoundRewards and orchestratorCutForTheRound to calculate the respective rewards [4]:

delegator rewards = prevRoundRewards * (1 - orchestratorCutForTheRound) * delegator's share
orchestrator rewards = prevRoundRewards * orchestratorCutForTheRound * orchestrator's share

Then the delegator's and orchestrator's stakes are updated.

step 4 - For each claim, update claimedPrevRoundRewards.

notes -

  1. Since the number of orchestrators is fixed (100), calling reward for all the orchestrators will be a constant-time function
  2. Due to the overwrite there won't be unbounded growth of storage
  3. The unclaimed rewards can be used for a variety of purposes, e.g. they can be given to the orchestrator or used for governance. A part of them can even be used to reward the account which calls initializeRound, as it now has a greater gas usage.
  4. This also solves the problem where delegators are dependent upon orchestrators for calling reward: they no longer lose the reward if the orchestrator doesn't call reward. Also, if the user does not call reward, they lose the reward, which is similar to the current approach.

additional notes -

  1. The first step does increase gas usage as the rewards are calculated for each orchestrator but for individual users the gas will be much lower
  2. This solves the problem of instant eligibility, dilution before reward call and also dilution after reward call

kautukkundan commented 3 years ago

Possible fix to the problem in my previous comment: add a lastClaimedRound variable to the delegation struct and check it before calling claimReward. https://github.com/livepeer/protocol/blob/next/contracts/bonding/Delegations.sol#L28-L31
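The claim flow from the previous comment, combined with the lastClaimedRound check, might look roughly like the following Python model. It is a sketch under assumptions: the class and method names are hypothetical, the share is passed in directly rather than derived from stored shares, and only the delegator side of step 3 is modelled.

```python
from fractions import Fraction


class DelegationPool:
    """Toy model of the claim flow (names beyond those in the comments are hypothetical)."""

    def __init__(self, orchestrator_cut):
        self.orchestrator_cut = Fraction(orchestrator_cut)  # orchestratorCutForTheRound
        self.prev_round_rewards = Fraction(0)               # prevRoundRewards
        self.claimed_prev_round_rewards = Fraction(0)       # claimedPrevRoundRewards
        self.prev_round = 0                                 # round the rewards belong to
        self.last_claimed_round = {}                        # delegator -> lastClaimedRound

    def initialize_round(self, new_round, reward):
        # Step 2: compute what went unclaimed, then overwrite instead of appending.
        unclaimed = self.prev_round_rewards - self.claimed_prev_round_rewards
        self.prev_round_rewards = Fraction(reward)
        self.claimed_prev_round_rewards = Fraction(0)
        self.prev_round = new_round
        return unclaimed

    def claim_reward(self, delegator, share):
        # Guard from the follow-up comment: at most one claim per delegator per round.
        if self.last_claimed_round.get(delegator) == self.prev_round:
            raise ValueError("reward already claimed for this round")
        # Step 3: delegator rewards = prevRoundRewards * (1 - cut) * share.
        amount = self.prev_round_rewards * (1 - self.orchestrator_cut) * Fraction(share)
        # Step 4: track how much of the round's reward has been claimed.
        self.claimed_prev_round_rewards += amount
        self.last_claimed_round[delegator] = self.prev_round
        return amount


pool = DelegationPool(Fraction(1, 10))  # 10% orchestrator cut
pool.initialize_round(1, 100)
first_claim = pool.claim_reward("alice", Fraction(1, 2))
try:
    pool.claim_reward("alice", Fraction(1, 2))
    double_claim_blocked = False
except ValueError:
    double_claim_blocked = True
unclaimed = pool.initialize_round(2, 100)
```

Here alice claims 45 LPT once, a second claim in the same round is rejected, and the 55 LPT left unclaimed is reported when the next round is initialized. The per-delegator entry is overwritten each round, so storage per delegator stays constant in this model.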

yondonfu commented 3 years ago

The latest accounting design that should address this issue and #464 can be found in this ObservableHQ notebook https://observablehq.com/@yondonfu/shares-per-round-earnings-accumulators-based-delegation-p. There may be some additional simplifications possible. The design does use per round state tracking similar to LIP-36, but it does include a few improvements from a code complexity & gas usage POV. A follow up writeup to come.

@kautukkundan Moved the conversation for https://github.com/livepeer/protocol/issues/463#issuecomment-927912360 to https://github.com/livepeer/protocol/issues/482