liquity / dev

Liquity monorepo containing the contracts, SDK and Dev UI frontend.

Pending oracle decisions #201

Closed bingen closed 3 years ago

bingen commented 3 years ago

It seems clear that we'll go with Chainlink, but some related issues remain open:

| Use case        | Only Chainlink | Chainlink + fallback | Median | Min - max |
|-----------------|----------------|----------------------|--------|-----------|
| Redemptions     |                |                      |        | ?         |
| Everything else |                | ?                    |        |           |

cvalkan commented 3 years ago

We may want to add this to the list: https://github.com/Keydonix/uniswap-oracle/. It gives access to historical Uniswap TWAPs without our having to track them ourselves.

cvalkan commented 3 years ago

It turns out that the NEST oracle is charging a fee on each request:

Screenshot 2020-12-27 at 14 18 54

Also, the contract address of the oracle is subject to change:

Screenshot 2020-12-27 at 14 21 02

See the CoFiX project for more information on using the NEST protocol: https://github.com/Computable-Finance/CoFiX/blob/a7228ef8d17fc69228df27f340338a404a4a0208/docs/how_to_integrate_cofix.md

I'm not sure if it's really suitable for our purposes if the fee can change too.

cvalkan commented 3 years ago

| Oracle | Pros | Cons |
|--------|------|------|
| Uniswap TWAP (snapshotting upon every operation) | Straightforward. Used by other projects and suggested by Robert Leshner | Price outdated if Liquity is not used frequently. Extra gas for every operation. Liquity becomes dependent on USDC/Tether/DAI. |
| Uniswap TWAP (Keydonix) | Convenient and fast | Uses unaudited low-level code. Liquity becomes dependent on USDC/Tether/DAI. |
| Compound Open Oracle | Decentralized and permissionless in theory. Uses Uniswap prices as bounds (sanity check) | Update frequency relatively slow and unclear |
| Coinbase signed price feed | Very fast | Centralized and prone to manipulation |
| Tellor | Simple to use. No dependency on other stablecoins | Griefing attack issues. Only 5 price feeds in one update transaction. |
| NEST | Fast. Quite popular in China. May be more secure than Uniswap? | Contract calls are not free, but may cost 0.01 ETH per call. The pricing may change and is not clearly documented. Liquity becomes dependent on USDC/Tether/DAI. |

RickGriff commented 3 years ago

In terms of Chainlink failure criteria, some loose bounds could be:

Ways Chainlink could fail:

cvalkan commented 3 years ago

Fallback conditions

The fallback conditions are the conditions that need to be fulfilled in order to fall back to another oracle solution.

To minimize our dependency on the fallback oracle, we should prevent potential fallback oracle failures from affecting the system as long as the main oracle (i.e. Chainlink) is working properly. Otherwise, the combination of the two could be less secure than just using Chainlink. This means we have to determine the fallback conditions without taking the output of the fallback oracle into account, which rules out any logic that checks for a deviation between the main and the fallback oracle.

Obvious failures (invalid timestamp, reported price negative)

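```solidity
// Reject a zero timestamp or one that lies in the future.
require(timeStamp > 0 && timeStamp <= block.timestamp, "PriceFeed: price timestamp from aggregator is 0, or in future");
// Reject a negative price answer.
require(priceAnswer >= 0, "PriceFeed: price answer from aggregator is negative");
```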

No update for x hours

Chainlink has changed its update logic multiple times: from updating the price at regular intervals of 5, 10 or 30 minutes to updating it upon every 0.5% price change, plus a heartbeat period of 3 hours (which used to be 1 hour in the beginning?).
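The formal checks and the "no update for x hours" condition can be modelled off-chain in a few lines. The following is only a Python sketch, not the on-chain code: `OracleResponse`, `main_oracle_failed` and the 3-hour value of `MAX_STALENESS_SECONDS` are hypothetical names and numbers chosen for illustration, since the thread leaves x open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OracleResponse:
    price: int      # reported price in the feed's smallest unit
    timestamp: int  # Unix time of the last update, in seconds


# Hypothetical stand-in for the "x hours" left open in the discussion.
MAX_STALENESS_SECONDS = 3 * 60 * 60


def is_formally_valid(resp: OracleResponse, now: int) -> bool:
    """Mirror the two require() checks: 0 < timestamp <= now and price >= 0."""
    return 0 < resp.timestamp <= now and resp.price >= 0


def is_stale(resp: OracleResponse, now: int,
             max_age: int = MAX_STALENESS_SECONDS) -> bool:
    """True if the last update is older than the allowed maximum age."""
    return now - resp.timestamp > max_age


def main_oracle_failed(resp: OracleResponse, now: int,
                       max_age: int = MAX_STALENESS_SECONDS) -> bool:
    """Fallback condition that depends only on the main oracle's own output."""
    return not is_formally_valid(resp, now) or is_stale(resp, now, max_age)
```

Note that none of these conditions reads the fallback oracle, in line with the requirement stated above.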

Past outages / failures

On Black Thursday, March 12, 2020, the oracle had an outage during which the price was updated only once in a period of almost 6 hours, despite heavy price fluctuations.

For the raw data, see this Google Sheet.

Screenshot 2021-01-18 at 13 40 28

There was another outage on July 17, 2020, for no apparent reason:

Screenshot 2021-01-18 at 13 39 55

The 14+ hour delay on April 9, 2020, seems to be an artefact of data scraping rather than an outage.

Screenshot 2021-01-18 at 13 42 01

The following chart depicts the update frequency from March 1, 2020 to January 14, 2021.

Screenshot 2021-01-18 at 12 56 03

That chart hides quite a lot of information, as the following chart shows:

Screenshot 2021-01-18 at 13 33 54

For some reason, Chainlink had a much higher average update frequency in the first 14 days of 2021 than ever before, despite the fact that the heartbeat period has been decreased multiple times.

Relative price change

The following chart shows the relative oracle price change over the course of a year:

Screenshot 2021-01-14 at 11 50 10 (1)

This chart shows the relative oracle price change for the same period as the charts on the update frequency above:

Screenshot 2021-01-18 at 12 55 32

On Aug 2, 2020, there was a sharp >10% drop in the price of Ether according to Coinmarketcap and CoinGecko:

Screenshot 2021-01-18 at 11 41 17

The price drop is reflected (somewhat amplified) in the Chainlink oracle price:

Screenshot 2021-01-18 at 11 40 45

Conclusion

Assuming that Chainlink's accuracy/sensitivity and its actual update frequency won't decrease over time, we could use the following thresholds as fallback conditions:

Choice of our fallback oracle

Compound Open Oracle

https://github.com/compound-finance/open-oracle/tree/master/contracts https://blog.openzeppelin.com/compound-open-price-feed-uniswap-integration-audit/

Pros:

Cons:

cvalkan commented 3 years ago

Sanity check on the fallback oracle

As @bingen mentioned, it probably makes sense to also do a sanity check on the price reported by the fallback oracle before using it. Besides formal checks such as non-negativity (price not negative) and recency (last updated < x hours ago), we could also do a proper value check. Since we don't know the previous price of the fallback oracle (assuming it doesn't provide access to historic data), we could only do this check against the main oracle price, though. Here the logic could be that we simply use the price feed that is closer to the last price stored in PriceFeed.sol. So, if the fallback price feed is even farther off, we would use the Chainlink price.
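The "closer to the last stored price" rule is small enough to write down directly. This is a Python sketch of that comparison only; the function name `pick_price` and the tie-breaking in favour of Chainlink are assumptions, not something the thread specifies.

```python
def pick_price(last_stored: int, chainlink: int, fallback: int) -> int:
    """Return whichever candidate is closer to the last stored price.

    Ties go to the Chainlink price (an assumption made for this sketch).
    """
    if abs(fallback - last_stored) < abs(chainlink - last_stored):
        return fallback
    return chainlink
```

For example, with a last stored price of 1000, a Chainlink price of 700 and a fallback price of 950, the fallback price is used; if the fallback reported 400 instead, the Chainlink price would win.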

Returning to the main oracle

Another question related to the fallback conditions is how the system can get back to its normal state and continue using Chainlink's ETH:USD price. While this is straightforward for formal failure cases like bad timestamps, negative prices and outages of x hours (we can simply continue when the new data is available and formally correct), it is not so simple for the failure case of an unrealistic price deviation (say >15%).

Here, we can't simply go back when the deviation between the last Chainlink price and the current Chainlink price is <15% again, since the actual market price can move out of this range during the phase where Chainlink is misreporting the price. We thus need to take the fallback oracle's price into account when deciding to switch back to the main oracle, else we risk never being able to return.

The question then is whether we should use the same threshold for returning or a different one. Using the same threshold would be easy to implement if we simply store the latest price in PriceFeed.sol (regardless of whether it's provided by Chainlink or the fallback oracle), so the contract would always first check the new Chainlink price against the last stored price before even checking the fallback oracle. However, there may be arguments for using a smaller threshold for returning.

Tellor tipping contract

If we end up using Tellor as our fallback oracle, we could add a public function lastUpdateFromChainlink() to PriceFeed.sol, which basically returns the timestamp of the last price update that was based on the Chainlink price feed (rather than on a price coming from Tellor).

This would allow us to build a tipping contract anytime later, even after deployment of the core protocol. Everybody could send TRB tokens to the tipping contract and get their TRB back as long as they haven't been spent on tips. Otherwise, the share would be proportionally reduced (using our compounding stake formula).

The contract would have a public function sendTip(), which could be called once every 5 minutes (or whatever update frequency we want to ensure). The function would first call lastUpdateFromChainlink() and only send the tip if more than y hours have elapsed since the last Chainlink update. If that's the case, it would send z% of the contract's total TRB to the Tellor oracle as a tip.

The fact that the TRB stakers/providers can get their TRB back (as long as they haven't been used for tips) should mitigate the tragedy of the commons issue.
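To make the accounting concrete, here is a Python model of such a tipping pool. It is a sketch under several assumptions: the class and method names are hypothetical, the y-hour and z% parameters are passed in by the caller, and plain share-based accounting is used in place of the compounding stake formula mentioned above (both reduce every depositor's claim proportionally when a tip is paid).

```python
from fractions import Fraction


class TippingPool:
    """Off-chain model of a TRB tipping pool with proportional shares."""

    def __init__(self, tip_fraction: Fraction, min_interval: int,
                 chainlink_max_age: int):
        if not 0 < tip_fraction < 1:
            raise ValueError("tip_fraction must be strictly between 0 and 1")
        self.tip_fraction = tip_fraction            # the "z%" per tip
        self.min_interval = min_interval            # e.g. 5 minutes, in seconds
        self.chainlink_max_age = chainlink_max_age  # the "y hours", in seconds
        self.balance = Fraction(0)                  # TRB currently held
        self.total_shares = Fraction(0)
        self.shares = {}
        self.last_tip_at = None

    def deposit(self, who: str, amount) -> None:
        amount = Fraction(amount)
        if amount <= 0:
            raise ValueError("deposit must be positive")
        # Mint shares at the current TRB-per-share rate.
        if self.total_shares == 0:
            minted = amount
        else:
            minted = amount * self.total_shares / self.balance
        self.shares[who] = self.shares.get(who, Fraction(0)) + minted
        self.total_shares += minted
        self.balance += amount

    def withdraw(self, who: str) -> Fraction:
        """Pay out the depositor's pro-rata share of the unspent TRB."""
        held = self.shares.pop(who, Fraction(0))
        if held == 0:
            return Fraction(0)
        payout = self.balance * held / self.total_shares
        self.balance -= payout
        self.total_shares -= held
        return payout

    def send_tip(self, now: int, last_update_from_chainlink: int) -> Fraction:
        """Return the TRB amount tipped (0 if the conditions are not met)."""
        if self.last_tip_at is not None and now - self.last_tip_at < self.min_interval:
            return Fraction(0)
        if now - last_update_from_chainlink <= self.chainlink_max_age:
            return Fraction(0)
        tip = self.balance * self.tip_fraction
        self.balance -= tip
        self.last_tip_at = now
        return tip
```

With deposits of 100 and 300 TRB and a 10% tip fraction, one tip spends 40 TRB, after which the depositors can withdraw 90 and 270 TRB respectively.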

cvalkan commented 3 years ago

Giving the Chainlink proxy multisig admin rights over PriceFeed.sol

Given that Liquity will be relying on Chainlink anyway, it could make sense to give the Chainlink proxy multisig the right to upgrade / replace the oracle address used in PriceFeed.sol.

Counter arguments:

cvalkan commented 3 years ago

Here's a possible algorithm for the fallback mechanism:

Screenshot 2021-01-20 at 14 20 26

If both the Chainlink oracle and the fallback oracle price are suspect (either due to a large time lag or an excessive deviation), the algorithm uses the average of both prices based on the logic 💩 + 💩 = 🍀.

An alternative would be to use the last valid price price_old or to take the (weighted) average of the old price and the new prices. There's not really a good solution for this worst case, unless we could fall back to a third oracle.

This algorithm can also be used in conjunction with a Tellor tipping contract, if we expose timestamp_Chainlink, so that the tipping contract could enable tipping when block.timestamp - timestamp_Chainlink > threshold_tipping, with threshold_tipping being smaller than threshold_time.
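Since the pseudocode itself only survives as a screenshot, the following Python sketch is a reconstruction from the surrounding prose rather than the original algorithm. It assumes a price is "suspect" when it is older than `THRESHOLD_TIME` or deviates from `price_old` by more than `THRESHOLD_DEVIATION_PCT`; the 3-hour value is hypothetical, and 15% is the example figure used earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    price: int      # reported price
    timestamp: int  # Unix time of the report, in seconds


THRESHOLD_TIME = 3 * 60 * 60   # hypothetical maximum time lag, in seconds
THRESHOLD_DEVIATION_PCT = 15   # example deviation threshold from the thread


def is_suspect(quote: Quote, price_old: int, now: int) -> bool:
    """A quote is suspect if it is too old or too far from the last price."""
    too_old = now - quote.timestamp > THRESHOLD_TIME
    too_far = abs(quote.price - price_old) * 100 > THRESHOLD_DEVIATION_PCT * price_old
    return too_old or too_far


def next_price(chainlink: Quote, fallback: Quote, price_old: int, now: int) -> int:
    """Prefer Chainlink, then the fallback; average them if both are suspect."""
    if not is_suspect(chainlink, price_old, now):
        return chainlink.price
    if not is_suspect(fallback, price_old, now):
        return fallback.price
    return (chainlink.price + fallback.price) // 2
```

The fallback quote is only consulted once the Chainlink quote has been judged suspect on its own, so a misbehaving fallback cannot affect the system while Chainlink is healthy.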

RickGriff commented 3 years ago

That pseudocode logic largely makes sense to me.

The first piece I'm mulling over is how to return to the main oracle when the original switch to fallback was due to price deviation.

Switching back to main oracle

The only decent solution I can see is to switch back when the difference between the current main price and current fallback price is below a certain threshold. This assumes that the fallback oracle is reporting accurately (at least from the moment it passed the sanity check). That's the best we can do, given we suspect the main oracle failed at some point in the past. In terms of threshold, we could use a small one - the difference between main and fallback should be less than a few percentage points when both are reporting accurately. Perhaps we use a 5% threshold.

IMO we shouldn't compare old and new prices to determine whether to switch back to the main oracle (unless I'm missing something). Rather we should compare main and fallback current prices, and switch back when they are close enough. That means that implementation-wise, when we're using the fallback oracle, we also need to keep getting the main oracle price (~10k gas cost for Chainlink, IIRC)

For extra caution we could perhaps have some "probation" period, where we make sure that a few consecutive main and fallback price pairs have <5% difference, before returning to the main oracle.
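A probation period of this kind amounts to counting consecutive "close enough" observations. The Python sketch below assumes the difference is measured relative to the fallback price and that "a few" means three consecutive pairs; both choices, like the class name `SwitchBackMonitor`, are hypothetical.

```python
class SwitchBackMonitor:
    """Track consecutive main/fallback price pairs that agree closely."""

    def __init__(self, max_diff_pct: int = 5, required_streak: int = 3):
        self.max_diff_pct = max_diff_pct        # the suggested 5% threshold
        self.required_streak = required_streak  # hypothetical probation length
        self.streak = 0

    def observe(self, main_price: int, fallback_price: int) -> bool:
        """Record one price pair; return True once switching back is allowed."""
        diff = abs(main_price - fallback_price)
        close = diff * 100 < self.max_diff_pct * fallback_price
        # Any pair outside the threshold resets the probation period.
        self.streak = self.streak + 1 if close else 0
        return self.streak >= self.required_streak
```

Each call corresponds to one price fetch while the system is on the fallback oracle, which is where the extra Chainlink read mentioned above comes in.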

Failed sanity check scenario

Thinking about the case where the main price deviates >15% from previous, and the fallback doesn't pass the sanity check: I see the argument for taking the average of main & fallback price initially, though it leaves us in a difficult situation from there onwards. We suspect the main price is inaccurate, but the fallback seems even worse! We now can't really trust either.

It seems we must assume that the likeliest root cause for this is a genuine flash crash / major price discrepancies at source (e.g. exchanges), and that it's not the case that both oracles are really broken.

So perhaps we use the average of main and fallback, until the difference between them returns to <5%, then we switch back to the main oracle.

If the reason for the failed sanity check is that both oracles really are broken, there's little we can do: either neither gets fixed, both get fixed (and synchronize), or only one gets fixed. An average is the best we can do, unless they synchronize. We should still switch primarily to one if the other formally fails though (bad timestamp, long outage, or price <= 0).

RickGriff commented 3 years ago

Tellor tipping

Regarding Tellor tipping, why would we need the latest Chainlink price timestamp?

Assuming we've fallen back to using Tellor, we can just get the latest price from Tellor, but if it's out-of-date (as per its own timestamp) we want it to update ASAP.

As you say, we could use some community contract. IMO sendTip() should succeed only if 1) our main system has switched to using Tellor as fallback and 2) Tellor's latest ETH:USD timestamp is sufficiently out of date. The tricky part is how much to send for tips - it will really depend on the fee market at the time, and there are no guarantees on the size of the TRB balance of our contract. Setting an arbitrary z% upfront may be enough to update the Tellor price, or not, depending on the market conditions and contract TRB balance.

In the worst case we can just tip the ETH:USD price directly ourselves, since that's a public function and we have no exclusive control over when the price updates. It could be expensive to sustain that though, depending on the Tellor tip market. I'd assume ETH:USD will be very regularly updated without our help, but there are no a priori guarantees on that.

cvalkan commented 3 years ago

> IMO we shouldn't compare old and new prices to determine whether to switch back to the main oracle (unless I'm missing something). Rather we should compare main and fallback current prices, and switch back when they are close enough.

My original idea was to compare main and fallback current prices, but that would lead to extra bookkeeping and gas costs (as you mention). We can certainly do that as it improves the logic. I also like the probation period idea if it doesn't lead to too much extra complexity. If we go down this route, we could maybe also measure the moving average difference between the two oracles and apply an even lower threshold of say 2% for switching back.

> It seems we must assume that the likeliest root cause for this is a genuine flash crash / major price discrepancies at source (e.g. exchanges), and that it's not the case that both oracles are really broken.

I think it also depends on whether the two oracles deviate from the current price in the same direction or in different directions. If the deviations have the same sign, we could take the average and switch back when the difference drops below the threshold. But there's also an argument for simply taking the Chainlink price (at least if the deviation between the two oracles is not too high), assuming that the root cause is a genuine flash crash.

If the deviations have different signs, we're screwed. Given that we can't really trust either of the two price feeds, it could be better to not update the price at all (and keep the current price).
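The sign-based rule for the case where both prices are suspect can be sketched as follows. This Python model implements the averaging variant for same-sign deviations (not the alternative of simply taking the Chainlink price); the function name is hypothetical.

```python
def resolve_when_both_suspect(price_old: int, chainlink: int, fallback: int) -> int:
    """Decide the price when both oracles deviate strongly from price_old."""
    deviation_main = chainlink - price_old
    deviation_fallback = fallback - price_old
    # A positive product means both oracles moved in the same direction.
    if deviation_main * deviation_fallback > 0:
        return (chainlink + fallback) // 2
    # Opposite directions: trust neither feed and keep the current price.
    return price_old
```

For instance, with a current price of 1000, reports of 700 and 600 give 650, whereas reports of 700 and 1300 leave the price at 1000.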

> Regarding Tellor tipping, why would we need the latest Chainlink price timestamp?

The idea is that we can allow tipping even before reaching the timeout for the Chainlink oracle (threshold_tipping < threshold_time) to make sure there's a recent Tellor price in case we need to fall back to it. Of course, this comes at the expense of using the contract's funds pessimistically.

> The tricky part is how much to send for tips - it will really depend on the fee market at the time, and there's no guarantees on the size of the TRB balance of our contract.

I agree, there's no good answer to that. We don't know what amount to tip in advance. It's also questionable how much people would be willing to deposit to the contract.

> In the worst case we can just tip the ETH:USD price directly ourselves, since that's a public function and we have no exclusive control over when the price updates. It could be expensive to sustain that though, depending on the Tellor tip market. I'd assume ETH:USD will be very regularly updated without our help, but there are no a priori guarantees on that.

We don't really know how much people would be willing to deposit given that there's no clear direct benefit in return. One could try to enhance the incentive for early adopters by taking a cut from every new depositor and redistributing it to the existing depositors. In any case, we (or any of our investors) could refill the contract with more TRB in case of an emergency.

RickGriff commented 3 years ago

Chainlink multi-sig admin control of PriceFeed.sol

I think your assessment is pretty sound - there's a benefit in that particular scenario where Chainlink want to shut down their system but also assist protocols who rely on them. In that case it would be to our benefit that they can point PriceFeed.sol somewhere else. That's one specific scenario though.

In addition to your points, one could argue that it's not so well aligned with our core values of immutability and decentralization. Chainlink already catches criticism from decentralization purists. From our comms perspective, it could seem like we're handing too much control over to Chainlink - I could envision a "Chainlink controls Liquity" negative message, which is really counter to our ethos.

cvalkan commented 3 years ago

> In addition to your points, one could argue that it's not so well aligned with our core values of immutability and decentralization. Chainlink already catches criticism from decentralization purists. From our comms perspective, it could seem like we're handing too much control over to Chainlink - I could envision a "Chainlink controls Liquity" negative message, which is really counter to our ethos.

That's a good point. It's really questionable whether we should go down this route. Another point is that if Chainlink were to shut down their system while aiming to assist the protocols that rely on them, they would probably announce the shutdown well in advance. In that case, we could potentially create a clone of our system with a new oracle and let people migrate.