Consensys / defi-score

DeFi Score: An open framework for evaluating DeFi protocols
https://defiscore.io

Supporting Oracle Manipulation Metric #49

Open jclancy93 opened 3 years ago

jclancy93 commented 3 years ago

Given the recent exploitation of the Compound money markets, community members have raised concerns with Compound receiving a higher score than Aave, despite having more easily manipulatable oracles.

To address this, we want to add a new component to the financial score, which is the cost of oracle manipulation. There are a couple of different ways to measure this oracle risk, depending on the platform:

  1. Compound (order-book-based oracle): what is the cost to move the price by +/-10%?
  2. Chainlink: if a protocol uses Chainlink, what are the underlying sources and the incentives to report prices honestly?
  3. On-chain TWAP oracle such as Uniswap: what is the cost to move the price by +/-10%?

An acceptable solution to this bounty would be a general design for how to measure oracle price manipulation across all of these providers in a fairly consistent way. Additionally, some scratch code would be nice but is not absolutely required.

gitcoinbot commented 3 years ago

This issue now has a funding of 0.3265 USD (0.33 USD @ $1.0/USD) attached to it.

gitcoinbot commented 3 years ago

Work has been started.

These users each claimed they can complete the work by 265 years, 8 months from now. Please review their action plans below:

1) javipus has been approved to start work.

If I understand this correctly, I think you're pointing to two different problems, with significantly different complexities:

  1. Figuring out the cost to change the price of an asset by a certain amount should be fairly easy. It can be read off the constant product formula in Uniswap and I assume a similar approach would work for order books.

  2. Understanding the incentives traders face, and predicting under what conditions it will be in their best interest to manipulate the market or report false prices. I think this amounts to finding an equilibrium strategy in a complicated game, which is far from trivial, although maybe some insight could be gained by analyzing simplified models.

I suggest tackling (1) first by simply looking at the Uniswap spec and re-implementing the relevant parts. For (2) I think you could either try to find analytical solutions to simplified problems or run a more complex simulation to find numerical solutions. Either way, I expect a fair amount of modelling will be needed.

Does this make sense to you?

jclancy93 commented 3 years ago

@javipus let's discuss here since I'm unsure how to respond to your requests on Gitcoin. You look like you have a strong quantitative background, so looking forward to working with you 😄

On your 2nd point, I actually think the solution is simpler than you think. A trader should always be willing to manipulate the market, so long as the profit is greater than the cost of manipulation. UMA has some good stuff on this in their DVM oracle docs:

https://docs.umaproject.org/oracle/econ-architecture

Given a pool's liquidity on Uniswap, you can use x * y = k to figure out what trade size is needed to +/- 10% the price.
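To make the x * y = k calculation concrete, here is a minimal Python sketch. It assumes a Uniswap-v2-style constant-product pool and ignores the swap fee; the function names and the example reserves are illustrative and are not taken from the notebook discussed in this thread.

```python
import math


def trade_to_move_price(reserve_base, reserve_quote, rel_change):
    """Return (delta_base, delta_quote) that moves the pool price by rel_change.

    Price is quoted as reserve_quote / reserve_base. A positive delta means
    the trader deposits that token into the pool, a negative delta means the
    trader receives it. Fees are ignored (assumption).
    """
    if rel_change <= -1:
        raise ValueError("rel_change must be greater than -1")
    # With x * y = k fixed and price p = y / x, scaling p by (1 + d)
    # scales y by sqrt(1 + d) and x by 1 / sqrt(1 + d).
    s = math.sqrt(1 + rel_change)
    delta_base = reserve_base / s - reserve_base
    delta_quote = reserve_quote * s - reserve_quote
    return delta_base, delta_quote


def manipulation_loss(reserve_base, reserve_quote, rel_change):
    """Loss in quote units if the trader's position is valued at the
    pre-trade price, i.e. if arbitrage fully restores the price."""
    delta_base, delta_quote = trade_to_move_price(
        reserve_base, reserve_quote, rel_change
    )
    initial_price = reserve_quote / reserve_base
    return delta_quote + delta_base * initial_price


# Hypothetical pool: 1,000 ETH against 2,000,000 DAI.
d_base, d_quote = trade_to_move_price(1_000.0, 2_000_000.0, 0.10)
print(f"deposit {d_quote:,.0f} DAI, receive {-d_base:,.2f} ETH")
print(f"loss at the old price: {manipulation_loss(1_000.0, 2_000_000.0, 0.10):,.0f} DAI")
```

The first number is the capital needed for the trade; the second is what the attacker actually loses if the price reverts, which is the quantity closer to a "cost of corruption".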

We're aligned that order book models should be pretty similar. Protocols like Aave that use Chainlink are going to be interesting as well, because the Chainlink oracles use a mix of order book and CFMM prices. For cases like these, a volume-weighted cost of corruption probably makes sense.
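One possible reading of a "volume-weighted cost of corruption" is a weighted average of the per-venue costs, with each venue weighted by its trading volume. The thread does not pin down the weighting, so the sketch below is an interpretation, and the venue figures are made up for illustration.

```python
def volume_weighted_cost(sources):
    """sources: iterable of (cost_of_corruption, volume) pairs, one per venue.

    Returns the average cost weighted by volume. This is an assumed
    definition; the thread does not specify how venues are combined.
    """
    sources = list(sources)
    total_volume = sum(volume for _, volume in sources)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(cost * volume for cost, volume in sources) / total_volume


# Hypothetical venues: (cost to move price 10% in USD, daily volume in USD).
venues = [(1_000_000.0, 3.0), (2_000_000.0, 1.0)]
print(volume_weighted_cost(venues))
```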

Let me know if you have any questions or want to suggest alternate approaches!

javipus commented 3 years ago

Thanks for pointing me to UMA. I had read that page before but didn't dig very deep. This time I clicked around a bit more and I found this research paper where they lay out their model in more detail. It seems to me that it's still very much a work in progress, but perhaps more manageable than I had assumed.

Regardless of that, I now believe that your threat model is slightly different, as the attacker would be manipulating the price directly, so in a DVM-like oracle, token holders would be reporting honestly from their point of view and wouldn't need to be bribed. Also, I assume you want to consider oracles that report on-chain price data directly, without any human voters in the loop. Is this so?

If that's the case, I would re-state the problem as:

Calculate the cost an attacker would face if they wanted to move the price of an asset from p to p' for each of the following oracles: Uniswap, Compound's Open Price Feed, and Chainlink. This willfully ignores why they want to do it, i.e. we're only calculating the cost of corruption, not the profit from corruption.

Let me know if you agree with that characterization so we can start discussing deliverables and timeline.

jclancy93 commented 3 years ago

Also, I assume you want to consider oracles that report on-chain price data directly, without any human voters in the loop. Is this so?

Yea that's correct. If I understand correctly, that would be any oracle that pulls directly from Uniswap spot pricing data or any other CFMM. Not many protocols use this type of data as an oracle since it's so easily manipulatable, but I would like to establish this as a kind of minimum baseline.

Calculate the cost an attacker would face if they wanted to move the price of an asset from p to p' for each of the following oracles: Uniswap, Compound's Open Price Feed, and Chainlink. This willfully ignores why they want to do it, i.e. we're only calculating the cost of corruption, not the profit from corruption.

Yes, this characterization is more or less what I was thinking. It ends up looking quite similar to a +/- % depth statistic you would see on an exchange data website.
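For an order book, the same depth statistic can be sketched by walking the ask side until the best remaining ask sits above the target price. This is a simplified illustration: it measures the move relative to the best ask, assumes a static book with no replenishment or hidden orders, and uses a hypothetical list of (price, size) levels.

```python
def cost_to_lift_asks(asks, rel_change):
    """Quote spent and base bought to clear every ask at or below the target.

    asks: list of (price, size) tuples sorted by ascending price.
    The target is best_ask * (1 + rel_change). Assumes a static book.
    """
    if not asks:
        raise ValueError("order book is empty")
    target = asks[0][0] * (1 + rel_change)
    spent = 0.0
    bought = 0.0
    for price, size in asks:
        if price > target:
            break  # remaining levels are already above the target
        spent += price * size
        bought += size
    return spent, bought


# Hypothetical book: clearing the 100 and 105 levels leaves 112 as best ask.
book = [(100.0, 1.0), (105.0, 2.0), (112.0, 3.0)]
print(cost_to_lift_asks(book, 0.10))
```

If the book is exhausted before the target is reached, the function simply returns the cost of clearing everything listed, so a caller would want to check the depth of the data it passes in.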

If you have anything you think would improve the problem statement or the conclusions we could draw from it, feel free to add!

javipus commented 3 years ago

No, I think this is a reasonable framing. Let me know what an MVP would look like in your opinion. Personally, I would start fleshing out these ideas in a jupyter notebook and go from there, but maybe you have something else in mind.

jclancy93 commented 3 years ago

@javipus I think a Jupyter notebook is a great place to start 👍

javipus commented 3 years ago

Here is a first stab. I covered a single Uniswap trade. Some possible next steps:

  1. Doing the same analysis from the point of view of a liquidity provider instead of a trader
  2. Exploring how this interacts with time averaging in a more realistic setting (TWAP)
  3. Trying to estimate how fast/likely one of these trades is to be arbitraged away

Item 1 is straightforward. Item 2 is easy but would require me to dig into the TWAP implementation. Item 3 would need a lot of modelling and is outside the scope of what we discussed, but I wanted to include it anyway.

Let me know what you think!

gitcoinbot commented 3 years ago

@javipus Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

jclancy93 commented 3 years ago

Work on Uniswap looks good. I think modeling out the cost of manipulating the TWAP is also interesting.

If you assumed markets were perfectly efficient, AKA arbitrageurs would drive the price back up to FMV within a single block, then cost of corruption would be single block CoC * # of blocks. In practice that would never happen, but at the very least we can reason that the TWAP oracle should always be safer than a spot oracle.
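The idealized estimate above can be written down in a few lines. The sketch assumes a fee-free constant-product pool, that arbitrageurs fully restore the price every block, and that the attacker therefore pays the full single-block loss in each block of the window. All names are illustrative.

```python
import math


def single_block_cost(reserve_quote, rel_change):
    """Loss (in quote units) from moving a constant-product pool's price by
    rel_change and then being arbitraged back. Fees ignored (assumption)."""
    s = math.sqrt(1 + rel_change)
    return reserve_quote * (s + 1 / s - 2)


def twap_cost_upper_model(reserve_quote, rel_change, n_blocks):
    """Idealized cost of holding the price off by rel_change for n_blocks,
    assuming the price is fully restored every block."""
    return single_block_cost(reserve_quote, rel_change) * n_blocks


# Hypothetical pool with 2,000,000 DAI on the quote side, 10-block window.
print(twap_cost_upper_model(2_000_000.0, 0.10, 10))
```

Because the per-block cost is proportional to the quote reserve, this toy model is linear in both reserves and the number of blocks.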

Curious about your thoughts on the above.

It would also be interesting to run the numbers on some real Uniswap pools (WBTC/ETH, ETH/DAI, ETH/USDC) to get a feel for what this looks like on mainnet.

Don't worry about the Gitcoin bot notification btw, I have disabled it for another 5 days and I'm happy with how the work is progressing here 👍 🙏

jclancy93 commented 3 years ago

Btw, I'm not sure if Compound's Open Oracle system is worth looking into anymore, since they are thinking about moving to Chainlink oracles as well. So let's just focus on Uniswap and Chainlink for now.

javipus commented 3 years ago

Hi! I've added some utility functions to get prices from Uniswap. I will be updating the figures in the notebook today using that data.

If you assumed markets were perfectly efficient, AKA arbitrageurs would drive the price back up to FMV within a single block, then cost of corruption would be single block CoC * # of blocks

I'm not sure what the threat model is in this case. Are you assuming the attacker would want to revert the transaction? If not, why would the cost scale with number of blocks? My intuition is that, if they don't intend to revert any transaction and the initial Uniswap price is correct, the only cost comes from being arbitraged.

BTW, I found this paper earlier today. It addresses all these issues formally in some depth so I'm gonna give it a read.

I will ping you when the new figures are ready.

jclancy93 commented 3 years ago

If you assumed markets were perfectly efficient, AKA arbitrageurs would drive the price back up to FMV within a single block, then cost of corruption would be single block CoC * # of blocks

I actually got this idea from that paper's authors! I think the same threat model still applies in this theoretical example. It's just a way to handle the case of trying to estimate the CoC for an AMM price over multiple blocks (albeit a very imperfect estimation). Here is an excerpt from them:

In fact, it turns out that the cost of manipulating the Uniswap price to any fixed amount scales linearly with the reserves and the number of blocks, which can be expensive in many practical cases, though we note that very small or short-term perturbations to the price are relatively cheap.


My intuition is that, if they don't intend to revert any transaction and the initial Uniswap price is correct, the only cost comes from being arbitraged.

Exactly

You should also check this article out, by the same authors as the paper you linked

I will ping you when the new figures are ready.

Sounds good. Thanks!

javipus commented 3 years ago

You should also check this article out, by the same authors as the paper you linked

Will do!

I got one figure ready comparing the cost of manipulating pools of the form X/DAI for different X. Some thoughts on this:

  1. I had to manually adjust the units returned by the getReserves method of the uniswap pool contracts. Some tokens are expressed in wei but others (WBTC and USDC) aren't. Not sure what to make of this.
  2. I think ideally we want to measure the cost of manipulation in USD across all pools. This means we either need a price oracle involving USD or we approximate it by a stablecoin pegged to it. I went for the latter in the example figure.

Next steps:

  1. Do the same exercise using liquidity provision instead of trading. I expect this to be less capital efficient, but it'd be interesting to quantify by how much and to what extent higher fees would make the attack more attractive.
  2. Continue research on Chainlink. So far I have only skimmed their docs.

jclancy93 commented 3 years ago

I had to manually adjust the units returned by the getReserves method of the uniswap pool contracts. Some tokens are expressed in wei but others (WBTC and USDC) aren't. Not sure what to make of this.

This is because WBTC and USDC have 8 and 6 decimals, respectively. So instead of dividing raw balances by 1e18 (10^18) like with Ether, you would divide their raw balances by 1e8 and 1e6. This decimal variable is available on any ERC20 smart contract. They're almost always 18, but there are a few notable exceptions like USDC and WBTC.
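As a small illustration of the unit conversion, a raw on-chain balance is divided by 10 raised to the token's decimals. The helper name is hypothetical, and `Decimal` is used here only to avoid floating-point rounding on large integers.

```python
from decimal import Decimal


def to_units(raw_balance, decimals):
    """Convert a raw integer token balance into whole-token units."""
    return Decimal(raw_balance) / (Decimal(10) ** decimals)


print(to_units(10**18, 18))     # 1 ETH expressed in wei
print(to_units(1_500_000, 6))   # raw USDC balance, 6 decimals
print(to_units(250_000_000, 8)) # raw WBTC balance, 8 decimals
```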

I think ideally we want to measure the cost of manipulation in USD across all pools. This means we either need a price oracle involving USD or we approximate it by a stablecoin pegged to it. I went for the latter in the example figure.

This makes total sense and next steps sound good 👍

javipus commented 3 years ago

This decimal variable is available on any ERC20 smart contract.

Apparently that's an optional parameter in the ERC20 spec, and the USDC contract does not implement it. However, it can be accessed through the "Read as Proxy" feature on etherscan, but I don't know how that works or how to use it from web3.py. So to sum up, right now I'm trying to do contract.functions.decimals and falling back to a hardcoded number if that doesn't work.
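The try-then-fall-back approach described above might look roughly like this. It assumes `contract` is a web3.py contract object whose ABI may or may not expose `decimals`; the fallback table, the function name, and the broad `except` are illustrative choices, not the notebook's actual code.

```python
# Hardcoded fallbacks for tokens mentioned in this thread (assumption: the
# caller identifies tokens by symbol).
FALLBACK_DECIMALS = {"USDC": 6, "WBTC": 8}


def get_decimals(contract, symbol, default=18):
    """Read decimals from the contract, falling back to a hardcoded value.

    The attribute lookup and the call are both inside the try block, since
    either can fail when the ABI or the deployed code lacks decimals().
    """
    try:
        return contract.functions.decimals().call()
    except Exception:
        # Broad catch is deliberate in this sketch; a real implementation
        # would probably narrow it to the specific web3.py exceptions.
        return FALLBACK_DECIMALS.get(symbol, default)
```

A caller would pass the pool's token contract and symbol, for example `get_decimals(usdc_contract, "USDC")`, where `usdc_contract` is whatever contract object has already been constructed.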

I have implemented the liquidity provision strategy for Uniswap and, as we expected, it's more capital intensive (compare dashed to solid lines in the main plot). I am also experimenting with interactive plots using ipywidgets. They're a very cheap way to make notebook code into a reasonably rich UI. Let me know if you think that's valuable.

I will be moving on to the Chainlink stuff now. We haven't discussed any concrete timeline, so I've been prioritizing other projects. If you want to speed this up just tell me and I can put those other things on the backburner instead. Otherwise, I will stick to being responsive to the Gitcoin bot's kind nudges :)

javipus commented 3 years ago

Hey @jclancy93 just pinging you to make sure you got my last message and to appease the bot. I'll be giving you Chainlink updates later this week.

jclancy93 commented 3 years ago

@javipus yes all good! Sorry about the late response. Appreciate you being flexible with the timelines. Feel free to work on this at your own convenience.

Apparently that's an optional parameter in the ERC20 spec, and the USDC contract does not implement it. However, it can be accessed through the "Read as Proxy" feature on etherscan, but I don't know how that works or how to use it from web3.py. So to sum up, right now I'm trying to do contract.functions.decimals and falling back to a hardcoded number if that doesn't work.

This should be sufficient for the time being 👍

I have implemented the liquidity provision strategy for Uniswap and, as we expected, it's more capital intensive (compare dashed to solid lines in the main plot). I am also experimenting with interactive plots using ipywidgets. They're a very cheap way to make notebook code into a reasonably rich UI. Let me know if you think that's valuable.

I will be moving on to the Chainlink stuff now. We haven't discussed any concrete timeline, so I've been prioritizing other projects. If you want to speed this up just tell me and I can put those other things on the backburner instead. Otherwise, I will stick to being responsive to the Gitcoin bot's kind nudges :)

The notebook is great. Do I need to spin up the notebook locally to be able to play with the interactive charts? I don't see them loading correctly on Github

javipus commented 3 years ago

Yes, you'd need to download it. GitHub only displays the static version I think.