witnet / witnet-rust

Open source Rust implementation of the Witnet decentralized oracle protocol, including full node and wallet backend 👁️🦀
https://docs.witnet.io
GNU General Public License v3.0

Enrich Radon aggregation scripting #2325

Open guidiaz opened 1 year ago

guidiaz commented 1 year ago

Radon scripting at the retrieval phase has proven to be highly effective at crawling and extracting concrete data from both HTTP/GET and HTTP/POST sources. It is this kind of Radon scripting, plus the current reducing mechanism at the aggregation phase, that fundamentally enables the implementation of "basic" price feeds.

However, there are some interesting use cases that just cannot be tackled at the moment:

Currently, at the start of the aggregation phase, an array of values is composed out of every primitive value returned by every source during the retrieval phase, and the following math is then applied over that array (assuming that all of its elements are operable with each other):

In practice, the RADAggregate struct is currently defined at the protobuf level as:

  message RADAggregate {
      repeated RADFilter filters = 1;
      uint32 reducer = 2;
  }
  message RADFilter {
      uint32 op = 1;
      bytes args = 2;
  }
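For context, today's aggregation boils down to running the configured filters over that array of retrieved values and then collapsing whatever survives with a single reducer. A minimal conceptual sketch of that flow, in plain Rust with hypothetical deviation_filter / average_mean helpers (not the actual witnet-rust rad types):

    /// Hypothetical stand-in for a deviation filter: keep only the values
    /// that lie within `sigmas` standard deviations of the mean.
    fn deviation_filter(values: &[f64], sigmas: f64) -> Vec<f64> {
        let n = values.len() as f64;
        let mean = values.iter().sum::<f64>() / n;
        let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
        let std_dev = variance.sqrt();
        values
            .iter()
            .copied()
            .filter(|v| (v - mean).abs() <= sigmas * std_dev)
            .collect()
    }

    /// Hypothetical stand-in for an AverageMean reducer.
    fn average_mean(values: &[f64]) -> f64 {
        values.iter().sum::<f64>() / values.len() as f64
    }

    fn main() {
        // One primitive value per source, as produced by the retrieval phase.
        let retrieved = [12345.67, 12350.10, 13333.67];
        // Aggregation today: filters first, then a single reducer over the survivors.
        let filtered = deviation_filter(&retrieved, 1.5);
        let aggregated = average_mean(&filtered);
        println!("{aggregated}");
    }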

All this said, here comes a possible proposal for enriching the aggregation scripting mechanism:

tmpolaczyk commented 1 year ago

Overall I think this is a good idea. Here are my thoughts after discussing this with @guidiaz.

Prefer RADON to witscript

New RADON operators

The behavior of those new operators is described in the examples below, and some of them are implemented in this branch:

https://github.com/tmpolaczyk/witnet-rust/tree/radon-alter-aggr

Aggregation scripts:

Tally scripts:

Probably not possible to have arbitrary scripts, as that would interfere with the reputation system (slashing of liars).


The following examples try to implement the source and aggregation stages of the example use cases:

Example 1: Solving multiple price feeds within one single data request

Sources: to separate btc/usd from eth/usd sources, add a new operator as the last operator of the source scripts:

12345.67
>>> [AsMapWithKey, "btc/usd"]
{"btc/usd": 12345.67}

Then, the output of each source will look like one of:

{"btc/usd": 12345.67}
{"eth/usd": 1234.67}

And the input of the aggregation stage will be an array of maps, like:

[{"btc/usd": 12345.67}, {"eth/usd": 1234.67}, {"btc/usd": 13333.67}]

The aggregation script can then start with an ArrayOfMapsIntoMapOfArrays operator:

{"btc/usd": [12345.67, 13333.67], "eth/usd": [1234.67]}

And by using the MapAlter operator, each map key can be filtered and reduced independently:

{"btc/usd": 13000.0, "eth/usd": 1234.67 }

In this example we assume that each source returns the price of one asset; however, that's not mandatory. As long as it is possible to transform the response JSON into an object with one key per asset, it should work just as well with the ArrayOfMapsIntoMapOfArrays operator, as if they were separate sources.
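A rough Rust sketch of what this grouping plus per-key reduction could look like, using plain std collections and hypothetical helper names (not the real RADON type system):

    use std::collections::HashMap;

    /// Sketch of the proposed ArrayOfMapsIntoMapOfArrays operator: group the
    /// values of an array of single-key maps under their respective keys.
    fn array_of_maps_into_map_of_arrays(
        sources: Vec<HashMap<String, f64>>,
    ) -> HashMap<String, Vec<f64>> {
        let mut grouped: HashMap<String, Vec<f64>> = HashMap::new();
        for map in sources {
            for (key, value) in map {
                grouped.entry(key).or_default().push(value);
            }
        }
        grouped
    }

    fn main() {
        // Output of the retrieval phase: one AsMapWithKey-tagged map per source.
        let retrieved = vec![
            HashMap::from([("btc/usd".to_string(), 12345.67)]),
            HashMap::from([("eth/usd".to_string(), 1234.67)]),
            HashMap::from([("btc/usd".to_string(), 13333.67)]),
        ];

        // ArrayOfMapsIntoMapOfArrays:
        // {"btc/usd": [12345.67, 13333.67], "eth/usd": [1234.67]}
        let grouped = array_of_maps_into_map_of_arrays(retrieved);

        // Per-key filtering/reduction (as MapAlter would allow), sketched here
        // as a plain mean over each key's values.
        let reduced: HashMap<String, f64> = grouped
            .into_iter()
            .map(|(key, values)| {
                let mean = values.iter().sum::<f64>() / values.len() as f64;
                (key, mean)
            })
            .collect();

        println!("{reduced:?}");
    }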

Example 2: Price feed composition (e.g. btc/usd * usd/eth = btc/eth):

Similar to Example 1, but now the aggregation stage needs an additional operator to convert

{"btc/usd": 16000, "eth/usd": 1600}

into 10.0 by calculating 16000 / 1600.

One way to do it is to convert 1600 into (1600^-1), and multiply both numbers. We can achieve that using the MapAlter operator, combined with FloatPower(-1) to calculate 1/1600. Then, convert the map into an array, and use ArrayReduce(Product) to multiply all the values of the array:

{"btc/usd": 16000, "eth/usd": 1600}
>>> [MapAlter, "eth/usd", [ [ArrayAlter, 1, [ [FloatPower, -1] ]] ]]
{"btc/usd": 16000, "eth/usd": 0.000625}
>>> MapValues
[16000, 0.000625]
>>> [ArrayReduce, Product]
10.0
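The same pipeline expressed as a plain Rust sketch that simply mirrors the MapAlter / FloatPower(-1) / MapValues / ArrayReduce(Product) steps above (hypothetical helpers, not the actual operator implementations):

    use std::collections::HashMap;

    fn main() {
        // Aggregated map: one entry per price feed.
        let mut feeds = HashMap::from([
            ("btc/usd".to_string(), 16000.0_f64),
            ("eth/usd".to_string(), 1600.0_f64),
        ]);

        // MapAlter + FloatPower(-1): replace eth/usd with its reciprocal.
        if let Some(value) = feeds.get_mut("eth/usd") {
            *value = (*value).powi(-1); // 1 / 1600 = 0.000625
        }

        // MapValues + ArrayReduce(Product): multiply every remaining value.
        let btc_eth: f64 = feeds.values().product();
        println!("{btc_eth}"); // 16000 * 0.000625 = 10.0
    }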

Example 3: Price averaging using trade volume data as weight

Instead of making each source return only the price, we now want it to return the volume as well:

[1234.67, 111111111.0]

This array of [price, volume] can be created using the MapAlter, ArrayAlter, MapFilter, and StringMatch operators.

Then the input of the aggregation stage will be an array of arrays, like:

[[ 2.091, 9012233.0 ], [ 1.011, 12345.7 ], [ 2.045, 555323 ], [ 2.12, 1234.56 ]]

To be able to apply filters independently to price and volume, we need a new operator, ArrayFilterBy. We also need a new reducer, WeightedAverage, which will convert this array of [price, volume] into the weighted average, and the total volume:

[[ 2.091, 9012233.0], [2.045, 555323 ]]
>>> [ArrayReduce, WeightedAverage]
[ 2.0883300539, 9567556.0 ]
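A possible Rust sketch of such a WeightedAverage reducer, using hypothetical names and plain arrays instead of RADON types:

    /// Sketch of the proposed WeightedAverage reducer: fold an array of
    /// [price, volume] pairs into [weighted_average_price, total_volume].
    fn weighted_average(pairs: &[[f64; 2]]) -> [f64; 2] {
        let total_volume: f64 = pairs.iter().map(|pair| pair[1]).sum();
        let weighted_sum: f64 = pairs.iter().map(|pair| pair[0] * pair[1]).sum();
        [weighted_sum / total_volume, total_volume]
    }

    fn main() {
        // The filtered [price, volume] pairs from the example above.
        let pairs = [[2.091, 9012233.0], [2.045, 555323.0]];
        let [average, total_volume] = weighted_average(&pairs);
        println!("{average} {total_volume}"); // ~2.08833 and 9567556
    }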

In case the volume data is fetched from a different source than the price data, the sources will need to use the AsMapWithKey operator to tag which exchange they correspond to. For example, the first "exchange1" is the price, while the second "exchange1" is the volume, so the input of the aggregation stage may look like:

[{"exchange1": 2.091}, {"exchange2": 1.011}, {"exchange1": 9012233.0}, {"exchange2": 12345.7}]
>>> ArrayOfMapsIntoMapOfArrays
{"exchange1": [2.091, 9012233.0], "exchange2": [1.011, 12345.7]}
>>> MapValues
[[ 2.091, 9012233.0 ], [ 1.011, 12345.7 ]]
>>> [ArrayReduce, WeightedAverage]
[ 2.0895226, 9024578.7 ]
>>> [ArrayGet, 0]
2.0895226

aesedepece commented 1 year ago

@tmpolaczyk brilliant job there! This really looks like the logical next evolution for RADON, and the materialization of what we always wanted it to be.

From a technical standpoint, I want to see this landing sooner or later. Then, we could slowly rework our current feeds so that they leverage this new construct (in the cases where it's a clear advantage).

From a strategic standpoint, however, I'm a bit divided. I believe that at this point we should optimize for adoption. So while I want this implemented ASAP for the sake of completeness and some theoretical operating cost cuts, I can't see it having a significant impact on adoption and success :thinking:

guidiaz commented 1 year ago

Precisely, the origin of the whole idea was to be able to serve composable price feeds (use case #2), as they are in high demand. Solving composable price feeds at the smart contract level is both expensive and unreliable.