pendulum-chain / pendulum-squids

The subsquid squids we use for Pendulum/Amplitude/Foucoco.
GNU General Public License v3.0

Research how reliable past diadata is #42

Closed ebma closed 2 weeks ago

ebma commented 5 months ago

Related to https://github.com/pendulum-chain/tasks/issues/192.

Please refer to the Notion page for a detailed explanation and discussion.

Not intended to be merged, ever. This PR is just for illustrative purposes so that the research results can be retraced.

How to run

npm install
sqd build
sqd down # make sure database is empty so we process from scratch
sqd up
sqd process:foucoco # analyse the prices on Foucoco

Once the squid has finished fetching all price events, and every X blocks, a new .json file (defined here) will be generated in the root directory.

NOTE: if the squid is stopped, the deviations result file will not persist. The squid must sync from the beginning up to the highest block in a single run.

Check the Notion page, section "Results on Amplitude", for the latest file with the collected data to avoid re-running the squid.

Run deviation_analysis.py for the final plot and analytics. An Anaconda installation of Python already includes the required dependencies (pandas, Matplotlib).
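The script itself and the layout of the generated .json file are not reproduced in this description. As a rough illustration of the kind of computation involved, here is a minimal sketch that assumes each record pairs an on-chain price with a reference price. The function names and the sample values are hypothetical and not taken from deviation_analysis.py or from the research data; the sketch uses only the standard library rather than pandas.

```python
from statistics import mean


def relative_deviation(observed: float, reference: float) -> float:
    """Absolute difference between the two prices, relative to the reference."""
    if reference == 0:
        raise ValueError("reference price must be non-zero")
    return abs(observed - reference) / reference


def summarize(pairs):
    """Return mean and max relative deviation for (observed, reference) pairs."""
    deviations = [relative_deviation(o, r) for o, r in pairs]
    return {"mean": mean(deviations), "max": max(deviations)}


# Hypothetical sample values, for illustration only.
sample = [(1.00, 1.00), (1.02, 1.00), (0.97, 1.00)]
summary = summarize(sample)
print(summary)
```

Relative rather than absolute deviation is used here so that assets with very different price levels can be compared on the same scale.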

What's analyzed

The processing of the squid was changed. The idea is the following:

The price data that the offchain worker provides to the chain is exposed by the batching server, which in turn receives the data from one of many data sources.

Results

Find context and results in this Notion page.

ebma commented 5 months ago

@pendulum-chain/devs please have a look at the Notion page linked in the description. It contains first results and some evaluation of them. Depending on your feedback, I can change or extend the analysis logic if needed.

TorstenStueber commented 4 months ago

@ebma I left a few comments on the Notion page.

I did not execute this branch locally. Can you quickly explain how it works when you run it? Does it ingest all blocks from the beginning, and where does the data come from? From the RPC nodes, or are all the blocks mirrored on some Subsquid server and queried from there?

ebma commented 4 months ago

Thanks for the comments. I updated the description.

gianfra-t commented 3 months ago

@TorstenStueber @ebma I ran the analysis again with a slightly different approach (version 2 in the Notion page); the results are also there. There may be a mistake in the analysis or the assumptions, but I see some very large deviations from the Kraken data, particularly for KSM and XLM. That makes some sense given their higher volatility compared to the FIAT assets, but the deviations still should not be that high.

I could not get data for the remaining FIAT currencies.

TorstenStueber commented 2 months ago

There may be a mistake with the analysis or the assumptions, but I see some very large deviations with the Kraken data particularly for KSM and XLM, which makes some sense given the higher volatility compared to the FIAT assets, but still should not be that high.

@gianfra-t I actually find the XLM and EUR-USD deviations you found quite okay. The AUD-USD one, on the other hand, is quite bad. Do we know more about this? Is it the fiat AUD-USD rate, or is it some kind of stablecoin price traded on Kraken? There is a good chance that Kraken is off here instead of DIA.

gianfra-t commented 2 months ago

I assume these are FIAT prices, since token prices would specify the name or contract of the token. Since DIA prices are aggregated from a collection of scrapers/sources, maybe we should relax our assumption that Kraken data is the ground truth and treat it more as a sanity check.
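One possible reading of "sanity check", sketched in Python under assumptions: instead of scoring every difference from Kraken as an error, only flag points where the two sources disagree by more than some tolerance. The function name and the 5% tolerance are hypothetical and were not agreed on anywhere in this thread.

```python
def check_against_reference(dia_price: float, reference_price: float,
                            tolerance: float = 0.05) -> dict:
    """Flag a DIA price only if it strays beyond `tolerance` from the reference.

    The reference (e.g. Kraken) is not treated as ground truth: a flag means
    "the sources disagree, look closer", not "DIA is wrong".
    """
    if reference_price <= 0:
        raise ValueError("reference price must be positive")
    deviation = abs(dia_price - reference_price) / reference_price
    return {"deviation": deviation, "suspicious": deviation > tolerance}


# Hypothetical values, for illustration only.
print(check_against_reference(1.02, 1.00))
print(check_against_reference(1.20, 1.00))
```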

ebma commented 2 weeks ago

Closing this as our research is done and this was never intended to get merged.