Why have you decided to go for a block_range introduction? Could you motivate it a little bit?
What are the advantages over the run-loop proposal?
The problem I am seeing is that appData does not necessarily belong to blocks. The time delay between appData creation and tx mining could be several minutes. This is why we used to pull the newest app_data from the API and start downloading it right away. Happy to have a sync chat...
Thanks @josojo for your feedback and questions. We synced about this offchain and it seemed as if you were content with the approach taken here. A lot was discussed that I can't capture here, so I will just briefly address the questions from above.
> Why have you decided to go for a block_range introduction?

Because this is an immutable index on the content. It is uniquely determined from the time appData first appears in `GPv2_call_settle`. Previously we had also fetched this from the orderbook, but that doesn't serve any purpose since the affiliate data (such as trading volume) associated with appContent (i.e. referral address) is not going to be available until long after the trades have been finalized on chain (Dune V1 has a ~30 minute time delay for prices, Dune V2 more than an hour).
> What are the advantages over the run-loop proposal?

The run-loop index does not persist between cron jobs (and cron jobs don't have a run loop), whereas a block number can be persisted between two distinct executions and is more meaningful than an incremental index.

The end goal here is that the API (Rust) service, the one with the run loop, will no longer be responsible for fetching or indexing app content; instead it will rely on content fetched from the Dune API via a query by `appHash`, combined with caching and expiry logic. Because of this, there will no longer be a "run loop" for the appData sync.
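To make the caching idea concrete, here is a minimal sketch, assuming a Python stand-in purely for illustration (the real service is Rust, and `AppContentCache`, the `fetch` callback, and the one-hour TTL are all hypothetical names, not from this PR):

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical sketch: cache app content keyed by appHash, refreshing an entry
# once it is older than `ttl` seconds. Names and TTL are illustrative only; the
# production (Rust) service may structure this differently.
class AppContentCache:
    def __init__(self, fetch: Callable[[str], dict], ttl: float = 3600.0):
        self._fetch = fetch  # e.g. a call into the Dune API by appHash
        self._ttl = ttl
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, app_hash: str) -> dict:
        now = time.time()
        cached = self._entries.get(app_hash)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        content = self._fetch(app_hash)
        self._entries[app_hash] = (now, content)
        return content
```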
I agree these things could be moved, but I'd like to see them in good shape before spending more time going back and forth updating the dune client. I don't believe either of these two "features" is finalized. Is it alright if we keep them here for now?

So, in essence, the `QueryData` doesn't really belong in the client package just yet, or at least I don't believe we have landed on a finalized version of this particular data structure or its functionality.
`AsyncDuneClient`: cf. here: https://github.com/cowprotocol/dune-client/pull/31#issuecomment-1316045313 and here: https://github.com/cowprotocol/dune-client/issues/38

But it also comes with some assumptions: no reorgs in Dune, Dune uploads the data atomically per block, and Dune provides data for block n only after providing the data for all previous blocks.
I think it's safe to assume some of these things for now as we develop. Hopefully the code is structured in a way that would allow us to come back and fix things if our assumptions turn out not to be true (this can all easily be identified with our Query Monitoring Service).
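As a rough illustration of how a violated assumption could be caught early (the function and parameter names below are mine, not from this PR), a guard along these lines would flag a non-monotonic block range before anything is written:

```python
# Hypothetical guard: if Dune ever reports a latest indexed block below what we
# have already synced (e.g. due to a reorg or a non-atomic upload), fail loudly
# instead of silently writing an inconsistent range.
def validate_block_range(latest_synced_block: int, dune_latest_block: int) -> None:
    if dune_latest_block < latest_synced_block:
        raise RuntimeError(
            f"Dune latest block {dune_latest_block} is behind the last synced "
            f"block {latest_synced_block}; ordering assumptions may be violated."
        )
```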
Perhaps you could boost my questions to Dune and add some of these additional ones to our conversation with them. I will relay them to the folks who are helping with the sync.
Still looking for an approval on this one...
This PR introduces a `DuneFetcher` and two methods for fetching App Data via Dune. The goal here is to fetch and dump the unique app hashes within a block range that have not yet been seen.

The main logic is in `main.py`.
Each time the script is run (say, as a cron job) it will (a rough sketch follows this list):

- Construct a `BlockRange(block_from, block_to)` for which it should query Dune for new app hashes. `block_from` is read from storage as the `latest_synced_block` (seen at the end of this bulleted list) and `block_to` is the latest block indexed by Dune from the table containing app hashes (`GPv2_call_settle`).
- Fetch the unique app hashes appearing in that block range and dump any that have not yet been seen to storage.
- Write `BlockRange.block_to` as the last synced block.
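A minimal sketch of that per-run flow, with placeholder `storage` and `dune` helpers that are not the actual names used in this PR:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockRange:
    """Immutable range of blocks to query for new app hashes."""
    block_from: int
    block_to: int

def sync_app_data(storage, dune) -> None:
    """Hypothetical outline of a single cron-job execution."""
    # 1. block_from comes from storage; block_to is Dune's latest indexed block.
    block_range = BlockRange(
        block_from=storage.load_latest_synced_block(),
        block_to=dune.latest_indexed_block("GPv2_call_settle"),
    )
    # 2. Fetch the app hashes appearing in that range and dump the unseen ones.
    new_hashes = dune.fetch_app_hashes(block_range)
    storage.write_app_hashes(new_hashes, block_range)
    # 3. Record block_to as the last synced block for the next run.
    storage.write_latest_synced_block(block_range.block_to)
```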
Example

After running this once, you will see something like these two files written to storage:

and the sync block will be updated with this:
New app hashes do not come in very often, but when they do, you would expect to see a new `app_hashes` file created containing only the new entries.

Test Plan
You can run it yourself with `python -m pysrc.main`, but I will add a unit test demonstrating how the pieces work together.
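Not the actual test being added here, but a sketch of the kind of round-trip such a unit test could cover, with hypothetical helpers standing in for the real pieces:

```python
import tempfile
import unittest
from pathlib import Path

# Hypothetical helpers mirroring the sync-marker logic; not the real pysrc code.
def write_latest_synced_block(path: Path, block: int) -> None:
    path.write_text(str(block))

def read_latest_synced_block(path: Path, default: int = 0) -> int:
    return int(path.read_text()) if path.exists() else default

class TestSyncMarker(unittest.TestCase):
    def test_marker_persists_between_runs(self):
        with tempfile.TemporaryDirectory() as tmp:
            marker = Path(tmp) / "latest_synced_block.txt"
            # First run: nothing synced yet, so we start from the default.
            self.assertEqual(read_latest_synced_block(marker), 0)
            # Simulate finishing a run that synced up to block 150.
            write_latest_synced_block(marker, 150)
            # A second run resumes from the persisted marker.
            self.assertEqual(read_latest_synced_block(marker), 150)

if __name__ == "__main__":
    unittest.main()
```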