cowprotocol / dune-bridge

Other
4 stars 3 forks source link

[Part 1] Paginated Sync of Order Rewards in Dune #33

Closed bh2smith closed 1 year ago

bh2smith commented 1 year ago

This PR implements what is needed to push a user generated view with all orderbook reward data to dune in a single table. The amount of data is already (after just a few weeks) ~80k records with a size of ~20 Mb.

Dune (legacy) API Returns 413 Response (Payload Too large) at what appears to be 1Mb.

After careful consideration, it seems the only possible approach is to do something paginated! In fact this actually works quite well. We partition the record set into chunks of size 3000 (approximately 0.73Mb) and push them in pages as follows:

CREATE OR REPLACE VIEW cow_rewards_{ENV}_page_0...
CREATE OR REPLACE VIEW cow_rewards_{ENV}_page_1...
...
CREATE OR REPLACE VIEW cow_rewards_{ENV}_page_N... (currently N = 27 after only a couple of weeks)

followed by building another view as the union of these tables:

CREATE OR REPLACE VIEW cow_rewards_{ENV} as (
    select 0 as page, * from cow_rewards_{ENV}_page_0
    union
    select 1 as page, * from cow_rewards_{ENV}_page_1
    union 
    ...
    select N as page, * from cow_rewards_{ENV}_page_N
)

Note that page numbers are included for the purpose of a follow up PR that optimizes this process by performing a checksum comparison to see which pages actually need to be updated. In most cases, we will only ever have to update the last page, and sometimes append a new one.

What this means for other services ⚠️

I was actually surprised that we are able to push app data so I checked out its file size We will not be able to do this for raw app data much longer either. Currently the size of that query is 628 Kb. This means we are at ~62% of our capacity for this solution. This means we will eventually have to migrate to a paginated solution for this as well. Luckily this actually works very well and we could potentially start generating more unique app hashes.

Follow Up

In a follow up PR we will address the issue of optimizing this sync by evaluating a checksum of the data to determine if older pages require update or if only the new data can be appended.

Test Plan

PoC Query - demonstrating that the table is available. Check the size of the test table on Dune (containing 80k records) with

select count(*) from dune_user_generated_cow_order_rewards_test

you can also see all these pages created:

Screenshot 2022-10-30 at 7 33 09 PM

Try it yourself with

python -m dune_api_scripts.update.order_rewards

You will need to set

  1. Dune Credentials
  2. Orderbook Credentials from both databases.

Furthermore

There is already a nice Dune Query built on top of this data that shows the per batch rewards

harisang commented 1 year ago

@harisang how important is it that solvers have this dashboard soon and that it is in dune? Maybe we could build in one month a better technical solution with own infrastructure. I am really just thinking out loud. Not sure what is best.

Only one solver brought it up once. It's definitely nice to have but I wouldn't say it's very high priority. So, I would suggest that we go with the "best" solution, even if that takes a bit longer to realize. Solvers can always do their own accounting in the meantime.

bh2smith commented 1 year ago

Moving this PR over to solver-rewards project (since it doesn't really belong here and doesn't require the persistent storage).

https://github.com/cowprotocol/solver-rewards/pull/123