kadena-io / chainweb-data

Data ingestion for Chainweb.
BSD 3-Clause "New" or "Revised" License
14 stars 8 forks source link

Refactor richlist generation #89

Closed emmanueldenloye closed 1 year ago

emmanueldenloye commented 2 years ago

Let's run the actual sqlite commands using the sqlite DSL provided by pact. We want to do this so that a user needn't run chainweb-data from the chainweb-data source directory. Additionally, if possible we can avoid needlessly copying sqlite tables (We'll see!).

jwiegley commented 2 years ago

I love it.

jwiegley commented 2 years ago

LGTM

enobayram commented 2 years ago

Thanks for the changes, here's all that remains to do in this PR as far as I can tell:

jwiegley commented 2 years ago

Yes, doing the comparison with previous behavior should be what we need. I wouldn't make this a separate executable, though; isn't it just a function to be used by chainweb-data?

enobayram commented 2 years ago

Here's how we've decided to test that PR manually:

NOTE: since this test depends on the transfers table, we need to merge #76 first.

emmanueldenloye commented 1 year ago

Here's how we've decided to test that PR manually:

  • Generate richlist r1 at time t1
  • Wait for a few minutes to let some new blocks appear and generate another richlist r2 at time t2
  • Diff r1 and r2 and look for a changed row row
  • Run a query against the transfers table to find some transfer events that explain the difference in row.

NOTE: since this test depends on the transfers table, we need to merge #76 first.

After generating the richlist twice as requested, we produce the following "diff"

[nix-shell:~/chainweb-data]$ diff richlist.csv richlist-old.csv
12c12
< 2,875e4493e19c8721583bfb46f0768f10266ebcca33c4a0e04bc099a7044a90f7,5725820,1323455.53348232
---
> 2,875e4493e19c8721583bfb46f0768f10266ebcca33c4a0e04bc099a7044a90f7,5725562,1326696.05041926
15c15
< 2,4iBIX0hsSprc7sYvUKLLlMd7_1uVEb6eheF33VBv1p0,5725785,1006856.1130231908
---
> 2,4iBIX0hsSprc7sYvUKLLlMd7_1uVEb6eheF33VBv1p0,5725627,1006627.6097206498
23c23
< 0,6d87fd6e5e47185cb421459d2888bddba7a6c0f2c4ae5246d5f38f993818bb89,19370317,454653.9489733537
---
> 0,6d87fd6e5e47185cb421459d2888bddba7a6c0f2c4ae5246d5f38f993818bb89,19369702,454649.79258135374
98c98
< 2,k:75d122a9f7f85d6202a9dfa4da3b348ba0cd8209eb03b988420ff4ad79480388,5725830,145796.82886405
---
> 2,k:75d122a9f7f85d6202a9dfa4da3b348ba0cd8209eb03b988420ff4ad79480388,5724978,145554.87844405

[nix-shell:~/chainweb-data]$ echo "145796.82886405 - 145554.87844405" | bc
241.95042000

In this example, we are examining the change in the balance of the account "k:75d122a9f7f85d6202a9dfa4da3b348ba0cd8209eb03b988420ff4ad79480388". In every row of richlist.csv, we print the 4-tuple (chain-identifier, account-name, txid, balance). We performed a diff of two different richlists formed from different times, and we can see a difference of about 241 kda for the aforementioned account. We print the txid for each row, so that we can track the blockheight range over which the balance change occurred. We use the following query to get blockheights from txids

chainweb-data=> select height from transactions where txid IN (5725830,5724978);                                             
 height
---------
 3246534
 3246769
(2 rows)

From there, we issue the following query to verify the change in balance recorded above

chainweb-data=> select sum(amount) from transfers where height <= 3246769 and height > 3246534 and (from_acct = 'k:75d122a9f7f85d6202a9dfa4da3b348ba0cd8209eb03b988420ff4ad79480388' or to_acct = 'k:75d122a9f7f85d6202a9dfa4da3b348ba0cd8209eb03b988420ff4ad79480388') ;                                                                                                              
    sum
-----------
 241.95042
(1 row)