Closed xdzurman closed 2 years ago
This is almost certainly related to https://github.com/input-output-hk/cardano-db-sync/issues/696 .
db-sync
version 9.0.*
and 10.0.*
are known to have issues with rewards, and these are unlikely to be fixed in time for 11.0.0
.
This was actually caused by faulty catalyst rewards, where in epoch 285, two catalyst rewards showed up for all participants, but the first of the two did not make it through to node and cli, but it appeared in db-sync, which threw balances off. Samuel Leathers can tell you more if interested.
I think I have a fix for #696 (still under test). @xdzurman do you have an specific address with a reward amount for epoch 285 that I can test?
two catalyst rewards showed up for all participants, but the first of the two did not make it through to node and cli, but it appeared in db-sync,
The node
uses the same code as db-sync
, so how could a reward payment make through to db-sync
but not the node
?
I think I have a fix for #696 (still under test). @xdzurman do you have an specific address with a reward amount for epoch 285 that I can test?
Let's say stake1u9schrkl2ed8daympxqxd38z72kx00y69ky8udd7g4v53kcp5kkyg. The cardano-cli reward amount was
cardano-cli query stake-address-info --mainnet --address stake1u9schrkl2ed8daympxqxd38z72kx00y69ky8udd7g4v53kcp5kkyg
[
{
"address": "stake1u9schrkl2ed8daympxqxd38z72kx00y69ky8udd7g4v53kcp5kkyg",
"rewardAccountBalance": 10750235,
"delegation": "pool1j6490a90qefvsqxdurf99z9ld50rtthaz9ms6c8xh78s52veuq2"
}
]
But if you calculate the rewards from db-sync, the outstanding reward balance is 10793839. The difference is 43604, which corresponds to the first of the two catalyst rewards for epoch 285.
two catalyst rewards showed up for all participants, but the first of the two did not make it through to node and cli, but it appeared in db-sync,
The
node
uses the same code asdb-sync
, so how could a reward payment make through todb-sync
but not thenode
?
Quoting Samuel Leathers:
okay, so @jared.corduan had a late night debugging and we see what the issue is. There's actually multiple, most are my fault, one was a misunderstanding in how MIR's worked. To start off, first mistake I made was copy of wrong rewards distributed. My general QA for testing the json file is I check number of lines in it, and then post it to the channel for others to review. No one caught that the total rewards amount was wrong. In the future, I have a script now that sums the rewards and outputs a total value being distributed. Second was I executed the script with wrong pot. This was because of a MIR we did from reserves to account for a bug in rewards calculation prior to allegra. I used the same script, and didn't switch it back from reserves -> treasury. Third is where things get really hairy. I made the false assumption multiple MIR's would accumulate, but in fact they replace (but only prior to alonzo, they accumulate post-alonzo). @jared.corduan figured this out last night comparing the output of his stake address tool to rewards total balance. So essentially, everyone's been paid the second MIR but not the first. I'll be formalizing this into a proper issue, but lots of lessons learned on this one. If we'd like to have a call to discuss it, we can bring @jared.corduan in to explain technically what's happening if this isn't enough detail.
@xdzurman Unfortunately, your cardano-cli query
returns an rewardAccountBalance
which of course changes over time.
db-sync
on the other hand stores credits to the reward account in the reward
table and debits in the withdrawal
table.
On current master (after merging of the reward validation) I get:
select reward.* from reward inner join stake_address on reward.addr_id = stake_address.id
where stake_address.view = 'stake1u9schrkl2ed8daympxqxd38z72kx00y69ky8udd7g4v53kcp5kkyg'
order by earned_epoch ;
id | addr_id | type | amount | pool_id | earned_epoch | spendable_epoch
----------+---------+----------+---------+---------+--------------+-----------------
13040354 | 1895674 | member | 353659 | 7699 | 273 | 275
13651004 | 1895674 | member | 238593 | 7699 | 274 | 276
14271526 | 1895674 | member | 940700 | 7699 | 275 | 277
14900823 | 1895674 | member | 1352813 | 7699 | 276 | 278
15547990 | 1895674 | treasury | 39108 | | 277 | 279
15547989 | 1895674 | member | 493762 | 7699 | 277 | 279
16210475 | 1895674 | member | 1371629 | 7699 | 278 | 280
16867777 | 1895674 | member | 1236043 | 7699 | 279 | 281
17533273 | 1895674 | member | 1004487 | 7699 | 280 | 282
18204762 | 1895674 | member | 1112734 | 7699 | 281 | 283
18883060 | 1895674 | member | 895610 | 7699 | 282 | 284
19567131 | 1895674 | member | 851842 | 7699 | 283 | 285
20268768 | 1895674 | member | 845867 | 7699 | 284 | 286
20268769 | 1895674 | reserves | 13388 | | 284 | 286
20986850 | 1895674 | member | 973891 | 7699 | 285 | 287
21694133 | 1895674 | member | 225910 | 7699 | 286 | 288
23036583 | 1895674 | member | 518683 | 7699 | 287 | 289
(17 rows)
and
select withdrawal.* from withdrawal inner join stake_address on withdrawal.addr_id = stake_address.id
where stake_address.view = 'stake1u9schrkl2ed8daympxqxd38z72kx00y69ky8udd7g4v53kcp5kkyg';
id | addr_id | amount | tx_id | redeemer_id
----+---------+--------+-------+-------------
(0 rows)
Does that seem correct?
Yes, this seems correct. Until now, we queried ITN and catalyst rewards from reserve
and treasury
tables, but this solves it. If you check rewards for the same address from the reserve
table, you can see two entries:
cardano=# select * from reserve where addr_id =1610222;
id | addr_id | cert_index | amount | tx_id
-------+---------+------------+--------+----------
54814 | 1610222 | 0 | 19405 | 10923521
86899 | 1610222 | 0 | 5958 | 10991467
one of which should not be there. That was the whole issue.
I think I have a fix for #696 (still under test). @xdzurman do you have an specific address with a reward amount for epoch 285 that I can test?
Also, can you check rewards for stake1ux025zswj96df8esu78wqdhvfc56yg9yzztwhj4hgkp0rqswd7rl7
? In db sync 11, all rewards from before epoch 257 are ignored.
Rewards from db sync 9.0.0:
id | addr_id | amount | epoch_no | pool_id | block_id
----------+---------+--------+----------+---------+----------
1523654 | 303094 | 6724 | 237 | 454 | 5150301
1620412 | 303094 | 3403 | 238 | 4040 | 5171559
1721179 | 303094 | 3492 | 239 | 4040 | 5192987
1825341 | 303094 | 2644 | 240 | 4040 | 5214386
1936927 | 303094 | 8057 | 241 | 4040 | 5235804
2055509 | 303094 | 7383 | 242 | 4040 | 5257390
2183876 | 303094 | 6234 | 243 | 4040 | 5278881
2321976 | 303094 | 7963 | 244 | 4040 | 5300366
2468914 | 303094 | 4472 | 245 | 4040 | 5321693
2619851 | 303094 | 5521 | 246 | 4040 | 5343201
2783288 | 303094 | 7723 | 247 | 9788 | 5364557
2960581 | 303094 | 5974 | 248 | 9788 | 5385933
3159241 | 303094 | 6625 | 249 | 9788 | 5406924
3378432 | 303094 | 4698 | 250 | 9788 | 5428248
3621517 | 303094 | 7799 | 251 | 9788 | 5449333
3885633 | 303094 | 20 | 252 | 454 | 5470255
4165967 | 303094 | 26 | 253 | 6678 | 5490968
4461010 | 303094 | 32 | 254 | 6678 | 5512319
5814262 | 303094 | 3618 | 257 | 7717 | 5576454
6570265 | 303094 | 1866 | 259 | 454 | 5619113
6966660 | 303094 | 1264 | 260 | 454 | 5640553
7376790 | 303094 | 853 | 261 | 454 | 5662037
7798795 | 303094 | 1477 | 262 | 454 | 5683430
8232488 | 303094 | 781 | 263 | 454 | 5704839
8677097 | 303094 | 1033 | 264 | 8956 | 5726575
14997015 | 303094 | 1521 | 271 | 454 | 5878959
9594796 | 303094 | 1151 | 265 | 8956 | 5748284
10066694 | 303094 | 948 | 266 | 8956 | 5770278
10557469 | 303094 | 627 | 267 | 454 | 5791997
15586281 | 303094 | 9 | 272 | 454 | 5900696
11602292 | 303094 | 706 | 268 | 454 | 5813820
12139677 | 303094 | 719 | 269 | 454 | 5835493
13257765 | 303094 | 615 | 270 | 454 | 5857193
16795160 | 303094 | 11 | 273 | 454 | 5922535
17406737 | 303094 | 10 | 274 | 454 | 5944326
18655144 | 303094 | 12 | 275 | 454 | 5965898
19919270 | 303094 | 10 | 276 | 454 | 5987504
20559745 | 303094 | 13 | 277 | 454 | 6009190
21210116 | 303094 | 12 | 278 | 454 | 6030813
21869264 | 303094 | 8 | 279 | 454 | 6052318
22534301 | 303094 | 10 | 280 | 454 | 6074054
23206785 | 303094 | 9 | 281 | 454 | 6095819
23885506 | 303094 | 12 | 282 | 454 | 6117580
24568320 | 303094 | 8 | 283 | 454 | 6139413
25259671 | 303094 | 8 | 284 | 454 | 6160945
25958140 | 303094 | 6 | 285 | 454 | 6182770
26667651 | 303094 | 13 | 286 | 454 | 6204160
28118640 | 303094 | 10 | 287 | 454 | 6225656
Rewards from db sync 11.0.0
id | addr_id | type | amount | pool_id | earned_epoch | spendable_epoch
----------+---------+--------+--------+---------+--------------+-----------------
5512286 | 312446 | member | 3618 | 7717 | 257 | 259
6269795 | 312446 | member | 1866 | 454 | 259 | 261
6666441 | 312446 | member | 1264 | 454 | 260 | 262
7075874 | 312446 | member | 853 | 454 | 261 | 263
7498675 | 312446 | member | 1477 | 454 | 262 | 264
7932392 | 312446 | member | 781 | 454 | 263 | 265
8377630 | 312446 | member | 1033 | 8956 | 264 | 266
8835356 | 312446 | member | 1151 | 8956 | 265 | 267
9307180 | 312446 | member | 948 | 8956 | 266 | 268
9798989 | 312446 | member | 627 | 454 | 267 | 269
10316273 | 312446 | member | 706 | 454 | 268 | 270
10853716 | 312446 | member | 719 | 454 | 269 | 271
11409671 | 312446 | member | 615 | 454 | 270 | 272
11984714 | 312446 | member | 1521 | 454 | 271 | 273
12575182 | 312446 | member | 9 | 454 | 272 | 274
13177526 | 312446 | member | 11 | 454 | 273 | 275
13790404 | 312446 | member | 10 | 454 | 274 | 276
14412845 | 312446 | member | 12 | 454 | 275 | 277
15043802 | 312446 | member | 10 | 454 | 276 | 278
15698747 | 312446 | member | 13 | 454 | 277 | 279
16357527 | 312446 | member | 12 | 454 | 278 | 280
17016870 | 312446 | member | 8 | 454 | 279 | 281
17683530 | 312446 | member | 10 | 454 | 280 | 282
18356209 | 312446 | member | 9 | 454 | 281 | 283
19035558 | 312446 | member | 12 | 454 | 282 | 284
19719657 | 312446 | member | 8 | 454 | 283 | 285
20430683 | 312446 | member | 8 | 454 | 284 | 286
21142437 | 312446 | member | 6 | 454 | 285 | 287
21851842 | 312446 | member | 13 | 454 | 286 | 288
22575761 | 312446 | member | 10 | 454 | 287 | 289
Could the fix be patch released until the hard fork?
@xdzurman how do you know db-sync-9
was correct? I strongly suspect it was not.
That is the version everyone rolled back to after 10.0.0 introduced rewards issues. It has been deployed to production for a long time and stood the test of time (except a minor tweak regarding epoch 285's catalyst rewards).
Running master against testnet, there are couple of these as well and one needs to disable panicAbort
to be able to fully sync from scratch
[db-sync-node:Warning:77] [2021-09-08 21:35:31.63 UTC] validateEpochRewards: rewards earned in epoch 84 expected total of 158768.12877 ADA but got 158468.12877 ADA
[db-sync-node:Warning:77] [2021-09-08 21:48:27.42 UTC] validateEpochRewards: rewards earned in epoch 107 expected total of 523029.532638 ADA but got 523129.532638 ADA
@xdzurman
That is the version everyone rolled back to after 10.0.0 introduced rewards issues. It has been deployed to production for a long time and stood the test of time (except a minor tweak regarding epoch 285's catalyst rewards).
It may well have stood the test of time, but it may still be wrong.
Running master against testnet, there are couple of these as well and one needs to disable
panicAbort
to be able to fully sync from scratch
If the panicAbout
is triggered, that is an indication of a bug. That is why it is a panicAbort
and not just a info level log message.
Running master against testnet, there are couple of these as well and one needs to disable
panicAbort
to be able to fully sync from scratchIf the
panicAbout
is triggered, that is an indication of a bug. That is why it is apanicAbort
and not just a info level log message.
Created #805
That is the version everyone rolled back to after 10.0.0 introduced rewards issues. It has been deployed to production for a long time and stood the test of time (except a minor tweak regarding epoch 285's catalyst rewards).
Yes, they may have rolled back to 9.0.0
thinking it was correct, but I am 100% sure it was not and has never been correct.
@xdzurman
select hash_raw from stake_address where view = 'stake1ux025zswj96df8esu78wqdhvfc56yg9yzztwhj4hgkp0rqswd7rl7' ;
hash_raw
--------------------------------------------------------------
\xe19eaa0a0e9174d49f30e78ee036ec4e29a220a41096ebcab74582f182
(1 row)
and then using the stake history tool in the node
respository:
./stake-credential-history -c configuration/cardano/mainnet-config.json -s state-node-mainnet/node.socket \
-t e19eaa0a0e9174d49f30e78ee036ec4e29a220a41096ebcab74582f182 --check-point-size 100000
shows that yes indeed, rewards for this address should start in epoch 238.
I think this is basically a duplicate of #769 .
That is the version everyone rolled back to after 10.0.0 introduced rewards issues. It has been deployed to production for a long time and stood the test of time (except a minor tweak regarding epoch 285's catalyst rewards).
Yes, they may have rolled back to
9.0.0
thinking it was correct, but I am 100% sure it was not and has never been correct.
I do not think that is the case. I mean, at least it was much more "correct" (like a fraction of a percent might have been incorrect). Otherwise we would see a lot more failed reward withdrawal transactions in Yoroi, as we would construct the transaction with incorrect reward in place and get rejected by the ledger. With 9.0.0, I really hardly recall any of it.
And as far as I know this has been the case with Adalite too, as they have upgraded from 9.0.0.
Same here. Users of adalite were always able to withdraw their funds while we were using 9.0.0 (we compute the reward amounts as a sum of all types of rewards minus withdrawals). This always worked, except the ones that retired their pools and got the deposit back, for which I created an issue more that 9 months ago. This issue (catalyst's epoch 285 double rewards) was the second major issue, but apart from that version 9.0.0 was correct (based on the Adalite's historical user logs and reported issues).
However, when switching to db synx 10.0.0, a huge portion of users had wrong rewards. These issues are still unresolved in 11.0.0, which forces us to show warning banners to our users that their reward history is most probably wrong. We also had to resort to ignore the reward sum from db sync and query it via ogmios (bloating our backends just for this one piece of information) so that the users can at least withdraw their funds. Ever since having to move from db sync 9.0.0, we are impatiently pleading for fixes to the major issues, so that we don't have to compromise the truthfulness of the data we serve to our users.
Hopefully this is fixed on https://github.com/input-output-hk/cardano-db-sync/releases/tag/13.0.0. Feel free to reopen if the issue persists. Most of the work for rewards is not done in leger through ledger events.
There is a mismatch between cardano-db-sync catalyst rewards and cardano-cli (ledger). The db sync states 2 catalyst rewards which appeared in epoch 285. However, the sooner one of them is invisible to cardano-cli. A few examples:
https://cardanoscan.io/stakekey/05c6aa84de1fd5d5d15fa2cc0f30a5c804f52fb19725525aab54a839?tab=instantaneousrewards Here you can see the two catalyst rewards which appeared 5 days ago. AFAIK cardanoscan simply sums up the rewards minus withdrawals. The reward sum with the two catalyst rewards for this address is 2649355 Lovelaces. According to cardano-cli,
"rewardAccountBalance": 2629950
. The difference is 19405 Lovelace, which corresponds to the sooner catalyst reward of the two.Same here - https://cardanoscan.io/stakekey/stake1uyzuqsnynkxkqs848y8mgylywwezrgcrmd6xmqspx4kfk7gqtlzjv?tab=instantaneousrewards Reward sum: 103826843 Cardano-cli reward: 103654676 Diff: 172167 = equal to the sooner of the two rewards
The same goes for basically any address you look up. Db sync version: 9.0.0 This caused Adalite users not being to withdraw rewards until we started filtering out the first of epoch 285's rewards.