Closed andreabadesso closed 2 years ago
For me this is approved, I just have a question
To prevent having to do a complete re-sync of the database, which would take some days, we can use the current database to re-calculate the transactions count for every address we believe have incorrect values for the transactions column
Which are those addresses? They should be all addresses that were ever affected by a re-org, or any address that belong to an input/output of a voided transaction/block.
@pedroferreira1
We can extract the list of affected addresses by using the address_tx_history
and the tx_output
tables, I've updated the design with the query we can use to do that
I just ran it on the mainnet database and we have 4 addresses that need to be re-calculated
LGTM :+1:
Summary
We noticed that the number of transactions on the explorer token balances screen was displaying an incorrect amount of total transactions for the hathor token
Upon investigation, the conclusion was that during a re-org, the
transactions
column was being incorrectly re-calculated by using onlytx_outputs
, excluding thetx_inputs
that also had theaddresses
involved.Why?
The explorer service uses the new elasticsearch instance as a backend to fetch token information, including the number of transactions done by each address that owns a given token.
To fetch the information about the total number of addresses and the total number of transactions, it runs the following query:
It calculates the sum of the
transactions
column for all theaddress_balance
rows for the searchedtoken_id
for all balances that have more than 0 tokens.I checked the biggest addresses balances using the
v1a/thin_wallet/address_search?address=
API from the fullnode, using a locally synced fullnode (without the address blacklist) and found big divergences on the number of transactions for the largest miners on the list.Why?
This divergence indicates an error on the mechanism that calculates the
transactions
column of theaddress_balance
table.This column is calculated when a new transaction is received from the fullnode by the wallet-service daemon and then sent to the
onNewTxRequest
lambda on the wallet-serviceThe way we increment this column is by incrementing it on every transaction for every
address_token
row on this method while updating the address balancesI discarted the possibility of an error on this method as we have a good coverage on it and the error does not appear to be consistent on every
address_balance
row, I checked this by creating multiple transactions on a new token and it calculated the sum of transactions for my address correctly.Why?
Next, I investigated if we lost transactions for some reason during the sync, I did this by downloading all the transactions for an address that I knew had problems on the
transactions
column from the fullnode and made a script to check if they all existed on the wallet-service:This is the query I've used to get the transactions from the wallet-service database (the timestamp on the query is the timestamp of the latest transaction from the fullnode):
With this script, I've found a missing transaction. Checking the
transaction
table, I noticed that we had it stored correctlyThe reason for this is that I was using the
tx_output
table to query for the transaction list for a given address (as we don't have a column that indicates all addresses involved in a transaction on thetransaction
table).The problem with this approach is that we don't consider the
tx_inputs
, so if a transaction used an utxo from an address and didn't send it to the same address, we would not calculate it on the transactions count.Why?
I searched the code for every query that changes the
address_balance
table and found two other places that mutates thetransactions
column:handleReorg
andhandleVoided
Whenever a re-org or a voided transaction happens, we re-calculate the balances and the transaction count for every address involved. Using only the
tx_outputs
works fine for balance re-calculation as balances are the sum of the available utxos for a given (address
,token_id
) pair, but it will miss transactions that only have the address on the inputs listSo, finally, the reason why we had addresses with invalid transaction counts is because those addresses have had
re-orgs
or voided transactions in the past and their balances were re-calculated incorrectly.Solution
The proposed solution is implemented on this PR. The idea is to re-calculate the transactions count by doing the inverse of what the
updateAddressTablesWithTx
method is doing, instead of adding transactions to the rows affected by a transaction, we will subtract transactions depending on the number of transactions affected by a re-org.This is done by using the
address_tx_history
table, that is already being updated when a re-org happens with the affected transactions being marked asvoided
.Basically, the idea is to query a count of transactions that need to be subtracted from the
address_balance
transaction column. This is done using this query:This way, whenever a re-org or a voided transaction happens, we can simply subtract the affected transactions (that are already being handled) from the
transactions
row on theaddress_balance
table.More details about the fix impementation can be seen on the PR
Fix the current database
We currently have
address_balance
rows with invalid values for thetransaction
column, to fix those we would need a complete re-orgLong-term solution
The long-term solution would be to do a complete re-sync of the wallet-service database, this way we can be sure that the transactions count for every address is correctly stored
Short-term solution
To prevent having to do a complete re-sync of the database, which would take some days, we can use the current database to re-calculate the transactions count for every address we believe have incorrect values for the
transactions
columnSo, for every address we want to re-calculate, we can use the following query that will sum the number of transactions from the
tx_output
list and the number of transactions from the "synthetic"tx_input
list derived from thespent_by
column of thetx_outputs
:We can extract the list of all addresses that have ever been affected by a re-org or had a voided transaction by using the
address_tx_history
and thetx_output
table, this is the query:Notice: We need to stop the wallet-service sync before running those queries as the balances are currently being updated by new transactions arriving from the network
Reviewers: @pedroferreira1 @r4mmer