IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0
290 stars 160 forks source link

The drep_hash table incorrectly adds drep_hashes that are not registered #1851

Closed perturbing closed 1 month ago

perturbing commented 1 month ago

OS Your OS: Linux

Versions The db-sync version (eg cardano-db-sync --version):13.5.0.2 PostgreSQL version:

Build/Install Method The method you use to build or install cardano-db-sync: not relevant

Run method The method you used to run cardano-db-sync (eg Nix/Docker/systemd/none): Not relevant

Additional context The drep_hash table also adds hashes which are not witnessed by a registration script. That is, if I delegate to a drep that does not exist, db-sync adds it to the table.

Problem Report

dbsync-mainnet=> select * from drep_hash where raw = '\xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa';
 id  |                            raw                             |                           view                           | has_script 
-----+------------------------------------------------------------+----------------------------------------------------------+------------
 125 | \xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa | drep1uk4nwfsm843kqr2kv4jg0yms46srr633pzc2d0vvaav25ncwj8g | f
   5 | \xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa | drep1uk4nwfsm843kqr2kv4jg0yms46srr633pzc2d0vvaav25ncwj8g | t
(2 rows)

Note that this should not exist, as it is impossible to find a script and key that have the same hash. I know that the above hash originates from a script, and that this was registered in this tx (see the script hash).

I do know that due to improper communication by tooling to accommodate script based dreps, some delegations to this keyed based representation of this drep exist onchain, though they do nothing as that keyed drep representation can never register.

sgillespie commented 1 month ago

From memory, I believe we get these from the ledger state, and we don't currently have a good way to validate them.

perturbing commented 1 month ago

Can't we do the following?

SELECT EXISTS (
    SELECT 1
    FROM script
    WHERE hash = '\xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef54aa'
) AS has_script;

That is, for the drep hash that claims to have has_script, check that it indeed is witnessed once via the script table (as a registration for that drep impies the plutus v3 script was witnessed at least once).

perturbing commented 1 month ago

And indeed, this originates from the ledger, see https://github.com/IntersectMBO/cardano-ledger/issues/4598

perturbing commented 1 month ago

Even better, you can do this

dbsync-mainnet=> select * from drep_hash where raw = '\xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa';
 id  |                            raw                             |                           view                           | has_script 
-----+------------------------------------------------------------+----------------------------------------------------------+------------
 125 | \xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa | drep1uk4nwfsm843kqr2kv4jg0yms46srr633pzc2d0vvaav25ncwj8g | f
   5 | \xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa | drep1uk4nwfsm843kqr2kv4jg0yms46srr633pzc2d0vvaav25ncwj8g | t
(2 rows)

dbsync-mainnet=> SELECT *                                                                                         
FROM drep_registration
where drep_hash_id=125;
 id | tx_id | cert_index | deposit | drep_hash_id | voting_anchor_id 
----+-------+------------+---------+--------------+------------------
(0 rows)

dbsync-mainnet=> SELECT * 
FROM drep_registration
where drep_hash_id=5;
 id  |  tx_id   | cert_index |  deposit  | drep_hash_id | voting_anchor_id 
-----+----------+------------+-----------+--------------+------------------
   4 | 95887510 |          0 | 500000000 |            5 |                 
  65 | 95905708 |          0 |           |            5 |               29
 223 | 95966152 |          0 |           |            5 |              108
 259 | 95987535 |          0 |           |            5 |              121
(4 rows)

To establish if the script/key even registered. @Crypto2099

rdlrt commented 1 month ago

Given how messy the CIP on governance has been, it's best to not have bech32 saved in dbsync at all (as CIP standard is not static forever, has changed already and will likely change again in future as part of CIP129) - bech32 encoding/decoding should be done by tool providers based on CIP they follow - to lookup based on hex and has_script in DB.

You can also use in-database postgres extensions if beeded - see example here

It does not make sense to rely on predominantly append-only sync system to keep up with [in]decisions.

perturbing commented 1 month ago

to lookup based on hex and has_script in DB

True, but only that the has_script might say true and false for the same hash. We want to prevent people from delegating to keys/scripts for which exist no witness. I think the above query achieves that, can you confirm?

rdlrt commented 1 month ago

to lookup based on hex and has_script in DB

True, but only that the has_script might say true and false for the same hash. We want to prevent people from delegating to keys/scripts for which exist no witness. I think the above query achieves that, can you confirm?

CIP-129 addresses this by having additional bit resemble presence/absence of script (you lookup on BOTH - has_script + hex representation). While if not using CIP-29, you should be using prefix drep_script instead. We work with both the CIPs as shown in link above - without using view field. Any given drep bech32 (0005 or 129 based) already contains whether script is involved.

perturbing commented 1 month ago

CIP-129 addresses this by having additional bit resemble presence/absence of script. While if not using CIP-29, you should be using prefix drep_script instead.

I am not talking about the CIPs, I just care that db-sync logs dreps that are not registered, and that wallets might use this info to falsely view a drep hash as being a key or script. A good wallet should check that given the user's request to delegate to a drep, that this is registered, as a delegation to a non-existing drep is allowed by the ledger (this is/was not the case).

rdlrt commented 1 month ago

Presence of an entry in drep_hash table has nothing to do with whether a drep is registered or not, that's a transient state that'd be decided by drep_registration and drep_deregistration (this is exactly same for pool or stake_account too). To look for an actively registered drep, you'd have to not only make a query that there are no entries (as above) , but also ensure the drep hasnt deregistered later on.

The entire confusion currently only comes due to presence of dbsync saving (albeit faulty) bech32 strings currently, which it doesnt need to.

perturbing commented 1 month ago

Presence of drep in drep_hash table has nothing to do with whether a drep is registered or not

We are saying the same thing now, but that is not the point I want to make. The presence of non-registered dreps in that table is a mistake as per this issue. Now I would like to align on how to how prevent improper usage of this info, by indeed using drep_registration/deregistration tables. And maybe db-sync should rename that table or make note of this issue somehow so that users do not trust the info in it so blindly, as they are doing now.

And note that @sgillespie said

and we don't currently have a good way to validate them.

To which I say, you have :)

rdlrt commented 1 month ago

And maybe db-sync should rename that table or make note of this issue somehow so that users do not trust the info in it so blindly, as they are doing now.

I dont know who is mixing drep_hash to see it as registered drep, but that's not how it works - not as per documentation , nor as per current standards of naming in dbsync. Changing naming would instead be a deviation making it inconsistent with pool_hash.

The presence of non-registered dreps in that table is a mistake as per this https://github.com/IntersectMBO/cardano-ledger/issues/4598.

When someone delegates to a drep , it's hex and has_script values are part of their transaction - and as such, it should be logged. That's exactly how I'd expect dbsync to work. If/when ledger does a check for registration, and if tx wont allow invalid registrations to be accepted by chain - it would automatically stop adding those entries in this table.

While ledger could/would prevent forming such tx - it is the duty of wallets to check registration status (they can use your query - but even there your query is incomplete as it needs to cater for cases where a drep has multiple registration/deregistration actions - you need to look at current status of registration) - which has nothing to do with drep_hash itself.

and we don't currently have a good way to validate them.

To which I say, you have :)

Again - I dont know the context @sgillespie means - but for drep_hash table entry , if there is a transaction that has a reference, it is an on-chain data, and there should always rightly be an entry in dbsync. The only part that's wrong is storing invalid value in view column (which they dont need to), or atleast have an option to be turned off as per #1835

rdlrt commented 1 month ago

image The documentation reference : here

perturbing commented 1 month ago

That table I missed, thanks for that extra context, people are indeed misusing it then :)

@sgillespie, feel free to close this issue.

Crypto2099 commented 1 month ago

FWIW there's a much easier query to run here :)

select dh.*
from drep_hash dh
inner join drep_registration dr on dr.drep_hash_id = dh.id
where dh.raw = '\xe5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa'
group by dh.id
id raw view has_script
5 e5ab37261b3d63600d566564879370aea031ea3108b0a6bd8cef58aa drep1uk4nwfsm843kqr2kv4jg0yms46srr633pzc2d0vvaav25ncwj8g true

This will at least only show you dreps with a matching registration based solely on the hash. From there it would be up to tolling providers to figure out if the latest status of that dRep was registration or retirement.

kderme commented 1 month ago

Querying drep_distr may be even beter for the current state of dreps

Crypto2099 commented 1 month ago

Querying drep_distr may be even beter for the current state of dreps

This would only be available/populated after the first snapshot that a dReps exists and rely on someone having delegated to the dRep though? Doesn't necessarily indicate whether or not they "exist" on chain as officially registered

kderme commented 1 month ago

This would only be available/populated after the first snapshot that a dReps exists and rely on someone having delegated to the dRep though? Doesn't necessarily indicate whether or not they "exist" on chain as officially registered

That's right. It actually shows that the drep has active voting power.