IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0
294 stars 162 forks source link

`only_utxo` preset - discrepancies between the docs and the code #1788

Closed ArturWieczorek closed 3 months ago

ArturWieczorek commented 4 months ago

Docs: https://github.com/IntersectMBO/cardano-db-sync/pull/1644/files#diff-c2f1f64eeb5238c809d70b333d9987646e84982e0b133ffa4e0ca1ff86c78448

This description is a bit confusing: It presents Only UTxO preset config as equivalent to multi_asset: false and then later mentions : "When syncing is completed, it loads the whole UTxO set from the ledger to the tx_out and ma_tx_out tables. "

Only UTxO This is equivalent to setting: "tx_cbor": "disable", "tx_out": { "value": "bootstrap" }, "ledger": "ignore", "shelley": { "enable": false }, "metadata": { "enable": "false" }, "multi_asset": { "enable": "false" }, "plutus": { "enable": false }, "governance": "disable", "offchain_pool_data": "disable" "pool_stat": "disable"

Initially populates only a few tables, like block and tx. It maintains a ledger state but doesn't use any of its data. When syncing is completed, it loads the whole UTxO set from the ledger to the tx_out and ma_tx_out tables. After that db-sync can be restarted with ledger set to "disable" to continue syncing without maintaining the ledger

And here is the state after sync:

sancho_new_test_config_only_utxo=# select * from ma_tx_out;
 id | quantity | tx_out_id | ident 
----+----------+-----------+-------
(0 rows)

sancho_new_test_config_only_utxo=# select * from ma_tx_mint;
 id | quantity | tx_id | ident 
----+----------+-------+-------
(0 rows)

sancho_new_test_config_only_utxo=# select * from tx_in;
 id | tx_in_id | tx_out_id | tx_out_index | redeemer_id 
----+----------+-----------+--------------+-------------
(0 rows)

sancho_new_test_config_only_utxo=# select count(*) from tx_out;
 count 
-------
 71771
(1 row)

So my question is - should ma_tx_out be dropped from description in sentence:

When syncing is completed, it loads the whole UTxO set from the ledger to the tx_out and ma_tx_out tables.

or not. Which behaviour is correct, the one described in current docs or code ?

Additional Information:

0) cardano-db-sync built from here: https://github.com/IntersectMBO/cardano-db-sync/pull/1644 and run with node 9.0.0 rev: 2820a63dc934c6d5b5f450b6c2543b81c6476696

1) Config used:

...
  "RequiresNetworkMagic": "RequiresMagic",

  "insert_options": {
     "preset": "only_utxo"
  },
  "defaultBackends": [
    "KatipBK"
  ],
...

2) cardano-db-sync was fully synced - sanchonet:

[db-sync-node:Info:77] [2024-07-26 08:36:41.60 UTC] Starting epoch 406
[db-sync-node:Info:77] [2024-07-26 08:36:41.60 UTC] Insert Conway Block: epoch 406, slot 35078476, block 1752535, hash a1b60c760dcf2697da28e998c70da02ee37e58cf588682cef8255740a3ce217e
[db-sync-node:Info:77] [2024-07-26 08:36:42.67 UTC] Deleted 0 tx_out
[db-sync-node:Info:77] [2024-07-26 08:36:43.51 UTC] Insert Conway Block: epoch 406, slot 35129574, block 1755000, hash 2149680f367b415e6515609fa3be47d8625c45b1df78614b58ba12da5cf7245f
[db-sync-node:Info:77] [2024-07-26 08:36:44.77 UTC] Persistant SQL Statement Cache size is 37
[db-sync-node:Info:77] [2024-07-26 08:36:44.77 UTC] 
Cache Statistics:
  Stake Addresses: cache sizes: 0 and 0, hits: 0, misses: 0
  Pools: cache size: 0, hit rate: 0%, hits: 0, misses: 1756649
  Datums: cache capacity: 250000, cache size: 0, hits: 0, misses: 0
  Multi Assets: cache capacity: 250000, cache size: 0, hits: 0, misses: 0
  Previous Block: hit rate: 49%, hits: 1756648, misses: 1756649
  TxId: cache size: 11710, cache capacity: 300000, hits: 0, misses: 0
[db-sync-node:Info:77] [2024-07-26 08:36:44.77 UTC] Starting epoch 407
[db-sync-node:Info:77] [2024-07-26 08:36:44.78 UTC] Insert Conway Block: epoch 407, slot 35164816, block 1756649, hash e6e40b4667faac5f97986704e8cb98089b050d494accc327d564270b1c74c8ee
[db-sync-node:Info:77] [2024-07-26 08:36:45.76 UTC] Starting UTxO bootstrap migration
[db-sync-node:Info:77] [2024-07-26 08:36:45.76 UTC] Inserting 71825 tx_out as pages of 100000
[db-sync-node:Info:77] [2024-07-26 08:36:45.76 UTC] Bootstrap in progress 0.0%
[db-sync-node:Info:77] [2024-07-26 08:36:49.71 UTC] UTxO bootstrap migration done
[db-sync-node:Info:77] [2024-07-26 08:36:49.71 UTC] The table epoch_stake was given a new unique constraint called unique_epoch_stake
[db-sync-node:Info:77] [2024-07-26 08:36:49.71 UTC] The table reward was given a new unique constraint called unique_reward
[db-sync-node:Info:77] [2024-07-26 08:36:49.72 UTC] Running database migrations in mode Indexes
[db-sync-node:Info:77] [2024-07-26 08:36:49.72 UTC] Found maintenance_work_mem=1GB, max_parallel_maintenance_workers=6
[db-sync-node:Warning:77] [2024-07-26 08:36:49.72 UTC] Creating Indexes. This may require an extended period of time to perform. Setting a higher maintenance_work_mem from Postgres usually speeds up this process. These indexes are not used by db-sync but are meant for clients. If you want to skip some of these indexes, you can stop db-sync, delete or modify any migration-4-* files in the schema directory and restart it.
[db-sync-node:Info:77] [2024-07-26 08:36:51.47 UTC] Indexes were created
[db-sync-node:Info:77] [2024-07-26 08:36:51.53 UTC] Insert Conway Block: epoch 407, slot 35193900, block 1758029, hash f42f5e7a083e82d61866f859782a44e6bfbb745275c51b8b33826e021407ccc9
[db-sync-node:Info:77] [2024-07-26 08:36:51.53 UTC] Insert Conway Block: epoch 407, slot 35193914, block 1758030, hash 9ad010b02000a71974f55adf79f3d5280afaffe7052045978bd879720255b1f2
[db-sync-node:Info:77] [2024-07-26 08:36:51.54 UTC] Insert Conway Block: epoch 407, slot 35193922, block 1758031, hash e7da1ae7e851a0fd2b7670a12c0eb42454237a854f3c2338eae7fbb920fe4b41
[db-sync-node:Info:77] [2024-07-26 08:36:51.54 UTC] Insert Conway Block: epoch 407, slot 35193963, block 1758032, hash 524373dde357252b80f98bda27ff8b487cd3de4ae64cbb18d70bbc6ba1850531
[db-sync-node:Info:77] [2024-07-26 08:36:51.54 UTC] Insert Conway Block: epoch 407, slot 35194001, block 1758033, hash 73da988fc70d8290bac38a7f83654934554fd17cc4747ea9bd938beb130ad90e
kderme commented 4 months ago

Do we know if this work in latest releases? ma_tx_out shouldn't stay empty

ArturWieczorek commented 4 months ago

@kderme it looks like behaviour is the same for 13.3.0.0.

State at the beginning - on sanchonet:


sanchonet_13_3_0_0_utxo=# select count(*) from tx;
 count 
-------
  1019
(1 row)

sanchonet_13_3_0_0_utxo=# select * from tx_in;
 id | tx_in_id | tx_out_id | tx_out_index | redeemer_id 
----+----------+-----------+--------------+-------------
(0 rows)

sanchonet_13_3_0_0_utxo=# select * from tx_out;
 id | tx_id | index | address | address_has_script | payment_cred | stake_address_id | value | data_hash | inline_datum_id | reference_script_id | consumed_by_tx_id 
----+-------+-------+---------+--------------------+--------------+------------------+-------+-----------+-----------------+---------------------+-------------------
(0 rows)

sanchonet_13_3_0_0_utxo=# select * from ma_tx_mint;
 id | quantity | tx_id | ident 
----+----------+-------+-------
(0 rows)

sanchonet_13_3_0_0_utxo=# select * from ma_tx_out;
 id | quantity | tx_out_id | ident 
----+----------+-----------+-------
(0 rows)

State after it was synced: tx_out table was filled with data, ma_tx_out is still empty.

sanchonet_13_3_0_0_utxo=# select count(*) from tx;
 count 
-------
 11804
(1 row)

sanchonet_13_3_0_0_utxo=# select * from tx_in;
 id | tx_in_id | tx_out_id | tx_out_index | redeemer_id 
----+----------+-----------+--------------+-------------
(0 rows)

sanchonet_13_3_0_0_utxo=# select count(*) from tx_out;
 count 
-------
 71853
(1 row)

sanchonet_13_3_0_0_utxo=# select * from ma_tx_mint;
 id | quantity | tx_id | ident 
----+----------+-------+-------
(0 rows)

sanchonet_13_3_0_0_utxo=# select * from ma_tx_out;
 id | quantity | tx_out_id | ident 
----+----------+-----------+-------
(0 rows)
[db-sync-node:Info:76] [2024-07-26 12:08:08.57 UTC] Starting epoch 407
[db-sync-node:Info:76] [2024-07-26 12:08:08.57 UTC] Insert Conway Block: epoch 407, slot 35164816, block 1756649, hash e6e40b4667faac5f97986704e8cb98089b050d494accc327d564270b1c74c8ee
[db-sync-node:Info:76] [2024-07-26 12:08:09.64 UTC] Deleted 0 tx_out
[db-sync-node:Info:76] [2024-07-26 12:08:09.85 UTC] Starting UTxO bootstrap migration
[db-sync-node:Info:76] [2024-07-26 12:08:09.85 UTC] Inserting 71853 tx_out as pages of 100000
[db-sync-node:Info:76] [2024-07-26 12:08:09.85 UTC] Bootstrap in progress 0.0%
[db-sync-node:Info:76] [2024-07-26 12:08:14.01 UTC] UTxO bootstrap migration done
[db-sync-node:Info:76] [2024-07-26 12:08:14.01 UTC] The table epoch_stake was given a new unique constraint called unique_epoch_stake
[db-sync-node:Info:76] [2024-07-26 12:08:14.01 UTC] The table reward was given a new unique constraint called unique_reward
[db-sync-node:Info:76] [2024-07-26 12:08:14.02 UTC] Running database migrations in mode Indexes
[db-sync-node:Info:76] [2024-07-26 12:08:14.02 UTC] Found maintenance_work_mem=1GB, max_parallel_maintenance_workers=6
[db-sync-node:Warning:76] [2024-07-26 12:08:14.02 UTC] Creating Indexes. This may require an extended period of time to perform. Setting a higher maintenance_work_mem from Postgres usually speeds up this process. These indexes are not used by db-sync but are meant for clients. If you want to skip some of these indexes, you can stop db-sync, delete or modify any migration-4-* files in the schema directory and restart it.
[db-sync-node:Info:76] [2024-07-26 12:08:14.06 UTC] Indexes were created
[db-sync-node:Info:76] [2024-07-26 12:08:14.13 UTC] Insert Conway Block: epoch 407, slot 35206594, block 1758625, hash c4d5e2e9c5b4f79f317aeaaadd268d92dfdcc2941509c293f76b7873f3d8be53
[db-sync-node:Info:76] [2024-07-26 12:08:14.13 UTC] Insert Conway Block: epoch 407, slot 35206605, block 1758626, hash e42f69f580e59e92d8e333bb1a6f04869b0c9defb229ab0e603a88250f283d15
[db-sync-node:Info:76] [2024-07-26 12:08:14.13 UTC] Insert Conway Block: epoch 407, slot 35206648, block 1758627, hash c39283212c00943e7aee83bf9ff1240fd001d28106ebed44bfa04ea6088915d2
kderme commented 4 months ago

Hm are there any multiassets on sanchonet?

kderme commented 4 months ago

Hm are there any multiassets on sanchonet?

This line from the logs is an indication that there are none

Multi Assets: cache capacity: 250000, cache size: 0, hits: 0, misses: 0
ArturWieczorek commented 4 months ago
sancho_new_test_config_full=# select * from multi_asset;
 id |                           policy                           |                              name                              |                 fingerprint                  
----+------------------------------------------------------------+----------------------------------------------------------------+----------------------------------------------
  1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \x37f9a73217d919ef97332211ade0afd42732e38b11577c82bbddfd9f2301 | asset16vajwx08w7wgwazxu9fe6lvvpxakgtxftmfc4w
  2 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \xe8a0e694ab11691ed96275dbb8e02c6ab7ae9af840e1e119830ae4e72300 | asset1uqx7htnvmw9az0qwvdhc5e06ep6qg535uwne2f
  3 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \x1c64222c787d10d13247d24b92563a28da34edcbdd918a1c280d02aa2300 | asset1jm3gee8mm3mmrep49c0qx7xmz0wugulx8fcs7l
  4 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \xeda822c2552732f57e749ed4d9c6056a0989884b12ac863301e83ecd2300 | asset129za0pu050u8ulhh0a08zdkwde8ued5cdgpcj2
  5 | \xc70ef82c7ff32b5dda84e3519d8d2b8cbd29c1d2a93cc2e183797143 | \x5e249e4b897fe25aea28119031cbb7931ccd4ae135caec5c90fabe3d2301 | asset146w8yfx0c5gue58v44nasvz7rx6c52wmapdprn
  6 | \xc70ef82c7ff32b5dda84e3519d8d2b8cbd29c1d2a93cc2e183797143 | \x5ea4fa27f12cda53293b37f721092eedad3f3221832c3a6c706e8d3a2300 | asset1enqrads9txxszt3kt4mx5sqwyq5ur8lne8yfak
  7 | \xe1acd94d234f91c27797639bca5e0a46f604a7d4a3b9ff353e929498 | \x54657374546f6b656e                                           | asset1jvcfm2w6tgy68n7ac7ck6qf7hn7a5n0wpwkf6k
  8 | \xc852933415a71b75a9750048be4bed695af8edb6117bcc233558575e | \x54657374546f6b656e                                           | asset1ljzxuyd7kwyegryxez2lxp93vy6qtt634r9ak2
  9 | \x186e32faa80a26810392fda6d559c7ed4721a65ce1c9d4ef3e1c87b4 | \x466f75726965725374617465546f6b656e                           | asset1kuw920egg86d82gftvm4mu54t4kmt3atxfnz53
 10 | \xb15a27b9f5f7b269f1615fb5629fe65cde97d1b8a281c82ba305cae1 | \x4d5346                                                       | asset154hqv5unkl3zs7yrlkr0j8zuqppuy5mnjywrqk
 11 | \xb15a27b9f5f7b269f1615fb5629fe65cde97d1b8a281c82ba305cae1 | \x4d534753                                                     | asset1hegev4dekcjyp04502xjc72lhw763nl76rq7kl
 12 | \x8dd4a839b8113fed3089e4b443227a78617e6db0d472185a2d8d2059 | \x4747                                                         | asset16tvttmdrg44anx9gm9ykkx9zjwnlxaaf5zaq32
(12 rows)
kderme commented 4 months ago

Still these multiasset may exist from minting or from outputs that have been spent. Are there tx_out entries that are not consumed (ie there is not inputs that reference them) and have a ma_tx_out.tx_out_id reference in full db-sync?

ArturWieczorek commented 4 months ago

If this query is correct - then there are some entries.

sancho_new_test_config_full=# SELECT
    tx_out.id AS tx_out_id,
    tx_out.address,
    tx_out.value,
    ma_tx_out.quantity,
    multi_asset.policy,
    multi_asset.name
FROM
    tx_out
LEFT JOIN
    tx_in
ON
    tx_out.id = tx_in.tx_out_id
JOIN
    ma_tx_out
ON
    tx_out.id = ma_tx_out.tx_out_id
JOIN
    multi_asset
ON
    ma_tx_out.ident = multi_asset.id
WHERE
    tx_in.tx_out_id IS NULL;

 tx_out_id |                                                   address                                                    |     value      | quantity |                           policy                           |                              name                              
-----------+--------------------------------------------------------------------------------------------------------------+----------------+----------+------------------------------------------------------------+----------------------------------------------------------------
     62460 | addr_test1wqp5gwxc8t5dlmfqv9cyj9zjf26g7848v3jemns9xvxzu8cjuart8                                              |        5000000 |        1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \x37f9a73217d919ef97332211ade0afd42732e38b11577c82bbddfd9f2301
     62466 | addr_test1wzpswe7x8tj2sqneeq9tev2ldjlmuw9v4nrn3fepqsq5vggqlypy9                                              |       10000000 |        1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \xe8a0e694ab11691ed96275dbb8e02c6ab7ae9af840e1e119830ae4e72300
     62469 | addr_test1wznaeghqgeqxw2rv302jczqnjd9fztgh9y963a24p24420cfqpdt3                                              |        5000000 |        1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \x1c64222c787d10d13247d24b92563a28da34edcbdd918a1c280d02aa2300
     62472 | addr_test1wpknsnh38j7u4h4yssp27cr5x5nwusfatnvjmqcrnsl46xq3x3j6x                                              |        5000000 |        1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \xeda822c2552732f57e749ed4d9c6056a0989884b12ac863301e83ecd2300
     62564 | addr_test1wz0dq8p4qvuenxg7zd5pd88uvqh6j7d3py9htjg8g0zvans4qq820                                              |        5000000 |        1 | \xc70ef82c7ff32b5dda84e3519d8d2b8cbd29c1d2a93cc2e183797143 | \x5e249e4b897fe25aea28119031cbb7931ccd4ae135caec5c90fabe3d2301
     62566 | addr_test1wqp5gwxc8t5dlmfqv9cyj9zjf26g7848v3jemns9xvxzu8cjuart8                                              |        5000000 |        1 | \x3da414859ddd0955cf6694a20cbbc000d60c4cb49dcbf46ad543568a | \x37f9a73217d919ef97332211ade0afd42732e38b11577c82bbddfd9f2301
...

sancho_new_test_config_full=# SELECT
    COUNT(*) AS unspent_tx_out_with_multi_assets
FROM
    tx_out
LEFT JOIN
    tx_in
ON
    tx_out.id = tx_in.tx_out_id
JOIN
    ma_tx_out
ON
    tx_out.id = ma_tx_out.tx_out_id
WHERE
    tx_in.tx_out_id IS NULL;
 unspent_tx_out_with_multi_assets 
----------------------------------
                             5047
(1 row)
ArturWieczorek commented 4 months ago

On preprod , after sync is completed:

psql preprod_13_3_0_0_utxo 
psql (14.12 (Ubuntu 14.12-0ubuntu0.22.04.1))
Type "help" for help.

preprod_13_3_0_0_utxo=# select * from multi_asset;
 id | policy | name | fingerprint 
----+--------+------+-------------
(0 rows)

preprod_13_3_0_0_utxo=# select * from ma_tx_out;
 id | quantity | tx_out_id | ident 
----+----------+-----------+-------
(0 rows)

preprod_13_3_0_0_utxo=# select * from ma_tx_mint;
 id | quantity | tx_id | ident 
----+----------+-------+-------
(0 rows)

preprod_13_3_0_0_utxo=# select count(*) from tx_out;
  count  
---------
 1438710
(1 row)