IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0
290 stars 161 forks source link

Missing indicies after restoring mainnet snapshot #1220

Open dostrelith678 opened 2 years ago

dostrelith678 commented 2 years ago

OS Your OS: Ubuntu 20.04.4 LTS (GNU/Linux 5.4.0-122-generic x86_64)

Versions The db-sync version (eg cardano-db-sync --version):

cardano-db-sync 13.0.2 - linux-x86_64 - ghc-8.10
git revision 8733b71af3ada773755c5cb08d9c061eda11fb96

PostgreSQL version: 14.4

Build/Install Method The method you use to build or install cardano-db-sync: Pre-compiled Hydra binary (https://hydra.iohk.io/build/17139466/download/1/cardano-db-sync-13.0.2-linux.tar.gz)

Run method The method you used to run cardano-db-sync (eg Nix/Docker/systemd/none): systemd

Additional context Add any other context about the problem here. After upgrading to db-sync v13 and restoring the mainnet snapshot, I noticed some of the queries we run at Koios are extremely slow.

Problem Report Please do not include screenshots or images, but instead cut and paste any relevant log messages or errors.

After analysing the affected queries, we noticed that Postgres is deciding to do a sequential scan instead of an index scan on the tx_out table. I looked at the tx_out table information of the restored snapshot:

cexplorer=# \d+ tx_out
                                                                   Table "public.tx_out"
       Column        |       Type        | Collation | Nullable |              Default               | Storage  | Compression | Stats target | Description
---------------------+-------------------+-----------+----------+------------------------------------+----------+-------------+--------------+-------------
 id                  | bigint            |           | not null | nextval('tx_out_id_seq'::regclass) | plain    |             |              |
 tx_id               | bigint            |           | not null |                                    | plain    |             |              |
 index               | txindex           |           | not null |                                    | plain    |             |              |
 address             | character varying |           | not null |                                    | extended |             |              |
 address_raw         | bytea             |           | not null |                                    | extended |             |              |
 address_has_script  | boolean           |           | not null |                                    | plain    |             |              |
 payment_cred        | hash28type        |           |          |                                    | extended |             |              |
 stake_address_id    | bigint            |           |          |                                    | plain    |             |              |
 value               | lovelace          |           | not null |                                    | main     |             |              |
 data_hash           | hash32type        |           |          |                                    | extended |             |              |
 inline_datum_id     | bigint            |           |          |                                    | plain    |             |              |
 reference_script_id | bigint            |           |          |                                    | plain    |             |              |
Indexes:
    "tx_out_pkey" PRIMARY KEY, btree (id)
    "idx_tx_out_stake_address_id" btree (stake_address_id)
    "tx_out_inline_datum_id_idx" btree (inline_datum_id)
    "tx_out_reference_script_id_idx" btree (reference_script_id)
    "unique_txout" UNIQUE CONSTRAINT, btree (tx_id, index)
Foreign-key constraints:
    "tx_out_inline_datum_id_fkey" FOREIGN KEY (inline_datum_id) REFERENCES datum(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_reference_script_id_fkey" FOREIGN KEY (reference_script_id) REFERENCES script(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_stake_address_id_fkey" FOREIGN KEY (stake_address_id) REFERENCES stake_address(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES tx(id) ON UPDATE RESTRICT ON DELETE CASCADE

Comparing this to a testnet instance of the same db-sync version:

cexplorer_testnet=# \d+ tx_out;
      |
 address_raw         | bytea             |           | not null |                                    | extended |             |              |
 address_has_script  | boolean           |           | not null |                                    | plain    |             |              |
 payment_cred        | hash28type        |           |          |                                    | extended |             |              |
 stake_address_id    | bigint            |           |          |                                    | plain    |             |              |
 value               | lovelace          |           | not null |                                    | main     |             |              |
 data_hash           | hash32type        |           |          |                                    | extended |             |              |
 inline_datum_id     | bigint            |           |          |                                    | plain    |             |              |
 reference_script_id | bigint            |           |          |                                    | plain    |             |              |
Indexes:
    "tx_out_pkey" PRIMARY KEY, btree (id)
    "idx_tx_out_address" hash (address)
    "idx_tx_out_payment_cred" btree (payment_cred)
    "idx_tx_out_stake_address_id" btree (stake_address_id)
    "idx_tx_out_tx_id" btree (tx_id)
    "tx_out_inline_datum_id_idx" btree (inline_datum_id)
    "tx_out_reference_script_id_idx" btree (reference_script_id)
    "unique_txout" UNIQUE CONSTRAINT, btree (tx_id, index)
Foreign-key constraints:
    "tx_out_inline_datum_id_fkey" FOREIGN KEY (inline_datum_id) REFERENCES datum(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_reference_script_id_fkey" FOREIGN KEY (reference_script_id) REFERENCES script(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_stake_address_id_fkey" FOREIGN KEY (stake_address_id) REFERENCES stake_address(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES tx(id) ON UPDATE RESTRICT ON DELETE CASCADE
Referenced by:
    TABLE "ma_tx_out" CONSTRAINT "ma_tx_out_tx_out_id_fkey" FOREIGN KEY (tx_out_id) REFERENCES tx_out(id) ON UPDATE RESTRICT ON DELETE CASCADE

There are 3 missing indices in this table: "idx_tx_out_address" hash (address) "idx_tx_out_payment_cred" btree (payment_cred) "idx_tx_out_tx_id" btree (tx_id) as well as the Referenced by ma_tx_out section.

I checked the ma_tx_out table as well. Mainnet:

cexplorer=# \d+ ma_tx_out;
                                                          Table "public.ma_tx_out"
  Column   |    Type    | Collation | Nullable |                Default                | Storage | Compression | Stats target | Description
-----------+------------+-----------+----------+---------------------------------------+---------+-------------+--------------+-------------
 id        | bigint     |           | not null | nextval('ma_tx_out_id_seq'::regclass) | plain   |             |              |
 quantity  | word64type |           | not null |                                       | main    |             |              |
 tx_out_id | bigint     |           | not null |                                       | plain   |             |              |
 ident     | bigint     |           | not null |                                       | plain   |             |              |
Indexes:
    "ma_tx_out_pkey" PRIMARY KEY, btree (id)
    "unique_ma_tx_out" UNIQUE CONSTRAINT, btree (ident, tx_out_id)
Foreign-key constraints:
    "ma_tx_out_ident_fkey" FOREIGN KEY (ident) REFERENCES multi_asset(id) ON UPDATE RESTRICT ON DELETE CASCADE

Testnet:

cexplorer_testnet=# \d+ ma_tx_out;
                                                          Table "public.ma_tx_out"
  Column   |    Type    | Collation | Nullable |                Default                | Storage | Compression | Stats target | Description
-----------+------------+-----------+----------+---------------------------------------+---------+-------------+--------------+-------------
 id        | bigint     |           | not null | nextval('ma_tx_out_id_seq'::regclass) | plain   |             |              |
 quantity  | word64type |           | not null |                                       | main    |             |              |
 tx_out_id | bigint     |           | not null |                                       | plain   |             |              |
 ident     | bigint     |           | not null |                                       | plain   |             |              |
Indexes:
    "ma_tx_out_pkey" PRIMARY KEY, btree (id)
    "idx_ma_tx_out_tx_out_id" btree (tx_out_id)
    "unique_ma_tx_out" UNIQUE CONSTRAINT, btree (ident, tx_out_id)
Foreign-key constraints:
    "ma_tx_out_ident_fkey" FOREIGN KEY (ident) REFERENCES multi_asset(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "ma_tx_out_tx_out_id_fkey" FOREIGN KEY (tx_out_id) REFERENCES tx_out(id) ON UPDATE RESTRICT ON DELETE CASCADE

which also reveals a missing index on the mainnet db.

I have confirmed that at least one of these indices (and presumably all of them) are present in the migration files: https://github.com/input-output-hk/cardano-db-sync/blob/e0a84705f82643513d05fe5de2a954b4b8bf59c8/schema/migration-3-0003-20200702.sql#L15

I am not sure this is a bug with db-sync, but the snapshot I restored seems faulty. The snapshot I restored is currently still the latest one listed in: https://update-cardano-mainnet.iohk.io/cardano-db-sync/index.html#13/: db-sync-snapshot-schema-13-block-7581148-x86_64.tgz

dostrelith678 commented 2 years ago

I will try to restore the previous snapshot db-sync-snapshot-schema-13-block-7560650-x86_64.tgz to see if the issue remains the same.

dostrelith678 commented 2 years ago

I will try to restore the previous snapshot db-sync-snapshot-schema-13-block-7560650-x86_64.tgz to see if the issue remains the same.

This one seems to be okay.

\d+ tx_out
                                                                   Table "public.tx_out"
       Column        |       Type        | Collation | Nullable |              Default               | Storage  | Compression | Stats target | Description
---------------------+-------------------+-----------+----------+------------------------------------+----------+-------------+--------------+-------------
 id                  | bigint            |           | not null | nextval('tx_out_id_seq'::regclass) | plain    |             |              |
 tx_id               | bigint            |           | not null |                                    | plain    |             |              |
 index               | txindex           |           | not null |                                    | plain    |             |              |
 address             | character varying |           | not null |                                    | extended |             |              |
 address_raw         | bytea             |           | not null |                                    | extended |             |              |
 address_has_script  | boolean           |           | not null |                                    | plain    |             |              |
 payment_cred        | hash28type        |           |          |                                    | extended |             |              |
 stake_address_id    | bigint            |           |          |                                    | plain    |             |              |
 value               | lovelace          |           | not null |                                    | main     |             |              |
 data_hash           | hash32type        |           |          |                                    | extended |             |              |
 inline_datum_id     | bigint            |           |          |                                    | plain    |             |              |
 reference_script_id | bigint            |           |          |                                    | plain    |             |              |
Indexes:
    "tx_out_pkey" PRIMARY KEY, btree (id)
    "idx_tx_out_address" hash (address)
    "idx_tx_out_payment_cred" btree (payment_cred)
    "idx_tx_out_stake_address_id" btree (stake_address_id)
    "idx_tx_out_tx_id" btree (tx_id)
    "tx_out_inline_datum_id_idx" btree (inline_datum_id)
    "unique_txout" UNIQUE CONSTRAINT, btree (tx_id, index)
Foreign-key constraints:
    "tx_out_inline_datum_id_fkey" FOREIGN KEY (inline_datum_id) REFERENCES datum(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_reference_script_id_fkey" FOREIGN KEY (reference_script_id) REFERENCES script(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_stake_address_id_fkey" FOREIGN KEY (stake_address_id) REFERENCES stake_address(id) ON UPDATE RESTRICT ON DELETE CASCADE
    "tx_out_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES tx(id) ON UPDATE RESTRICT ON DELETE CASCADE
Referenced by:
    TABLE "ma_tx_out" CONSTRAINT "ma_tx_out_tx_out_id_fkey" FOREIGN KEY (tx_out_id) REFERENCES tx_out(id) ON UPDATE RESTRICT ON DELETE CASCADE