near / near-indexer-for-explorer

Watch NEAR network and store all the data from NEAR blockchain to PostgreSQL database
https://near-indexers.io/docs/projects/near-indexer-for-explorer
GNU General Public License v3.0
123 stars 56 forks source link

Duplicated rows in DB #74

Closed telezhnaya closed 3 years ago

telezhnaya commented 3 years ago

Try executing following query:

select * from account_changes 
where account_changes.changed_in_block_hash = 'CFAAJTVsw5y4GmMKNmuTNybxFJtapKcrarsTh5TPUyQf';

Expected: 5 rows Actual: 30 rows, that could be easily grouped to 5

Similar request to RPC:


POST to https://archival-rpc.mainnet.near.org
Body
{
  "jsonrpc": "2.0",
  "id": "dontcare",
  "method": "EXPERIMENTAL_changes",
  "params": {
    "changes_type": "account_changes",
    "account_ids": ["nfvalidator1.near", "nfvalidator2.near", "nfvalidator3.near", "nfvalidator4.near", "near", "erm.near", "treasury.near"],
     "block_id": 9820214
  }
}

Result gives 5 rows

Looks like we have an error in unique constraint, and we do not handle null values properly. See this article for more details: https://www.enterprisedb.com/postgres-tutorials/postgresql-unique-constraint-null-allowing-only-one-null

rozgo commented 3 years ago

Column id is required in account_changes table, but its not being set, nor auto-incremented.

frol commented 3 years ago

@rozgo the column is bigserial, so it is auto-incremented. Do you experience any issues on your end?

rozgo commented 3 years ago

Many variations of this issue trying to get the indexer working:

run --store-genesis --allow-missing-relations-in-first-blocks 1000 sync-from-latest`
Mar 23 18:25:05.898  INFO indexer_for_explorer: Adding/updating accounts from genesis...
Mar 23 18:25:05.898  INFO indexer_for_explorer: Adding/updating access keys from genesis...
Mar 23 18:25:05.900  INFO stats: Server listening at ed25519:...@0.0.0.0:24571
Mar 23 18:25:07.329 ERROR indexer_for_explorer: Error occurred while Accounts from genesis were being added to database. Retrying in 100 milliseconds... 
 Execute(
    DatabaseError(
        __Unknown,
        "null value in column \"id\" violates not-null constraint",
    ),
)

Similar errors with account_changes

frol commented 3 years ago

Ouch, indeed, I was looking into an old database schema, and now looking into the right migration scripts I see that we use bigint instead of bigserial there. Could you test the fix: #83? The original issue is not related to this issue with the id.

telezhnaya commented 3 years ago

This problem occurs because Postgres thinks that all nulls are unique, so unique constraint + null does not work properly. We need to change our unique constraint to several unique indexes.

https://stackoverflow.com/questions/8289100/create-unique-constraint-with-null-columns https://stackoverflow.com/questions/23542794/postgres-unique-constraint-vs-index

caused_by_transaction_hash is nullable, caused_by_receipt_id is nullable. We need to cover 4 cases (all combinations). The situation when both of them are not null is unreachable (it is logical, it looks so based on data), but if suddenly we will get this state - it is covered by the existing constraint, I decided not to delete it.

Another 3 indexes will be added as the migration.

telezhnaya commented 3 years ago

This issue will be finally fixed after we re-calc the DB Since I need to re-check that fix works fine, it's better to keep it opened

khorolets commented 3 years ago

@telezhnaya I believe you can finally close it and be proud of yourself :)