Closed Pitasi closed 2 years ago
How the code behaves when one table has multi column index?
How the code behaves when one table has multi column index?
The problem is exactly this one, I should have explained it better!
auth
table for example, has two secondary indexes:
root@localhost:26258/tracelistener> show indexes from auth;
table_name | index_name | non_unique | seq_in_index | column_name | direction | storing | implicit
-------------+--------------------------------------------+------------+--------------+----------------+-----------+---------+-----------
auth | auth_chain_name_address_account_number_key | false | 1 | chain_name | ASC | false | false
auth | auth_chain_name_address_account_number_key | false | 2 | address | ASC | false | false
auth | auth_chain_name_address_account_number_key | false | 3 | account_number | ASC | false | false
auth | auth_chain_name_address_account_number_key | false | 4 | id | ASC | false | true
auth | auth_chain_name_id_idx | true | 1 | chain_name | ASC | false | false
auth | auth_chain_name_id_idx | true | 2 | id | ASC | false | false
auth | primary | false | 1 | id | ASC | false | false
(7 rows)
When running resetchain
, the query it runs are something like this:
DELETE FROM auth WHERE (id <= 746973659877834754) AND (chain_name = 'cosmos-hub') ORDER BY id DESC LIMIT 5000 RETURNING id;
I would expect CockroachDB to use the auth_chain_name_id_idx
since it perfectly match my "where" clause.
Now let me analyze that query:
root@localhost:26258/tracelistener> explain DELETE FROM auth WHERE (id <= 746973659877834754) AND (chain_name = 'cosmos-hub') ORDER BY id DESC LIMIT 5000 RETURNING id;
info
---------------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• delete
│ estimated row count: 5,000
│ from: auth
│ auto commit
│
└── • limit
│ estimated row count: 5,000
│ count: 5000
│
└── • filter
│ estimated row count: 959,775
│ filter: chain_name = 'cosmos-hub'
│
└── • revscan
estimated row count: 3,685,569 (59% of the table; stats collected 18 minutes ago)
table: auth@primary
spans: [ - /746973659877834754]
as you can read from above, it initially does a revscan using auth@primary
- and that results in 3mln rows 🤯 After that it filters those 3mln by chain_name and apply the 5000 limit.
That means really bad performances while deleting (I've seen it myself).
Now let me show what happens when I use the "force index selection" feature (= I only appended @auth_chain_name_id_idx
to table name):
root@localhost:26258/tracelistener> explain DELETE FROM auth@auth_chain_name_id_idx WHERE (id <= 746973659877834754) AND (chain_name = 'cosmos-hub') ORDER BY id DESC LIMIT 5000 RETURNING id;
info
---------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• delete
│ estimated row count: 5,000
│ from: auth
│ auto commit
│
└── • index join
│ estimated row count: 5,000
│ table: auth@primary
│
└── • revscan
estimated row count: 5,000 (0.08% of the table; stats collected 23 minutes ago)
table: auth@auth_chain_name_id_idx
spans: [/'cosmos-hub' - /'cosmos-hub'/746973659877834754]
limit: 5000
This time the initial revscan can precisely match that 5.000 rows I need, without scanning ALL of the rows in the auth table 😅 performances = happy
Does this make sense?
BTW I have seen CockroachDB picking up the wrong index only for really large chains (osmosis and cosmos-hub) and only for the bigger tables (delegations, auth). Using the correct index make the resetchain
command exec time go from 30+ minutes to 2-3 minutes.
How the code behaves when one table has multi column index? ...
Thanks for the clarification 🔆
I also added a feature for selecting the tables to be reset (instead of all of them, which remains the default behaviour)
Say someone provided 9 correct index name and 1 incorrect index name. We don't want the program to panic and leave an intermediate state, which may lead to subtle bugs.
the program is intended to keep going and "delete all it can" so it will indeed delete 9 tables and at the end it will print a small report saying that it wasn't to delete X because index Y doesn't exist
Since we're using the table flag, I wonder, do we really need the forceIndexes flag.
It makes sense. If we proceed this way, I think of removing the default tables list altogether and let the caller decide which tables needs to be deleted.
@DeshErBojhaa I removed -force-index, I think less code is better (less logic -> easier to understand).
Also less code == easy review :p
@DeshErBojhaa can review this please? thanks!
I have seen that sometimes (on auth table) cockroachdb doesn't use the correct index, but it uses a more inefficient one. With this PR I add the ability to override that by passing a custom flag like
./resetchain -force-indexes auth@correct_index
.