cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
29.99k stars 3.79k forks source link

sql: UPDATE with index becomes slower when assignment not dependent on previous values #26848

Closed stutiredboy closed 5 years ago

stutiredboy commented 6 years ago

QUESTION

Discussion: https://forum.cockroachlabs.com/t/is-it-reasonable-update-noindex-faster-than-update-index/1713/6

  1. the CockroachDB version is 2.0.2, update no index about 4x faster than update index.
  2. sysbench version 1.0.14

How to reproduce:

  1. 5 cockroach nodes with ssd, 32 cpus/128 G Memory, with: --cache=32GiB --max-sql-memory=32GiB
  2. table schema ( created by sysbench prepare data)
    CREATE TABLE sbtest1 (
      id INTEGER NOT NULL,
      k INTEGER NOT NULL DEFAULT 0:::INT,
      c STRING(120) NOT NULL DEFAULT '':::STRING,
      pad STRING(60) NOT NULL DEFAULT '':::STRING,
      CONSTRAINT "primary" PRIMARY KEY (id ASC),
      INDEX k_1 (k ASC),
      FAMILY "primary" (id, k, c, pad)
    )
  3. prepare data with sysbench:

    /usr/bin/sysbench /usr/share/sysbench/oltp_read_write.lua --db-driver=pgsql --pgsql-host=x.x.x.x --pgsql-port=26257 --pgsql-user=cockroach --pgsql-db=cockroach --table-size=30000000 --tables=40 --threads=8 --auto_inc=off --report-interval=120 prepare

  4. sysbench command to run non index update:

    sysbench /usr/share/sysbench/oltp_update_non_index.lua --db-driver=pgsql --pgsql-host=x.x.x.x --pgsql-port=26257 --pgsql-user=cockroach --pgsql-db=cockroach --table-size=30000000 --tables=40 --threads=300 --report-interval=10 --time=600 --max-requests=0 --percentile=95 --skip-trx=on run

  5. example SHOW QUERIES for non index update:

    UPDATE sbtest13 SET c = $1 WHERE id = $2

  6. sysbench command to run index update:

    sysbench /usr/share/sysbench/oltp_update_index.lua --db-driver=pgsql --pgsql-host=x.x.x.x --pgsql-port=26257 --pgsql-user=cockroach --pgsql-db=cockroach --table-size=30000000 --tables=40 --threads=300 --report-interval=10 --time=600 --max-requests=0 --percentile=95 --skip-trx=on run

  7. example SHOW QUERIES for index update:

    UPDATE sbtest30 SET k = k + 1 WHERE id = $1

Suggestion From @knz, I have tried to change the update index to UPDATE sbtest* SET k = $1 WHERE id = $2,the $1 is integer 1 in the test, but the index update is now 5x slower.

differ for sysbench oltp_common.lua

261,262c269,270
<       "UPDATE sbtest%u SET k=k+1 WHERE id=?",
<       t.INT},
---
>       "UPDATE sbtest%u SET k=? WHERE id=?",
>       t.INT, t.INT},
456c464,465
<       param[tnum].index_updates[1]:set(get_id())
---
>       param[tnum].index_updates[1]:set(1)
>       param[tnum].index_updates[2]:set(get_id())

Thanks.

knz commented 6 years ago

@stutiredboy thank you for creating this issue.

To reiterate my comment from the forum: the fact that the update using an index is 3-4x slower than without an index is to be expected, and is fundamental to any SQL database. Therefore, there is not much we can do about this.

The reason why I requested to create an issue is the new test case at the bottom: that using a single assignment in UPDATE is actually making the query slower. This is unexpected. I am renaming the issue accordingly.

knz commented 6 years ago

cc @BramGruneir

stutiredboy commented 6 years ago

Something missed, because it will take a very long time to create secondary index, before the prepare, we have to move the secondary index creation before insert in oltp_common.lua

differ:

198a199,205
>    if sysbench.opt.create_secondary then
>       print(string.format("Creating a secondary index on 'sbtest%d'...",
>                           table_num))
>       con:query(string.format("CREATE INDEX k_%d ON sbtest%d(k)",
>                               table_num, table_num))
>    end
>
235,240d241
<    if sysbench.opt.create_secondary then
<       print(string.format("Creating a secondary index on 'sbtest%d'...",
<                           table_num))
<       con:query(string.format("CREATE INDEX k_%d ON sbtest%d(k)",
<                               table_num, table_num))
<    end
jordanlewis commented 5 years ago

We haven't gotten to this. If you still have an issue like it, please open a new one.