zanmato1984 / cura

CURA - CUDA Relational Algebra
Apache License 2.0

Enabling and confirming CURA with TiDB #9

Open ghost opened 3 years ago

ghost commented 3 years ago

Hi again,

I'm trying to make sure that TiDB can work with CURA. So far, I have compiled the custom TiDB executable from the tidb_cura branch:

[2021/07/14 00:09:46.504 -06:00] [INFO] [printer.go:33] ["Welcome to TiDB."] ["Release Version"=v4.0.0-alpha-3820-g7d4c57b86] [Edition=Community] ["Git Commit Hash"=7d4c57b864b836bbd893484dc0a3347d9e7026e6] ["Git Branch"=tidb_cura] ["UTC Build Time"="2021-07-12 10:26:59"] [GoVersion=go1.16.5] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]

Then I ran it with a script like this:

export LD_LIBRARY_PATH="/home/yujinkim/cura/build:$LD_LIBRARY_PATH"
/home/yujinkim/tidb/tidb-cura/bin/tidb-server -config /home/yujinkim/tidb/tidb-cura/config/config.toml > bench/tidb-output.txt &

sleep 5s

echo Use Control-C after the queries had been run

## Enable CURA
mysql -v -h 127.0.0.1 -P 4000 -u root test -e "set GLOBAL tidb_enable_cura_exec = TRUE"

time tiup bench tpch cleanup
time tiup bench tpch prepare --queries q5,q9,q17,q18
time tiup bench tpch run --queries q5,q9,q17,q18 --time 0h1m0s
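The fixed `sleep 5s` in the script above assumes tidb-server is accepting connections within five seconds. A more defensive variant could poll until a trivial query succeeds. This is only a sketch: `retry` is a hypothetical helper name, and the host, port, and user simply mirror the script.

```shell
# retry TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 on success, 1 otherwise.
retry() {
    tries="$1"
    delay="$2"
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, `retry 30 1 mysql -h 127.0.0.1 -P 4000 -u root -e "select 1"` would wait up to roughly 30 seconds before the `set GLOBAL` statement is issued.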

However, I'm not detecting any GPU activity. Is there a way to get the CURA GPU engine to work?
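One way to check for GPU activity while the benchmark runs is to sample utilization with `nvidia-smi`. The sketch below assumes an NVIDIA driver is installed; `gpu_busy` is a hypothetical helper that reads one utilization percentage per line on stdin and succeeds if any value is above zero.

```shell
# gpu_busy: read one GPU utilization percentage per line from stdin.
# Returns 0 if at least one line is a number greater than zero, 1 otherwise.
gpu_busy() {
    busy=1
    while read -r util; do
        case "$util" in
            # Skip blank or non-numeric lines such as "[N/A]".
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$util" -gt 0 ]; then
            busy=0
        fi
    done
    return "$busy"
}
```

While `tiup bench tpch run` is in progress, it could be used like this: `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits | gpu_busy && echo "GPU active"`. A single sample can miss short kernels, so repeating the check a few times during the run gives a more reliable picture.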

windtalker commented 3 years ago

Hi, can you show the explain results for the queries?

ghost commented 3 years ago

Hi,

This is an example for query 5:

-- using 1365545250 as a seed to the RNG

select
    n_name,
    sum(l_extendedprice * (1 - l_discount)) as revenue
from
    customer,
    orders,
    lineitem,
    supplier,
    nation,
    region
where
    c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and l_suppkey = s_suppkey
    and c_nationkey = s_nationkey
    and s_nationkey = n_nationkey
    and n_regionkey = r_regionkey
    and r_name = 'MIDDLE EAST'
    and o_orderdate >= '1994-01-01'
    and o_orderdate < date_add('1994-01-01', interval '1' year)
group by
    n_name
order by
    revenue desc;

Plan:

+--------------------------------------------------------+------------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                                                     | estRows    | task      | access object                                           | operator info                                                                                                                                                                                                                 |
+--------------------------------------------------------+------------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Sort_24                                                | 1.00       | root      |                                                         | Column#49:desc                                                                                                                                                                                                                |
| └─Projection_26                                        | 1.00       | root      |                                                         | test.nation.n_name, Column#49                                                                                                                                                                                                 |
|   └─HashAgg_27                                         | 1.00       | root      |                                                         | group by:Column#57, funcs:sum(Column#55)->Column#49, funcs:firstrow(Column#56)->test.nation.n_name                                                                                                                            |
|     └─Projection_76                                    | 813638.72  | root      |                                                         | mul(test.lineitem.l_extendedprice, minus(1, test.lineitem.l_discount))->Column#55, test.nation.n_name, test.nation.n_name                                                                                                     |
|       └─Projection_28                                  | 813638.72  | root      |                                                         | test.lineitem.l_extendedprice, test.lineitem.l_discount, test.nation.n_name                                                                                                                                                   |
|         └─IndexHashJoin_34                             | 813638.72  | root      |                                                         | inner join, inner:IndexLookUp_31, outer key:test.orders.o_orderkey, inner key:test.lineitem.l_orderkey, equal cond:eq(test.orders.o_orderkey, test.lineitem.l_orderkey), eq(test.supplier.s_suppkey, test.lineitem.l_suppkey) |
|           ├─HashJoin_43(Build)                         | 205551.51  | root      |                                                         | inner join, equal:[eq(test.customer.c_custkey, test.orders.o_custkey)]                                                                                                                                                        |
|           │ ├─HashJoin_45(Build)                       | 15000.00   | root      |                                                         | inner join, equal:[eq(test.supplier.s_nationkey, test.customer.c_nationkey)]                                                                                                                                                  |
|           │ │ ├─HashJoin_47(Build)                     | 2.50       | root      |                                                         | inner join, equal:[eq(test.nation.n_nationkey, test.supplier.s_nationkey)]                                                                                                                                                    |
|           │ │ │ ├─HashJoin_60(Build)                   | 0.01       | root      |                                                         | inner join, equal:[eq(test.region.r_regionkey, test.nation.n_regionkey)]                                                                                                                                                      |
|           │ │ │ │ ├─TableReader_65(Build)              | 0.01       | root      |                                                         | data:Selection_64                                                                                                                                                                                                             |
|           │ │ │ │ │ └─Selection_64                     | 0.01       | cop[tikv] |                                                         | eq(test.region.r_name, "MIDDLE EAST")                                                                                                                                                                                         |
|           │ │ │ │ │   └─TableFullScan_63               | 5.00       | cop[tikv] | table:region                                            | keep order:false, stats:pseudo                                                                                                                                                                                                |
|           │ │ │ │ └─TableReader_62(Probe)              | 25.00      | root      |                                                         | data:TableFullScan_61                                                                                                                                                                                                         |
|           │ │ │ │   └─TableFullScan_61                 | 25.00      | cop[tikv] | table:nation                                            | keep order:false, stats:pseudo                                                                                                                                                                                                |
|           │ │ │ └─TableReader_67(Probe)                | 10000.00   | root      |                                                         | data:TableFullScan_66                                                                                                                                                                                                         |
|           │ │ │   └─TableFullScan_66                   | 10000.00   | cop[tikv] | table:supplier                                          | keep order:false                                                                                                                                                                                                              |
|           │ │ └─TableReader_69(Probe)                  | 150000.00  | root      |                                                         | data:TableFullScan_68                                                                                                                                                                                                         |
|           │ │   └─TableFullScan_68                     | 150000.00  | cop[tikv] | table:customer                                          | keep order:false                                                                                                                                                                                                              |
|           │ └─TableReader_72(Probe)                    | 210127.49  | root      |                                                         | data:Selection_71                                                                                                                                                                                                             |
|           │   └─Selection_71                           | 210127.49  | cop[tikv] |                                                         | ge(test.orders.o_orderdate, 1994-01-01 00:00:00.000000), lt(test.orders.o_orderdate, 1995-01-01)                                                                                                                              |
|           │     └─TableFullScan_70                     | 1555296.00 | cop[tikv] | table:orders                                            | keep order:false                                                                                                                                                                                                              |
|           └─IndexLookUp_31(Probe)                      | 3.96       | root      |                                                         |                                                                                                                                                                                                                               |
|             ├─IndexRangeScan_29(Build)                 | 3.96       | cop[tikv] | table:lineitem, index:PRIMARY(L_ORDERKEY, L_LINENUMBER) | range: decided by [eq(test.lineitem.l_orderkey, test.orders.o_orderkey)], keep order:false                                                                                                                                    |
|             └─TableRowIDScan_30(Probe)                 | 3.96       | cop[tikv] | table:lineitem                                          | keep order:false                                                                                                                                                                                                              |
+--------------------------------------------------------+------------+-----------+---------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

windtalker commented 3 years ago

I suggest you try a simple query like select count(*) from lineitem to see if the basic functionality works.

ghost commented 3 years ago

I'm getting:

+----------------------------+------------+-----------+---------------------------------------------------------+-----------------------------------+
| id                         | estRows    | task      | access object                                           | operator info                     |
+----------------------------+------------+-----------+---------------------------------------------------------+-----------------------------------+
| StreamAgg_20               | 1.00       | root      |                                                         | funcs:count(Column#26)->Column#18 |
| └─IndexReader_21           | 1.00       | root      |                                                         | index:StreamAgg_8                 |
|   └─StreamAgg_8            | 1.00       | cop[tikv] |                                                         | funcs:count(1)->Column#26         |
|     └─IndexFullScan_19     | 6614591.00 | cop[tikv] | table:lineitem, index:PRIMARY(L_ORDERKEY, L_LINENUMBER) | keep order:false                  |
+----------------------------+------------+-----------+---------------------------------------------------------+-----------------------------------+
4 rows in set (0.00 sec)

zanmato1984 commented 3 years ago

Hi Yujin, welcome back.

You can refer to https://github.com/zanmato1984/tidb-hackathon-2020/blob/master/sql/gpu_settings.sql for the settings needed to run TiDB queries through CURA.

Please also be aware that our integration of CURA into TiDB is more of a demo than a product. We only adapted the operators and functions needed by q5, q9, q17, and q18. If you run other queries, you are very likely to see failures.

windtalker commented 3 years ago

By the way, if a table contains columns of string type (char, varchar), you need to build a customized TiKV to run CURA queries: https://github.com/windtalker/tikv/tree/tikv_cura
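To see which tables are affected by the string-type restriction, one could list the char and varchar columns from `information_schema.columns`. This is a minimal sketch: `string_columns_sql` is a hypothetical helper that only prints the query, and the schema name `test` in the usage line is taken from the script earlier in the thread.

```shell
# string_columns_sql SCHEMA: print a query that lists every char/varchar
# column in SCHEMA. The query is only printed, not executed.
string_columns_sql() {
    schema="$1"
    cat <<EOF
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = '${schema}'
  AND data_type IN ('char', 'varchar')
ORDER BY table_name, ordinal_position;
EOF
}
```

It could then be run with `mysql -h 127.0.0.1 -P 4000 -u root -e "$(string_columns_sql test)"`. If any table touched by the query shows up in the result, the customized TiKV build applies.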

ghost commented 3 years ago

For some reason I am getting a "Type not supported" error in CURA. I've configured my setup like this:

pd:
yujinkim@istanbul:~/tidb/pd$ bin/pd-server -log-file log.log

tikv:
yujinkim@istanbul:~/tidb/tikv$ target/release/tikv-server -f log.log

tidb:
(cura-dev) yujinkim@istanbul:~/tidb/tidb$ bin/tidb-server -config config/config.toml

The config.toml is set up like so:

# TiDB server host.
host = "0.0.0.0"

# tidb server advertise IP.
advertise-address = ""

# TiDB server port.
port = 4000

# Registered store name, [tikv, mocktikv, unistore]
store = "tikv"

# TiDB storage path.
# path = "/tmp/tidb"
path = "127.0.0.1:2379"

# (and so on)

I think I'm using the right versions:

tikv:
[2021/07/15 09:05:30.239 -06:00] [INFO] [lib.rs:92] ["Welcome to TiKV"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Release Version:   5.0.0-rc.x"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Edition:           Community"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Git Commit Hash:   34bd27c1b37c9cf0d242f21608c427659d6d5d23"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Git Commit Branch: tikv_cura"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["UTC Build Time:    2021-07-15 09:20:57"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Rust Version:      rustc 1.49.0-nightly (b1496c6e6 2020-10-18)"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Enable Features:   jemalloc mem-profiling portable sse protobuf-codec test-engines-rocksdb"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [lib.rs:97] ["Profile:           release"]
[2021/07/15 09:05:30.240 -06:00] [INFO] [mod.rs:64] ["memory limit in bytes: 540363530240, cpu cores quota: 96"]
[2021/07/15 09:05:30.240 -06:00] [WARN] [lib.rs:528] ["environment variable `TZ` is missing, using `/etc/localtime`"]
[2021/07/15 09:05:30.240 -06:00] [WARN] [server.rs:1030] ["check: kernel"] [err="kernel parameters net.core.somaxconn got 4096, expect 32768"]
[2021/07/15 09:05:30.240 -06:00] [WARN] [server.rs:1030] ["check: kernel"] [err="kernel parameters net.ipv4.tcp_syncookies got 1, expect 0"]
[2021/07/15 09:05:30.240 -06:00] [WARN] [server.rs:1030] ["check: kernel"] [err="kernel parameters vm.swappiness got 60, expect 0"]
[2021/07/15 09:05:30.241 -06:00] [INFO] [util.rs:394] ["connecting to PD endpoint"] [endpoints=127.0.0.1:2379]
[2021/07/15 09:05:30.242 -06:00] [INFO] [<unknown>] ["TCP_USER_TIMEOUT is available. TCP_USER_TIMEOUT will be used thereafter"]
[2021/07/15 09:05:30.242 -06:00] [INFO] [<unknown>] ["New connected subchannel at 0x7fcebc848150 for subchannel 0x7fcec08d3380"]
[2021/07/15 09:05:30.243 -06:00] [INFO] [util.rs:394] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2021/07/15 09:05:30.244 -06:00] [INFO] [<unknown>] ["New connected subchannel at 0x7fcebc048150 for subchannel 0x7fcec08d31c0"]
[2021/07/15 09:05:30.245 -06:00] [INFO] [util.rs:394] ["connecting to PD endpoint"] [endpoints=http://127.0.0.1:2379]
[2021/07/15 09:05:30.245 -06:00] [INFO] [<unknown>] ["New connected subchannel at 0x7fcebb848150 for subchannel 0x7fcec08d3380"]
[2021/07/15 09:05:30.246 -06:00] [INFO] [util.rs:458] ["connected to PD leader"] [endpoints=http://127.0.0.1:2379]
[2021/07/15 09:05:30.246 -06:00] [INFO] [util.rs:382] ["all PD endpoints are consistent"] [endpoints="[\"127.0.0.1:2379\"]"]
[2021/07/15 09:05:30.247 -06:00] [INFO] [server.rs:292] ["connect to PD cluster"] [cluster_id=6985090340049161098]
[2021/07/15 09:05:30.247 -06:00] [INFO] [config.rs:1968] ["readpool.storage.use-unified-pool is not set, set to true by default"]
[2021/07/15 09:05:30.248 -06:00] [INFO] [config.rs:1991] ["readpool.coprocessor.use-unified-pool is not set, set to true by default"]
[2021/07/15 09:05:30.248 -06:00] [INFO] [config.rs:191] ["no advertise-addr is specified, falling back to default addr"] [addr=127.0.0.1:20160]
[2021/07/15 09:05:30.248 -06:00] [INFO] [config.rs:216] ["no advertise-status-addr is specified, falling back to status-addr"] [status-addr=127.0.0.1:20180]
[2021/07/15 09:05:30.249 -06:00] [INFO] [server.rs:1038] ["beginning system configuration check"]
[2021/07/15 09:05:30.249 -06:00] [INFO] [config.rs:772] ["data dir"] [mount_fs="FsInfo { tp: \"ext4\", opts: \"rw,relatime,stripe=64\", mnt_dir: \"/\", fsname: \"/dev/sda4\" }"] [data_path=./]
[2021/07/15 09:05:30.249 -06:00] [WARN] [config.rs:775] ["not on SSD device"] [data_path=./]
[2021/07/15 09:05:30.249 -06:00] [INFO] [config.rs:772] ["data dir"] [mount_fs="FsInfo { tp: \"ext4\", opts: \"rw,relatime,stripe=64\", mnt_dir: \"/\", fsname: \"/dev/sda4\" }"] [data_path=/home/yujinkim/tidb/tikv/raft]
[2021/07/15 09:05:30.249 -06:00] [WARN] [config.rs:775] ["not on SSD device"] [data_path=/home/yujinkim/tidb/tikv/raft]
[2021/07/15 09:05:30.249 -06:00] [INFO] [server.rs:261] ["using config"] [config="{\"log-level\":\"info\",\"log-file\":\"log.log\",\"log-format\":\"text\",\"slow-log-file\":\"\",\"slow-log-threshold\":\"1s\",\"log-rotation-timespan\":\"1d\",\"log-rotation-size\":\"300MiB\",\"panic-when-unexpected-key-or-data\":false,\"enable-io-snoop\":true,\"readpool\":{\"unified\":{\"min-thread-count\":1,\"max-thread-count\":76,\"stack-size\":\"10MiB\",\"max-tasks-per-worker\":2000},\"storage\":{\"use-unified-pool\":true,\"high-concurrency\":8,\"normal-concurrency\":8,\"low-concurrency\":8,\"max-tasks-per-worker-high\":2000,\"max-tasks-per-worker-normal\":2000,\"max-tasks-per-worker-low\":2000,\"stack-size\":\"10MiB\"},\"coprocessor\":{\"use-unified-pool\":true,\"high-concurrency\":76,\"normal-concurrency\":76,\"low-concurrency\":76,\"max-tasks-per-worker-high\":2000,\"max-tasks-per-worker-normal\":2000,\"max-tasks-per-worker-low\":2000,\"stack-size\":\"10MiB\"}},\"server\":{\"addr\":\"127.0.0.1:20160\",\"advertise-addr\":\"127.0.0.1:20160\",\"status-addr\":\"127.0.0.1:20180\",\"advertise-status-addr\":\"127.0.0.1:20180\",\"status-thread-pool-size\":1,\"max-grpc-send-msg-len\":10485760,\"grpc-compression-type\":\"none\",\"grpc-concurrency\":5,\"grpc-concurrent-stream\":1024,\"grpc-raft-conn-num\":1,\"grpc-memory-pool-quota\":9223372036854775807,\"grpc-stream-initial-window-size\":\"2MiB\",\"grpc-keepalive-time\":\"10s\",\"grpc-keepalive-timeout\":\"3s\",\"concurrent-send-snap-limit\":32,\"concurrent-recv-snap-limit\":32,\"end-point-recursion-limit\":1000,\"end-point-stream-channel-size\":8,\"end-point-batch-row-limit\":64,\"end-point-stream-batch-row-limit\":128,\"end-point-enable-batch-if-possible\":true,\"end-point-request-max-handle-duration\":\"1m\",\"end-point-max-concurrency\":96,\"snap-max-write-bytes-per-sec\":\"100MiB\",\"snap-max-total-size\":\"0KiB\",\"stats-concurrency\":1,\"heavy-load-threshold\":300,\"heavy-load-wait-duration\":\"1ms\",\"enable-request-batch\":true,\
"background-thread-count\":3,\"end-point-slow-log-threshold\":\"1s\",\"labels\":{}},\"storage\":{\"data-dir\":\"./\",\"gc-ratio-threshold\":1.1,\"max-key-size\":4096,\"scheduler-concurrency\":524288,\"scheduler-worker-pool-size\":8,\"scheduler-pending-write-threshold\":\"100MiB\",\"reserve-space\":\"2GiB\",\"enable-async-apply-prewrite\":false,\"block-cache\":{\"shared\":true,\"capacity\":\"209203MiB\",\"num-shard-bits\":6,\"strict-capacity-limit\":false,\"high-pri-pool-ratio\":0.8,\"memory-allocator\":\"nodump\"}},\"pd\":{\"endpoints\":[\"127.0.0.1:2379\"],\"retry-interval\":\"300ms\",\"retry-max-count\":9223372036854775807,\"retry-log-every\":10,\"update-interval\":\"10m\"},\"metric\":{\"job\":\"tikv\"},\"raftstore\":{\"prevote\":true,\"raftdb-path\":\"/home/yujinkim/tidb/tikv/raft\",\"capacity\":\"0KiB\",\"raft-base-tick-interval\":\"1s\",\"raft-heartbeat-ticks\":2,\"raft-election-timeout-ticks\":10,\"raft-min-election-timeout-ticks\":10,\"raft-max-election-timeout-ticks\":20,\"raft-max-size-per-msg\":\"1MiB\",\"raft-max-inflight-msgs\":256,\"raft-entry-max-size\":\"8MiB\",\"raft-log-gc-tick-interval\":\"10s\",\"raft-log-gc-threshold\":50,\"raft-log-gc-count-limit\":73728,\"raft-log-gc-size-limit\":\"72MiB\",\"raft-log-reserve-max-ticks\":6,\"raft-engine-purge-interval\":\"10s\",\"raft-entry-cache-life-time\":\"30s\",\"raft-reject-transfer-leader-duration\":\"3s\",\"split-region-check-tick-interval\":\"10s\",\"region-split-check-diff\":\"6MiB\",\"region-compact-check-interval\":\"5m\",\"region-compact-check-step\":100,\"region-compact-min-tombstones\":10000,\"region-compact-tombstones-percent\":30,\"pd-heartbeat-tick-interval\":\"1m\",\"pd-store-heartbeat-tick-interval\":\"10s\",\"snap-mgr-gc-tick-interval\":\"1m\",\"snap-gc-timeout\":\"4h\",\"lock-cf-compact-interval\":\"10m\",\"lock-cf-compact-bytes-threshold\":\"256MiB\",\"notify-capacity\":40960,\"messages-per-tick\":4096,\"max-peer-down-duration\":\"5m\",\"max-leader-missing-duration\":\"2h\",\"abnormal-lead
er-missing-duration\":\"10m\",\"peer-stale-state-check-interval\":\"5m\",\"leader-transfer-max-log-lag\":10,\"snap-apply-batch-size\":\"10MiB\",\"consistency-check-interval\":\"0s\",\"report-region-flow-interval\":\"1m\",\"raft-store-max-leader-lease\":\"9s\",\"right-derive-when-split\":true,\"allow-remove-leader\":false,\"merge-max-log-gap\":10,\"merge-check-tick-interval\":\"10s\",\"use-delete-range\":false,\"cleanup-import-sst-interval\":\"10m\",\"local-read-batch-size\":1024,\"apply-max-batch-size\":256,\"apply-pool-size\":2,\"apply-reschedule-duration\":\"5s\",\"store-max-batch-size\":256,\"store-pool-size\":2,\"store-reschedule-duration\":\"5s\",\"future-poll-size\":1,\"hibernate-regions\":true,\"dev-assert\":false,\"apply-yield-duration\":\"500ms\",\"perf-level\":1},\"coprocessor\":{\"split-region-on-table\":false,\"batch-split-limit\":10,\"region-max-size\":\"144MiB\",\"region-split-size\":\"96MiB\",\"region-max-keys\":1440000,\"region-split-keys\":960000,\"consistency-check-method\":\"mvcc\",\"perf-level\":2},\"rocksdb\":{\"info-log-level\":\"info\",\"wal-recovery-mode\":2,\"wal-dir\":\"\",\"wal-ttl-seconds\":0,\"wal-size-limit\":\"0KiB\",\"max-total-wal-size\":\"4GiB\",\"max-background-jobs\":8,\"max-background-flushes\":2,\"max-manifest-file-size\":\"128MiB\",\"create-if-missing\":true,\"max-open-files\":40960,\"enable-statistics\":true,\"stats-dump-period\":\"10m\",\"compaction-readahead-size\":\"0KiB\",\"info-log-max-size\":\"1GiB\",\"info-log-roll-time\":\"0s\",\"info-log-keep-log-file-num\":10,\"info-log-dir\":\"\",\"rate-bytes-per-sec\":\"10GiB\",\"rate-limiter-refill-period\":\"100ms\",\"rate-limiter-mode\":2,\"rate-limiter-auto-tuned\":true,\"bytes-per-sync\":\"1MiB\",\"wal-bytes-per-sync\":\"512KiB\",\"max-sub-compactions\":3,\"writable-file-max-buffer-size\":\"1MiB\",\"use-direct-io-for-flush-and-compaction\":false,\"enable-pipelined-write\":true,\"enable-multi-batch-write\":true,\"enable-unordered-write\":false,\"defaultcf\":{\"block-size\":\"64
KiB\",\"block-cache-size\":\"128832MiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":true,\"optimize-filters-for-hits\":true,\"whole-key-filtering\":true,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-per-level\":[\"no\",\"no\",\"lz4\",\"lz4\",\"lz4\",\"zstd\",\"zstd\"],\"write-buffer-size\":\"128MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for-level-base\":\"512MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":4,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-trigger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":3,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":true,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"zstd\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"normal\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"writecf\":{\"block-size\":\"64KiB\",\"block-cache-size\":\"77299MiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":true,
\"optimize-filters-for-hits\":false,\"whole-key-filtering\":false,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-per-level\":[\"no\",\"no\",\"lz4\",\"lz4\",\"lz4\",\"zstd\",\"zstd\"],\"write-buffer-size\":\"128MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for-level-base\":\"512MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":4,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-trigger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":3,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":true,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"zstd\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"read-only\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"lockcf\":{\"block-size\":\"16KiB\",\"block-cache-size\":\"1GiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":true,\"optimize-filters-for-hits\":false,\"whole-key-filtering\":true,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-
per-level\":[\"no\",\"no\",\"no\",\"no\",\"no\",\"no\",\"no\"],\"write-buffer-size\":\"32MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for-level-base\":\"128MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":1,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-trigger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":0,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":false,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"disable\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"read-only\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"raftcf\":{\"block-size\":\"16KiB\",\"block-cache-size\":\"128MiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":true,\"optimize-filters-for-hits\":true,\"whole-key-filtering\":true,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-per-level\":[\"no\",\"no\",\"no\",\"no\",\"no\",\"no\",\"no\"],\"write-buffer-size\":\"128MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for
-level-base\":\"128MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":1,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-trigger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":0,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":false,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"disable\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"read-only\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"ver-defaultcf\":{\"block-size\":\"64KiB\",\"block-cache-size\":\"128832MiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":true,\"optimize-filters-for-hits\":true,\"whole-key-filtering\":true,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-per-level\":[\"no\",\"no\",\"lz4\",\"lz4\",\"lz4\",\"zstd\",\"zstd\"],\"write-buffer-size\":\"128MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for-level-base\":\"512MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":4,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-tr
igger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":3,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":false,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"zstd\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"normal\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"titan\":{\"enabled\":false,\"dirname\":\"\",\"disable-gc\":false,\"max-background-gc\":4,\"purge-obsolete-files-period\":\"10s\"}},\"raftdb\":{\"wal-recovery-mode\":2,\"wal-dir\":\"\",\"wal-ttl-seconds\":0,\"wal-size-limit\":\"0KiB\",\"max-total-wal-size\":\"4GiB\",\"max-background-jobs\":4,\"max-background-flushes\":1,\"max-manifest-file-size\":\"20MiB\",\"create-if-missing\":true,\"max-open-files\":40960,\"enable-statistics\":true,\"stats-dump-period\":\"10m\",\"compaction-readahead-size\":\"0KiB\",\"info-log-max-size\":\"1GiB\",\"info-log-roll-time\":\"0s\",\"info-log-keep-log-file-num\":10,\"info-log-dir\":\"\",\"info-log-level\":\"info\",\"max-sub-compactions\":2,\"writable-file-max-buffer-size\":\"1MiB\",\"use-direct-io-for-flush-and-compaction\":false,\"enable-pipelined-write\":true,\"enable-unordered-write\":false,\"allow-concurrent-memtable-write\":true,\"bytes-per-sync\":\"1MiB\",\
"wal-bytes-per-sync\":\"512KiB\",\"defaultcf\":{\"block-size\":\"64KiB\",\"block-cache-size\":\"2GiB\",\"disable-block-cache\":false,\"cache-index-and-filter-blocks\":true,\"pin-l0-filter-and-index-blocks\":true,\"use-bloom-filter\":false,\"optimize-filters-for-hits\":true,\"whole-key-filtering\":true,\"bloom-filter-bits-per-key\":10,\"block-based-bloom-filter\":false,\"read-amp-bytes-per-bit\":0,\"compression-per-level\":[\"no\",\"no\",\"lz4\",\"lz4\",\"lz4\",\"zstd\",\"zstd\"],\"write-buffer-size\":\"128MiB\",\"max-write-buffer-number\":5,\"min-write-buffer-number-to-merge\":1,\"max-bytes-for-level-base\":\"512MiB\",\"target-file-size-base\":\"8MiB\",\"level0-file-num-compaction-trigger\":4,\"level0-slowdown-writes-trigger\":20,\"level0-stop-writes-trigger\":36,\"max-compaction-bytes\":\"2GiB\",\"compaction-pri\":0,\"dynamic-level-bytes\":true,\"num-levels\":7,\"max-bytes-for-level-multiplier\":10,\"compaction-style\":0,\"disable-auto-compactions\":false,\"soft-pending-compaction-bytes-limit\":\"64GiB\",\"hard-pending-compaction-bytes-limit\":\"256GiB\",\"force-consistency-checks\":false,\"prop-size-index-distance\":4194304,\"prop-keys-index-distance\":40960,\"enable-doubly-skiplist\":true,\"enable-compaction-guard\":false,\"compaction-guard-min-output-file-size\":\"8MiB\",\"compaction-guard-max-output-file-size\":\"128MiB\",\"bottommost-level-compression\":\"disable\",\"bottommost-zstd-compression-dict-size\":0,\"bottommost-zstd-compression-sample-size\":0,\"titan\":{\"min-blob-size\":\"1KiB\",\"blob-file-compression\":\"lz4\",\"blob-cache-size\":\"0KiB\",\"min-gc-batch-size\":\"16MiB\",\"max-gc-batch-size\":\"64MiB\",\"discardable-ratio\":0.5,\"sample-ratio\":0.1,\"merge-small-file-threshold\":\"8MiB\",\"blob-run-mode\":\"normal\",\"level-merge\":false,\"range-merge\":true,\"max-sorted-runs\":20,\"gc-merge-rewrite\":false}},\"titan\":{\"enabled\":false,\"dirname\":\"\",\"disable-gc\":false,\"max-background-gc\":4,\"purge-obsolete-files-period\":\"10s\"}},\"raft-
engine\":{\"enable\":false,\"dir\":\"/home/yujinkim/tidb/tikv/raft-engine\",\"recovery-mode\":\"tolerate-corrupted-tail-records\",\"bytes-per-sync\":\"256KiB\",\"target-file-size\":\"128MiB\",\"purge-threshold\":\"10GiB\",\"cache-limit\":\"1GiB\"},\"security\":{\"ca-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\",\"cert-allowed-cn\":[],\"redact-info-log\":null,\"encryption\":{\"data-encryption-method\":\"plaintext\",\"data-key-rotation-period\":\"7d\",\"enable-file-dictionary-log\":true,\"file-dictionary-rewrite-threshold\":1000000,\"master-key\":{\"type\":\"plaintext\"},\"previous-master-key\":{\"type\":\"plaintext\"}}},\"import\":{\"num-threads\":8,\"stream-channel-window\":128,\"import-mode-timeout\":\"10m\"},\"backup\":{\"num-threads\":32,\"batch-size\":8},\"pessimistic-txn\":{\"wait-for-lock-timeout\":\"1s\",\"wake-up-delay-duration\":\"20ms\",\"pipelined\":true},\"gc\":{\"ratio-threshold\":1.1,\"batch-keys\":512,\"max-write-bytes-per-sec\":\"0KiB\",\"enable-compaction-filter\":false,\"compaction-filter-skip-version-check\":false},\"split\":{\"qps-threshold\":3000,\"split-balance-score\":0.25,\"split-contained-score\":0.5,\"detect-times\":10,\"sample-num\":20,\"sample-threshold\":100,\"size-threshold\":4194304,\"key-threshold\":40960},\"cdc\":{\"min-ts-interval\":\"1s\",\"old-value-cache-size\":1024,\"hibernate-regions-compatible\":true}}"]
[2021/07/15 09:05:30.251 -06:00] [ERROR] [server.rs:830] ["failed to init io snooper"] [err_code=KV:Unknown] [err="\"IO snooper is not started due to not compiling with BCC\""]
[2021/07/15 09:05:30.253 -06:00] [INFO] [mod.rs:115] ["encryption: none of key dictionary and file dictionary are found."]
[2021/07/15 09:05:30.253 -06:00] [INFO] [mod.rs:453] ["encryption is disabled."]
[2021/07/15 09:05:30.289 -06:00] [INFO] [future.rs:146] ["starting working thread"] [worker=gc-worker]
[2021/07/15 09:05:30.332 -06:00] [INFO] [mod.rs:205] ["Storage started."]

tidb:
[2021/07/15 09:14:24.345 -06:00] [INFO] [trackerRecorder.go:28] ["Mem Profile Tracker started"]
[2021/07/15 09:14:24.345 -06:00] [INFO] [printer.go:33] ["Welcome to TiDB."] ["Release Version"=v4.0.0-alpha-3820-g7d4c57b86] [Edition=Community] ["Git Commit Hash"=7d4c57b864b836bbd893484dc0a3347d9e7026e6] ["Git Branch"=tidb_cura] ["UTC Build Time"="2021-07-15 05:30:16"] [GoVersion=go1.16.5] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2021/07/15 09:14:24.347 -06:00] [INFO] [printer.go:47] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"138.67.212.13\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"127.0.0.1:2379\",\"socket\":\"\",\"lease\":\"45s\",\"run-ddl\":true,\"split-table\":true,\"token-limit\":1000,\"oom-use-tmp-storage\":true,\"tmp-storage-path\":\"/tmp/1023_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage\",\"oom-action\":\"cancel\",\"mem-quota-query\":107374182400,\"tmp-storage-quota\":-1,\"enable-streaming\":false,\"enable-batch-dml\":false,\"lower-case-table-names\":2,\"server-version\":\"\",\"log\":{\"level\":\"info\",\"format\":\"text\",\"disable-timestamp\":null,\"enable-timestamp\":null,\"disable-error-stack\":null,\"enable-error-stack\":null,\"file\":{\"filename\":\"/home/yujinkim/tidb/tidb/log.log\",\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"enable-slow-log\":true,\"slow-query-file\":\"tidb-slow.log\",\"slow-threshold\":300,\"expensive-threshold\":10000,\"query-log-max-len\":4096,\"record-plan-in-slow-log\":1},\"security\":{\"skip-grant-table\":false,\"ssl-ca\":\"\",\"ssl-cert\":\"\",\"ssl-key\":\"\",\"require-secure-transport\":false,\"cluster-ssl-ca\":\"\",\"cluster-ssl-cert\":\"\",\"cluster-ssl-key\":\"\",\"cluster-verify-cn\":null,\"spilled-file-encryption-method\":\"plaintext\"},\"status\":{\"status-host\":\"0.0.0.0\",\"metrics-addr\":\"\",\"status-port\":10080,\"metrics-interval\":15,\"report-status\":true,\"record-db-qps\":false},\"performance\":{\"max-procs\":0,\"max-memory\":0,\"server-memory-quota\":0,\"memory-usage-alarm-ratio\":0.8,\"stats-lease\":\"3s\",\"stmt-count-limit\":5000,\"feedback-probability\":0,\"query-feedback-limit\":512,\"pseudo-estimate-ratio\":0.8,\"force-priority\":\"NO_PRIORITY\",\"bind-info-lease\":\"3s\",\"txn-entry-size-limit\":6291456,\"txn-total-size-limit\":104857600,\"tcp-keep-alive\":true,\"cross-join\":true,\"run-auto-analyze\":true,\"agg-push-down-join\":false,\"committer-concurrency\":1
6,\"max-txn-ttl\":3600000,\"mem-profile-interval\":\"1m\",\"index-usage-sync-lease\":\"0s\",\"gogc\":100},\"prepared-plan-cache\":{\"enabled\":true,\"capacity\":100,\"memory-guard-ratio\":0.1},\"opentracing\":{\"enable\":false,\"rpc-metrics\":false,\"sampler\":{\"type\":\"const\",\"param\":1,\"sampling-server-url\":\"\",\"max-operations\":0,\"sampling-refresh-interval\":0},\"reporter\":{\"queue-size\":0,\"buffer-flush-interval\":0,\"log-spans\":false,\"local-agent-host-port\":\"\"}},\"proxy-protocol\":{\"networks\":\"\",\"header-timeout\":5},\"pd-client\":{\"pd-server-timeout\":3},\"tikv-client\":{\"grpc-connection-count\":4,\"grpc-keepalive-time\":10,\"grpc-keepalive-timeout\":3,\"grpc-compression-type\":\"none\",\"commit-timeout\":\"41s\",\"async-commit\":{\"keys-limit\":256,\"total-key-size-limit\":4096,\"SafeWindow\":2000000000,\"AllowedClockDrift\":500000000},\"max-batch-size\":128,\"overload-threshold\":200,\"max-batch-wait-time\":0,\"batch-wait-size\":8,\"enable-chunk-rpc\":true,\"region-cache-ttl\":600,\"store-limit\":0,\"store-liveness-timeout\":\"5s\",\"copr-cache\":{\"enable\":true,\"capacity-mb\":1000,\"admission-max-ranges\":500,\"admission-max-result-mb\":10,\"admission-min-process-ms\":5},\"ttl-refreshed-txn-size\":33554432},\"binlog\":{\"enable\":false,\"ignore-error\":false,\"write-timeout\":\"15s\",\"binlog-socket\":\"\",\"strategy\":\"range\"},\"compatible-kill-query\":false,\"plugin\":{\"dir\":\"\",\"load\":\"\"},\"pessimistic-txn\":{\"max-retry-count\":256},\"check-mb4-value-in-utf8\":true,\"max-index-length\":3072,\"index-limit\":64,\"table-column-count-limit\":1017,\"graceful-wait-before-shutdown\":0,\"alter-primary-key\":false,\"treat-old-version-utf8-as-utf8mb4\":true,\"enable-table-lock\":false,\"delay-clean-table-lock\":0,\"split-region-max-num\":1000,\"stmt-summary\":{\"enable\":true,\"enable-internal-query\":false,\"max-stmt-count\":200,\"max-sql-length\":4096,\"refresh-interval\":1800,\"history-size\":24},\"repair-mode\":false,\"repair-
table-list\":[],\"isolation-read\":{\"engines\":[\"tikv\",\"tiflash\",\"tidb\"]},\"max-server-connections\":0,\"new_collations_enabled_on_first_bootstrap\":false,\"experimental\":{\"allow-expression-index\":false,\"enable-global-kill\":false},\"enable-collect-execution-info\":true,\"skip-register-to-dashboard\":false,\"enable-telemetry\":true,\"labels\":{},\"enable-global-index\":false,\"deprecate-integer-display-length\":false,\"txn-scope\":\"global\",\"enable-enum-length-limit\":true,\"stores-refresh-interval\":60,\"enable-tcp4-only\":false}"]
[2021/07/15 09:14:24.347 -06:00] [INFO] [main.go:316] ["disable Prometheus push client"]
[2021/07/15 09:14:24.347 -06:00] [INFO] [store.go:68] ["new store"] [path=tikv://127.0.0.1:2379]
[2021/07/15 09:14:24.347 -06:00] [INFO] [client.go:193] ["[pd] create pd client with endpoints"] [pd-address="[127.0.0.1:2379]"]
[2021/07/15 09:14:24.347 -06:00] [INFO] [systime_mon.go:25] ["start system time monitor"]
[2021/07/15 09:14:24.350 -06:00] [INFO] [base_client.go:308] ["[pd] switch leader"] [new-leader=http://127.0.0.1:2379] [old-leader=]
[2021/07/15 09:14:24.350 -06:00] [INFO] [base_client.go:112] ["[pd] init cluster id"] [cluster-id=6985090340049161098]
[2021/07/15 09:14:24.351 -06:00] [INFO] [store.go:74] ["new store with retry success"]
[2021/07/15 09:14:24.358 -06:00] [INFO] [tidb.go:71] ["new domain"] [store=tikv-6985090340049161098] ["ddl lease"=45s] ["stats lease"=3s] ["index usage sync lease"=0s]
[2021/07/15 09:14:24.364 -06:00] [INFO] [ddl.go:328] ["[ddl] start DDL"] [ID=fa5a91bb-c1bf-4215-b12f-2f78038ee661] [runWorker=true]
...

In the log files, the TiKV branch is set to tikv_cura, and the TiDB branch is set to tidb_cura.

I'm also detecting some interaction between TiDB and TiKV in my PD logs.

So what could I have been doing wrong?

zanmato1984 commented 3 years ago

Could you give a little more detail about how you got "Type not supported" from cura? For example, which query was it, and how was the error reported?

ghost commented 3 years ago

Hi,

I've pieced some things together like so:

set tidb_mem_quota_query = 343597383681;

set tidb_cura_chunk_size = 4 * 1024 * 1024;
set tidb_enable_cura_exec = 1;
set tidb_cura_support = 127;
set tidb_cura_concurrent_input_source = 0;

set tidb_cura_mem_res_type = arena;
set tidb_cura_mem_res_size = 7000 * 1024 * 1024;
set tidb_cura_exclusive_default_mem_res = 1;

set tidb_cura_stream_concurrency = 3;

set tidb_cura_enable_bucket_agg=0;
set tidb_cura_stream_concurrency=3;

select
  nation,
  o_year,
  sum(amount) as sum_profit
from
  (
    select
      n_name as nation,
      extract(year from o_orderdate) as o_year,
      l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
    from
      part,
      supplier,
      lineitem,
      partsupp,
      orders,
      nation
    where
      s_suppkey = l_suppkey
      and ps_suppkey = l_suppkey
      and ps_partkey = l_partkey
      and p_partkey = l_partkey
      and o_orderkey = l_orderkey
      and s_nationkey = n_nationkey
      and p_name like '%dim%'
  ) as profit
group by
  nation,
  o_year
order by
  nation,
  o_year desc;

And then named this file path-0001.sql. So the execution script is:

time tiup bench tpch cleanup
time tiup bench tpch prepare --queries q5,q9,q17,q18

## Do an example query
time mysql -v -h 127.0.0.1 -P 4000 -u root test < path-0001.sql

The time command lets me see when each command ended and how much time was spent on it.

Here's the output:

yujinkim@istanbul:~/tidb/tidb-hackathon-2020$ ~/cura/regular-bench
--------------
set tidb_mem_quota_query = 343597383681
--------------

--------------
set tidb_cura_chunk_size = 4 * 1024 * 1024
--------------

--------------
set tidb_enable_cura_exec = 1
--------------

--------------
set tidb_cura_support = 127
--------------

--------------
set tidb_cura_concurrent_input_source = 0
--------------

--------------
set tidb_cura_mem_res_type = arena
--------------

--------------
set tidb_cura_mem_res_size = 7000 * 1024 * 1024
--------------

--------------
set tidb_cura_exclusive_default_mem_res = 1
--------------

--------------
set tidb_cura_stream_concurrency = 3
--------------

--------------
set tidb_cura_enable_bucket_agg=0
--------------

--------------
set tidb_cura_stream_concurrency=3
--------------

--------------
select
  nation,
  o_year,
  sum(amount) as sum_profit
from
  (
    select
      n_name as nation,
      extract(year from o_orderdate) as o_year,
      l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
    from
      part,
      supplier,
      lineitem,
      partsupp,
      orders,
      nation
    where
      s_suppkey = l_suppkey
      and ps_suppkey = l_suppkey
      and ps_partkey = l_partkey
      and p_partkey = l_partkey
      and o_orderkey = l_orderkey
      and s_nationkey = n_nationkey
      and p_name like '%dim%'
  ) as profit
group by
  nation,
  o_year
order by
  nation,
  o_year desc
--------------

ERROR 1105 (HY000) at line 17: Type not supported by Cura
Command exited with non-zero status 1
0.00user 0.01system 0:00.03elapsed 60%CPU (0avgtext+0avgdata 25172maxresident)k
0inputs+0outputs (0major+8612minor)pagefaults 0swaps

Line 17 in that file is the first select.

ghost commented 3 years ago

I don't know if this is useful, but the tidb program requires libcura.so at run time, while the tikv program doesn't.

windtalker commented 3 years ago

Type not supported by Cura means the query involves a type that Cura does not support. For example, Decimal is not supported by Cura for now.

zanmato1984 commented 3 years ago

Sorry, we forgot to mention that cura doesn't support the Decimal type yet (neither does cudf). Our tpch demo replaced all Decimal columns with Double.

If you just want to see some query running on cura, you can try something like:

select count(*) from lineitem l join orders o on l.l_orderkey = o.o_orderkey

Or, if you specifically want to try the tpch queries, you may have to recreate the tpch tables manually with the Decimal type changed to Double, and then use tiup to ingest the data.

Lastly, tikv doesn't require libcura.so - cura takes care of the computation that remains after the portions pushed down to tikv. This may also explain why you didn't see cura involved in the query select count(*) from lineitem - most of the computation is pushed down to tikv, and the rest is so simple that it doesn't require cura.

ghost commented 3 years ago

Hi, I hit upon a crash due to a CURA module. See the log here: https://pastebin.com/Gm5yVt5D.

CURA failure at: /home/yujinkim/cura/src/driver/driver.cpp:52: Memory resource size per thread couldn't be 0
CURA failure at: /home/yujinkim/cura/src/driver/driver.cpp:62: Buckets of bucket aggregate couldn't be 0
CURA failure at: /home/yujinkim/cura/src/driver/driver.cpp:52: Memory resource size per thread couldn't be 0
CURA failure at: /home/yujinkim/cura/src/driver/driver.cpp:62: Buckets of bucket aggregate couldn't be 0
free(): invalid pointer
SIGABRT: abort

Query (based on q5):

set tidb_mem_quota_query = 343597383681;

set tidb_cura_chunk_size = 4 * 1024 * 1024;
set tidb_enable_cura_exec = 1;
set tidb_cura_support = 127;
set tidb_cura_concurrent_input_source = 0;

set tidb_cura_mem_res_type = arena;
set tidb_cura_mem_res_size = 7000 * 1024 * 1024;
set tidb_cura_exclusive_default_mem_res = 1;

set tidb_cura_enable_bucket_agg=0;
set tidb_cura_stream_concurrency=2;

select
  n_name,
  sum(l_extendedprice * (1 - l_discount)) as revenue
from
  customer,
  orders,
  lineitem,
  supplier,
  nation,
  region
where
  c_custkey = o_custkey
  and l_orderkey = o_orderkey
  and l_suppkey = s_suppkey
  and c_nationkey = s_nationkey
  and s_nationkey = n_nationkey
  and n_regionkey = r_regionkey
  and r_name = 'MIDDLE EAST'
  and o_orderdate >= '1994-01-01'
  and o_orderdate < date_add('1994-01-01', interval '1' year)
group by
  n_name
order by
  revenue desc;

ghost commented 3 years ago

If anything, I used this little script to change the column data types:

use test;

set GLOBAL tidb_enable_change_column_type = ON;

alter table part modify p_retailprice Double;
alter table supplier modify s_acctbal Double;
alter table partsupp modify ps_supplycost Double;
alter table customer modify c_acctbal Double;
alter table orders modify o_totalprice Double;
alter table lineitem modify l_quantity Double;
alter table lineitem modify l_extendedprice Double;
alter table lineitem modify l_discount Double;
alter table lineitem modify l_tax Double;

zanmato1984 commented 3 years ago

  1. Could you supply us with your cura.log? It should be located in the same directory as tidb.log.
  2. Could you try the query select count(*) from lineitem l join orders o on l.l_orderkey = o.o_orderkey and see if it works?

ghost commented 3 years ago

1.

Here's the cura log:

2021/07/27 02:24:08.913  [info] Cura executor running with json plan: 
{"rels": [
{"rel_op": "InputSource","source_id":1, "schema": [{"type": "FLOAT64", "nullable": true}]},
{"rel_op": "Project", "exprs": [{"binary_op": "DIV", "operands": [{"col_ref": 0},{"type":"FLOAT64", "literal": 7}], "type": {"type": "FLOAT64", "nullable": true}}]}
]}
2021/07/27 02:24:08.913  [info] Cura config: mem res type: 0, mem res size: 0, mem res size per thread 0, bucket agg true, bucket agg buckets 16, exclusive default mem res: true
2021/07/27 02:24:10.604  [info] CURA GPU Plan
Project(Column#0 / 7.000000)
|- InputSource(1, schema: [FLOAT64(NULLABLE)])
Pipeline#0(final):
InputSource#0(1) - Project#1(Column#0 / 7.000000)
2021/07/27 02:24:10.611  [info] cura threads per pipeline: 3
2021/07/27 02:24:10.611  [info] Cura executor running with json plan: 
{"rels": [
{"rel_op": "InputSource","source_id":1, "schema": [{"type": "INT64", "nullable": false},{"type": "STRING", "nullable": false},{"type": "STRING", "nullable": false}]},
{"rel_op": "InputSource","source_id":2, "schema": [{"type": "INT64", "nullable": false},{"type": "FLOAT64", "nullable": true},{"type": "FLOAT64", "nullable": true}]},
{"rel_op": "HashJoin","type": "INNER", "condition":{"binary_op": "EQUAL", "operands": [{"col_ref": 0},{"col_ref": 3}], "type": {"type": "INT64", "nullable": false}},"build_side": "LEFT"}, {"rel_op": "Project", "exprs": [{"col_ref": 0},{"col_ref": 4},{"col_ref": 5}]},
{"rel_op": "InputSource","source_id":3, "schema": [{"type": "INT64", "nullable": true},{"type": "FLOAT64", "nullable": true},{"type": "INT64", "nullable": false}]},
{"rel_op": "Aggregate", "groups": [{"col_ref": 2}], "aggs":[{"agg":"SUM", "operands":[{"col_ref": 0}],"type":{"type": "INT64", "nullable": true}},{"agg":"SUM", "operands":[{"col_ref": 1}],"type":{"type": "INT64", "nullable": true}}]}, {"rel_op": "Project", "exprs": [{"binary_op": "DIV", "operands": [{"col_ref": 2},{"col_ref": 1}], "type": {"type": "FLOAT64", "nullable": true}},{"col_ref": 0}]},
{"rel_op": "HashJoin","type": "INNER", "condition":{"binary_op": "EQUAL", "operands": [{"col_ref": 0},{"col_ref": 4}], "type": {"type": "INT64", "nullable": false}},"build_side": "LEFT"}, {"rel_op": "Filter", "condition": {"binary_op": "LESS", "operands": [{"col_ref": 1},{"binary_op": "MUL", "operands": [{"type":"FLOAT64", "literal": 0.2},{"col_ref": 3}], "type": {"type": "FLOAT64", "nullable": true}}], "type": {"type": "BOOL8", "nullable": true}}}, {"rel_op": "Project", "exprs": [{"col_ref": 2}]}
]}
2021/07/27 02:24:10.611  [info] Cura config: mem res type: 0, mem res size: 0, mem res size per thread 0, bucket agg true, bucket agg buckets 16, exclusive default mem res: true
2021/07/27 02:24:10.617  [info] CURA GPU Plan
Project(Column#2)
|- Filter(Column#1 < 0.200000 * Column#3)
   |- HashJoin(INNER, build_side: LEFT, cond: Column#0 = Column#4)
      |- Project(Column#0, Column#4, Column#5)
      |  |- HashJoin(INNER, build_side: LEFT, cond: Column#0 = Column#3)
      |     |- InputSource(1, schema: [INT64, STRING, STRING])
      |     |- InputSource(2, schema: [INT64, FLOAT64(NULLABLE), FLOAT64(NULLABLE)])
      |- Project(Column#2 / Column#1, Column#0)
         |- Aggregate(groups: [Column#2], aggregates: [SUM(Column#0), SUM(Column#1)])
            |- InputSource(3, schema: [INT64(NULLABLE), FLOAT64(NULLABLE), INT64])
Pipeline#0:
InputSource#5(1) - HashJoinBuild#6(keys: [0]) - Terminal#13
Pipeline#1:
InputSource#0(3) - BucketAggregate#1(keys: [2], aggregations: [SUM(0), SUM(1)]) ------------------------------------------------------------------------------- Terminal#14
InputSource#4(2) - HashJoinProbe#7(type: INNER, build: 6, build_side: LEFT, keys: [0]) - Project#8(Column#0, Column#4, Column#5) - HashJoinBuild#9(keys: [0]) - |
Pipeline#2(final):
HeapSource#2(1) - Project#3(Column#2 / Column#1, Column#0) - HashJoinProbe#10(type: INNER, build: 9, build_side: LEFT, keys: [1]) - Filter#11(Column#1 < 0.200000 * Column#3) - Project#12(Column#2)
2021/07/27 02:24:10.633  [info] cura threads per pipeline: 3
2021/07/27 02:24:10.633  [info] pipeline 0 source 1 push thread elapsed: 351.893µs, rows: 0, mem: 0
2021/07/27 02:24:10.633  [info] pipeline 0 source 1 push thread elapsed: 389.573µs, rows: 0, mem: 0
2021/07/27 02:24:10.636  [info] pipeline 0 source 1 push elapsed 2.370729ms, rows 233, mem 11936
2021/07/27 02:24:10.636  [info] pipeline 0 source 1 push thread elapsed: 2.890713ms, rows: 233, mem: 11936
2021/07/27 02:24:10.636  [info] pipeline 0 source 1 push total elapsed 2.995164ms
2021/07/27 02:24:10.641  [info] pipeline 0 running elapsed: 7.60175ms

The log ends abruptly right here, as the server goes down.

The server now reports the error as a "bus error." It's similar to the invalid free() call in that it's caused by an invalid memory access (e.g. misalignment). See: https://gist.github.com/csm-yujinkim/15dfbe58451d161506c7511093f5581d

2.

select count(*) from lineitem l join orders o on l_orderkey = o_orderkey;

... works (I prepared this database using tiup bench tpch prepare). It gives me 6001215.

Also, I do detect some GPU usage.

[Screenshot: Capture]

Here, we see that two GPU processes have spawned.

ghost commented 3 years ago

Also, sorry for the very late reply. I have had some hard business to sort out...

zanmato1984 commented 3 years ago

Hi Yujin, thanks for sticking with us.

  1. For some reason, the tidb branch tidb_cura is not the one we ran our demo on, so it may be missing some fixes. I uploaded the binary used in our demo: https://drive.google.com/file/d/1_uvnUG_U_HCacHh4xcg-s2o9iXVl35cg/view?usp=drivesdk (built on x86-64, Ubuntu 18.04, go 1.15.2). You can try it locally.
  2. Make sure your tables are analyzed, by running analyze table <TABLE_NAME>.
  3. Could you give us the explain result of q5 and q17?

ghost commented 3 years ago

Sure, I’ll try that when I come back.

ghost commented 3 years ago

I still have some problems. For some reason, I'm getting this type conversion problem:

Unsupported modify column: type double not match origin decimal(15,2)

This happens whenever I issue a command like

alter table part modify p_retailprice Double;

The variable tidb_enable_change_column_type is already ON, and the same error happens with both tidb binaries (the one we were using and the one you provided). Why would this be the case?

zanmato1984 commented 3 years ago

I think in tidb 4.0, arbitrary column type modification is an experimental feature, meaning it may not be fully implemented or is problematic.

So I suggest pre-creating the tables with decimal types replaced by double, then using tiup to import the data (yes, tiup will skip table creation if it sees that the tables already exist). You can refer to this script, but remember to replace decimal with double: https://github.com/pingcap/tidb-bench/blob/master/tpch/dss.sql
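
A minimal sketch of the decimal-to-double rewrite suggested above, assuming the schema file declares its columns as DECIMAL(p,s). The function name decimal_to_double and the file name dss_double.sql are placeholders for illustration; they are not part of tiup or tidb-bench.

```shell
# Rewrite every DECIMAL(p,s) or DECIMAL(p) column type to DOUBLE.
# Reads SQL on stdin and writes the rewritten SQL to stdout.
# The match is case-insensitive via explicit character classes,
# so it works with both GNU and BSD sed.
decimal_to_double() {
  sed -E 's/[Dd][Ee][Cc][Ii][Mm][Aa][Ll][[:space:]]*\([[:space:]]*[0-9]+[[:space:]]*(,[[:space:]]*[0-9]+[[:space:]]*)?\)/DOUBLE/g'
}

# Hypothetical usage (file names are assumptions):
#   decimal_to_double < dss.sql > dss_double.sql
#   mysql -h 127.0.0.1 -P 4000 -u root test < dss_double.sql
```

After loading the rewritten schema, the tiup prepare step would then only need to ingest data into the existing tables.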

ghost commented 3 years ago

Yeah, I did that.

I used mysqldump to make a copy of the uploaded database, and then replaced the prepare stage with flashing TiDB with the cached database.

However, the problem was that, somehow, DECIMAL(15, 2) was "too much" for a DOUBLE datatype to handle. I was able to do an intermediate conversion from DECIMAL(15, 2) to DECIMAL(8, 2), then convert that to DOUBLE, and it worked.

For now, the only problem is the frequent crashing. In fact, it went through the whole benchmark only once, which is curious.
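
The two-step conversion described above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, and the table/column pairs are the ones from the conversion script posted earlier in this thread. Note that DECIMAL(8, 2) can hold at most 999999.99, so this assumes no value in those columns exceeds that.

```shell
# Emit the two ALTER statements for one column:
# first narrow it to DECIMAL(8,2), then change it to DOUBLE.
two_step_alter() {
  printf 'alter table %s modify %s decimal(8,2);\n' "$1" "$2"
  printf 'alter table %s modify %s double;\n' "$1" "$2"
}

# Generate the statements for every decimal column listed earlier in the thread.
gen_conversion_sql() {
  while read -r tbl col; do
    two_step_alter "$tbl" "$col"
  done <<'EOF'
part p_retailprice
supplier s_acctbal
partsupp ps_supplycost
customer c_acctbal
orders o_totalprice
lineitem l_quantity
lineitem l_extendedprice
lineitem l_discount
lineitem l_tax
EOF
}

# Hypothetical usage: gen_conversion_sql > turn-double.sql
```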

ghost commented 3 years ago

So I suggest pre-creating the tables with decimal types replaced by double, then using tiup to import the data (yes, tiup will skip table creation if it sees that the tables already exist). You can refer to this script, but remember to replace decimal with double: https://github.com/pingcap/tidb-bench/blob/master/tpch/dss.sql

I might try that, but since the type conversion is already solved, I'll set that solution aside for now. Thank you for your suggestion, however.

zanmato1984 commented 3 years ago

Yeah, I did that.

I used mysqldump to make a copy of the uploaded database, and then replaced the prepare stage with flashing TiDB with the cached database.

However, the problem was that, somehow, DECIMAL(15, 2) was "too much" for a DOUBLE datatype to handle. I was able to do an intermediate conversion from DECIMAL(15, 2) to DECIMAL(8, 2), then convert that to DOUBLE, and it worked.

For now, the only problem is the frequent crashing. In fact, it went through the whole benchmark only once, which is curious.

  1. So for now, the 4 queries did run without crashing once, meaning the crashing is random, right?
  2. My binary didn't help with the crashing, right?
  3. Could you do the following experiments for us?
     3.1 After tidb starts, open a mysql client.
     3.2 Explain each query several times (e.g. 4 or 5 times) and see if the plan changes. Either way, please paste the explain result (before and after the change, if there is one). I know this sounds crazy, but we did observe tidb giving unreasonable plans for a period right after it booted up.
     3.3 If you do see a plan change, run each query after the plan change and see if it crashes. Please give us tidb.log and cura.log if it crashes.

Thanks.
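
The repeated-explain experiment in step 3.2 could be automated with something like the sketch below. The helper name watch_plan is made up for illustration; it runs any command several times and reports the first run whose output differs from the first. The mysql invocation in the usage comment is an assumption based on the connection settings used earlier in this thread.

```shell
# Run a command N times and report whether its output ever changes.
# Usage: watch_plan N command [args...]
# Returns 0 if all runs print the same output, 1 on the first difference.
watch_plan() {
  wp_runs=$1
  shift
  wp_first=$("$@")
  wp_i=2
  while [ "$wp_i" -le "$wp_runs" ]; do
    wp_current=$("$@")
    if [ "$wp_current" != "$wp_first" ]; then
      echo "plan changed on run $wp_i"
      printf '%s\n' '--- before ---' "$wp_first" '--- after ---' "$wp_current"
      return 1
    fi
    wp_i=$((wp_i + 1))
  done
  echo "plan stable over $wp_runs runs"
}

# Hypothetical usage (replace ... with the query text):
#   watch_plan 5 mysql -h 127.0.0.1 -P 4000 -u root test -e "explain select ..."
```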

zanmato1984 commented 3 years ago

Meanwhile, we'll try to reproduce your issue locally. Please give us some time. Thanks.

ghost commented 3 years ago

Hi,

The scripts that we used are hosted in this repository:

https://github.com/csm-yujinkim/tidb-experiments

To use this, first:

# Check the repo out
git clone https://github.com/csm-yujinkim/tidb-experiments
cd tidb-experiments

# Initialize all the submodules
git submodule init
git submodule update

# Build things from source
./build.bash # takes 5-20 minutes

# If needed, close the shell and open again

# If needed, extract the custom version of `tidb-server` at the project root.

# Benchmark
./bench.bash 1 # scale factor ... ignored for now
./set-logs-aside.bash # ./bench.bash will call this script automatically; creates directory called 'cached-logs' that has previous logs checked by timestamps
./clean.bash # WIPE /tmp/*tidb*, and any accidental data spillage. ./bench.bash will call this script in the beginning as part of its operation; invoke manually if needed

ghost commented 3 years ago

Note that I'm using the newc0 branch.

ghost commented 3 years ago

Hi,

I've uploaded some logs to the repository I mentioned. It looks like there's a problem (bug?) with the pipeline_stream module in api.cpp.

For example, see https://github.com/csm-yujinkim/tidb-experiments/issues/11. The titles for the issues are UNIX timestamps.

zanmato1984 commented 3 years ago

Hi Yujin,

We have some findings:

  1. The very first crash, i.e. free(): invalid pointer, was caused by a defect in the tidb branch https://github.com/csm-yujinkim/tidb/tree/tidb_cura. Using the tidb binary I gave you later should fix it.
  2. The later crashes, i.e. pipeline_stream, might be caused by using the wrong tikv. Please point your tikv to this branch: https://github.com/csm-yujinkim/tikv/tree/tikv_cura.

Would you try again after you fix bullet 2 above?

Thanks.

ghost commented 3 years ago

Hi,

Thanks for the correspondence.

  1. I replaced the TiDB binary with the one you gave me.
  2. I also corrected which version of TiKV was being used.

This is a list of commit hashes that I'm using (abridged):

  1. TiDB: 7d4c57b86 (HEAD, origin/tidb_cura) a hack to avoid data copy when cop request hit cop cache
  2. TiKV: 34bd27c1b (HEAD, origin/tikv_cura) support cura

I'm still currently getting an error of the same sort.

See: https://github.com/csm-yujinkim/tidb-experiments/issues/13

zanmato1984 commented 3 years ago

We are looking into it.

Could you upload the corresponding tidb.log and cura.log for us? Thanks.

ghost commented 3 years ago

No problem.

tidb.log

cura.log

zanmato1984 commented 3 years ago

I was following your steps to try to reproduce the crash locally. However, I hit an issue when doing "turn double": ERROR 8200 (HY000): Unsupported modify column: type double not match origin decimal(8,2)

Did you see this as well and how did you get over it?

BTW I was using the uploaded tidb binary.

EDIT: never mind, my uploaded tidb were lacking some changes.

ghost commented 3 years ago

Yeah, the bench.bash is in constant need of tweaking.

Here are some usage descriptions of the various scripts:

ghost commented 3 years ago

I've pushed some of my latest changes into that repository. Hopefully things aren't too different.

zanmato1984 commented 3 years ago

We may have some clues, but need your input to confirm. Could you give us the explain results of all four queries? You can simply put explain in front of the SQL statement in each query file.
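A minimal sketch of that step, assuming each query file holds a single statement preceded by optional `--` comment lines (as the generated TPC-H queries in this thread do). The function name `add_explain` is made up for illustration and is not part of tidb or tiup.

```python
def add_explain(sql: str) -> str:
    """Prefix the first SQL statement with `explain`, leaving leading comments alone."""
    lines = sql.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip blank lines and `--` comments such as the RNG seed header.
        if stripped and not stripped.startswith("--"):
            lines[i] = "explain " + stripped
            break
    return "\n".join(lines)


if __name__ == "__main__":
    query = "-- using 1365545250 as a seed to the RNG\n\nselect\n    n_name\nfrom\n    nation;"
    print(add_explain(query))
```

The result can be piped to the mysql client in the same way as the original query file. Files that contain more than one statement would need extra handling.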

ghost commented 3 years ago

Hi,

Query 5:

id  estRows task    access object   operator info
Sort_23 12.50   root(cura)      Column#48:desc
└─Projection_25 12.50   root(cura)      tpch.nation.n_name, Column#48
  └─HashAgg_28  12.50   root(cura)      group by:Column#51, funcs:sum(Column#49)->Column#48, funcs:firstrow(Column#50)->tpch.nation.n_name
    └─Projection_75 30.52   root(cura)      mul(tpch.lineitem.l_extendedprice, minus(1, tpch.lineitem.l_discount))->Column#49, tpch.nation.n_name, tpch.nation.n_name
      └─IndexMergeJoin_38   30.52   root        inner join, inner:TableReader_33, outer key:tpch.orders.o_orderkey, inner key:tpch.lineitem.l_orderkey, other cond:eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey)
        ├─HashJoin_42(Build)    24.41   root(cura)      inner join, equal:[eq(tpch.customer.c_custkey, tpch.orders.o_custkey)]
        │ ├─HashJoin_44(Build)  19.53   root(cura)      inner join, equal:[eq(tpch.supplier.s_nationkey, tpch.customer.c_nationkey)]
        │ │ ├─HashJoin_46(Build)    15.62   root(cura)      inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)]
        │ │ │ ├─HashJoin_59(Build)  12.50   root(cura)      inner join, equal:[eq(tpch.region.r_regionkey, tpch.nation.n_regionkey)]
        │ │ │ │ ├─TableReader_64(Build) 10.00   root        data:Selection_63
        │ │ │ │ │ └─Selection_63    10.00   cop[tikv]       eq(tpch.region.r_name, "MIDDLE EAST")
        │ │ │ │ │   └─TableFullScan_62  10000.00    cop[tikv]   table:region    keep order:false, stats:pseudo
        │ │ │ │ └─TableReader_61(Probe) 10000.00    root        data:TableFullScan_60
        │ │ │ │   └─TableFullScan_60    10000.00    cop[tikv]   table:nation    keep order:false, stats:pseudo
        │ │ │ └─TableReader_66(Probe)   10000.00    root        data:TableFullScan_65
        │ │ │   └─TableFullScan_65  10000.00    cop[tikv]   table:supplier  keep order:false, stats:pseudo
        │ │ └─TableReader_68(Probe) 10000.00    root        data:TableFullScan_67
        │ │   └─TableFullScan_67    10000.00    cop[tikv]   table:customer  keep order:false, stats:pseudo
        │ └─TableReader_71(Probe)   250.00  root        data:Selection_70
        │   └─Selection_70  250.00  cop[tikv]       ge(tpch.orders.o_orderdate, 1994-01-01 00:00:00.000000), lt(tpch.orders.o_orderdate, 1995-01-01)
        │     └─TableFullScan_69    10000.00    cop[tikv]   table:orders    keep order:false, stats:pseudo
        └─TableReader_33(Probe) 1.00    root        data:TableRangeScan_32
          └─TableRangeScan_32   1.00    cop[tikv]   table:lineitem  range: decided by [tpch.supplier.s_suppkey tpch.orders.o_orderkey], keep order:true, stats:pseudo

Query 9:

id      estRows task    access object   operator info
Sort_25 8000.00 root(cura)              tpch.nation.n_name, Column#51:desc
└─Projection_27 8000.00 root(cura)              tpch.nation.n_name, Column#51, Column#53
  └─HashAgg_30  8000.00 root(cura)              group by:Column#51, tpch.nation.n_name, funcs:sum(Column#52)->Column#53, funcs:firstrow(tpch.nation.n_name)->tpch.nation.n_name, funcs:firstrow(Column#51)->Column#51
    └─Projection_31     15625.00        root(cura)              tpch.nation.n_name, extract(YEAR, tpch.orders.o_orderdate)->Column#51, minus(mul(tpch.lineitem.l_extendedprice, minus(1, tpch.lineitem.l_discount)), mul(tpch.partsupp.ps_supplycost, tpch.lineitem.l_quantity))->Column#52
      └─IndexMergeJoin_42       15625.00        root            inner join, inner:TableReader_37, outer key:tpch.lineitem.l_partkey, inner key:tpch.part.p_partkey
        ├─HashJoin_55(Build)    15625.00        root(cura)              inner join, equal:[eq(tpch.lineitem.l_orderkey, tpch.orders.o_orderkey)]
        │ ├─TableReader_98(Build)       10000.00        root            data:TableFullScan_97
        │ │ └─TableFullScan_97  10000.00        cop[tikv]       table:orders    keep order:false, stats:pseudo
        │ └─HashJoin_74(Probe)  12500.00        root(cura)              inner join, equal:[eq(tpch.lineitem.l_suppkey, tpch.partsupp.ps_suppkey) eq(tpch.lineitem.l_partkey, tpch.partsupp.ps_partkey)]
        │   ├─TableReader_96(Build)     10000.00        root            data:TableFullScan_95
        │   │ └─TableFullScan_95        10000.00        cop[tikv]       table:partsupp  keep order:false, stats:pseudo
        │   └─HashJoin_76(Probe)        10000.00        root(cura)              inner join, equal:[eq(tpch.supplier.s_suppkey, tpch.lineitem.l_suppkey)]
        │     ├─TableReader_94(Build)   10000.00        root            data:TableFullScan_93
        │     │ └─TableFullScan_93      10000.00        cop[tikv]       table:lineitem  keep order:false, stats:pseudo
        │     └─HashJoin_88(Probe)      10000.00        root(cura)              inner join, equal:[eq(tpch.nation.n_nationkey, tpch.supplier.s_nationkey)]
        │       ├─TableReader_92(Build) 25.00   root            data:TableFullScan_91
        │       │ └─TableFullScan_91    25.00   cop[tikv]       table:nation    keep order:false, stats:pseudo
        │       └─TableReader_90(Probe) 10000.00        root            data:TableFullScan_89
        │         └─TableFullScan_89    10000.00        cop[tikv]       table:supplier  keep order:false
        └─TableReader_37(Probe) 0.80    root            data:Selection_36
          └─Selection_36        0.80    cop[tikv]               like(tpch.part.p_name, "%dim%", 92)
            └─TableRangeScan_35 1.00    cop[tikv]       table:part      range: decided by [tpch.lineitem.l_partkey], keep order:true

Query 17:

id      estRows task    access object   operator info
Projection_16   1.00    root(cura)              div(Column#44, 7)->Column#45
└─StreamAgg_21  1.00    root            funcs:sum(tpch.lineitem.l_extendedprice)->Column#44
  └─HashJoin_24 0.25    root(cura)              inner join, equal:[eq(tpch.part.p_partkey, tpch.lineitem.l_partkey)], other cond:lt(tpch.lineitem.l_quantity, mul(0.2, Column#42))
    ├─IndexMergeJoin_35(Build)  0.25    root            inner join, inner:TableReader_30, outer key:tpch.lineitem.l_partkey, inner key:tpch.part.p_partkey
    │ ├─TableReader_39(Build)   10000.00        root            data:TableFullScan_38
    │ │ └─TableFullScan_38      10000.00        cop[tikv]       table:lineitem  keep order:false, stats:pseudo
    │ └─TableReader_30(Probe)   0.00    root            data:Selection_29
    │   └─Selection_29  0.00    cop[tikv]               eq(tpch.part.p_brand, "Brand#44"), eq(tpch.part.p_container, "WRAP PKG")
    │     └─TableRangeScan_28   1.00    cop[tikv]       table:part      range: decided by [tpch.lineitem.l_partkey], keep order:true
    └─HashAgg_47(Probe) 8000.00 root(cura)              group by:tpch.lineitem.l_partkey, funcs:avg(Column#48, Column#49)->Column#42, funcs:firstrow(tpch.lineitem.l_partkey)->tpch.lineitem.l_partkey
      └─TableReader_48  8000.00 root            data:HashAgg_43
        └─HashAgg_43    8000.00 cop[tikv]               group by:tpch.lineitem.l_partkey, funcs:count(tpch.lineitem.l_quantity)->Column#48, funcs:sum(tpch.lineitem.l_quantity)->Column#49
          └─TableFullScan_46    10000.00        cop[tikv]       table:lineitem  keep order:false, stats:pseudo

Query 18:

id      estRows task    access object   operator info
Projection_26   100.00  root(cura)              tpch.customer.c_name, tpch.customer.c_custkey, tpch.orders.o_orderkey, tpch.orders.o_orderdate, tpch.orders.o_totalprice, Column#52
└─TopN_29       100.00  root(cura)              tpch.orders.o_totalprice:desc, tpch.orders.o_orderdate, offset:0, count:100
  └─HashAgg_35  10000.00        root(cura)              group by:tpch.customer.c_custkey, tpch.customer.c_name, tpch.orders.o_orderdate, tpch.orders.o_orderkey, tpch.orders.o_totalprice, funcs:sum(tpch.lineitem.l_quantity)->Column#52, funcs:firstrow(tpch.customer.c_custkey)->tpch.customer.c_custkey, funcs:firstrow(tpch.customer.c_name)->tpch.customer.c_name, funcs:firstrow(tpch.orders.o_orderkey)->tpch.orders.o_orderkey, funcs:firstrow(tpch.orders.o_totalprice)->tpch.orders.o_totalprice, funcs:firstrow(tpch.orders.o_orderdate)->tpch.orders.o_orderdate
    └─IndexMergeJoin_44 10000.00        root            inner join, inner:TableReader_39, outer key:tpch.orders.o_custkey, inner key:tpch.customer.c_custkey
      ├─HashJoin_48(Build)      10000.00        root(cura)              inner join, equal:[eq(tpch.orders.o_orderkey, tpch.lineitem.l_orderkey)]
      │ ├─Selection_97(Build)   6400.00 root(cura)              gt(Column#50, 314)
      │ │ └─HashAgg_104 8000.00 root(cura)              group by:tpch.lineitem.l_orderkey, funcs:sum(Column#55)->Column#50, funcs:firstrow(tpch.lineitem.l_orderkey)->tpch.lineitem.l_orderkey
      │ │   └─TableReader_105   8000.00 root            data:HashAgg_98
      │ │     └─HashAgg_98      8000.00 cop[tikv]               group by:tpch.lineitem.l_orderkey, funcs:sum(tpch.lineitem.l_quantity)->Column#55
      │ │       └─TableFullScan_103     10000.00        cop[tikv]       table:lineitem  keep order:false, stats:pseudo
      │ └─MergeJoin_72(Probe)   12500.00        root            inner join, left key:tpch.orders.o_orderkey, right key:tpch.lineitem.l_orderkey
      │   ├─TableReader_63(Build)       10000.00        root            data:TableFullScan_62
      │   │ └─TableFullScan_62  10000.00        cop[tikv]       table:lineitem  keep order:true, stats:pseudo
      │   └─TableReader_61(Probe)       10000.00        root            data:TableFullScan_60
      │     └─TableFullScan_60  10000.00        cop[tikv]       table:orders    keep order:true, stats:pseudo
      └─TableReader_39(Probe)   1.00    root            data:TableRangeScan_38
        └─TableRangeScan_38     1.00    cop[tikv]       table:customer  range: decided by [tpch.orders.o_custkey], keep order:true

zanmato1984 commented 3 years ago

Hi Yujin, we roughly located the problem:

  1. Due to the small data set (tpch 1G, whereas we used 50G for the demo in our article), tidb tends to use index join, which is not yet supported by cura;
  2. An unsupported operator, e.g. index join, breaks the whole plan into several cura sub-plans. As you can see in your explain results, there are "root" tasks splitting the "root(cura)" tasks, whereas in our demo we tuned the plan to be one complete cura plan;
  3. We have bugs in passing data between cura sub-plans.

So before proceeding to fix the bugs, I'm wondering what your expectations are for this experiment.

If you expect a perf comparison between GPU and CPU as in our article, you may want to go straight to a larger data set, i.e. tpch 20G or 50G. That way the "cura sub-plan" issue may be worked around.

If you expect this cura-empowered tidb to be used in production, we can't promise to take our effort that far. We may fix the "cura sub-plan" issue, but there are far too many issues to address before cura-empowered tidb is production level.
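To check a plan for this pattern without reading the tree by eye, something like the following may help. It is only a sketch: it assumes the explain text layout pasted above (two characters of tree drawing per nesting level, then id, estRows and task as the first three whitespace-separated fields), and `parse_plan` and `splitting_operators` are names invented here, not tidb or cura APIs. It reports plain "root" operators that have a "root(cura)" operator both above and below them.

```python
import re


def parse_plan(text):
    """Return (depth, operator id, task) for every operator row of an explain dump."""
    nodes = []
    for line in text.splitlines():
        first_letter = re.search(r"[A-Za-z]", line)
        if first_letter is None:
            continue
        fields = line[first_letter.start():].split()
        if len(fields) < 3 or fields[0] == "id":  # skip the header row
            continue
        op_id = re.sub(r"\((Build|Probe)\)$", "", fields[0])
        # Assumption: each nesting level is drawn with two characters.
        nodes.append((first_letter.start() // 2, op_id, fields[2]))
    return nodes


def splitting_operators(text):
    """List plain `root` operators sitting between two `root(cura)` operators."""
    nodes = parse_plan(text)
    result = []
    stack = []  # (depth, task) of the ancestors of the current row
    for i, (depth, op_id, task) in enumerate(nodes):
        while stack and stack[-1][0] >= depth:
            stack.pop()
        cura_above = any(t == "root(cura)" for _, t in stack)
        cura_below = False
        for d, _, t in nodes[i + 1:]:
            if d <= depth:  # left this operator's subtree
                break
            if t == "root(cura)":
                cura_below = True
                break
        if task == "root" and cura_above and cura_below:
            result.append(op_id)
        stack.append((depth, task))
    return result
```

An empty list would suggest that the cura part of the plan is one piece. Whether that is enough to avoid the sub-plan bugs is not something this check can confirm.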

What do you think?

ghost commented 3 years ago

Hi,

We're looking at applying our research at the terabyte scale (at least 1 TiB or so), so we would certainly use more than 50G.

On the other hand, I had set the scale factor (?) to 50 in the tbl target of tidb-bench/tpch/Makefile:

diff --git a/tpch/Makefile b/tpch/Makefile
index e5a2621..7d86c48 100644
--- a/tpch/Makefile
+++ b/tpch/Makefile
@@ -1,5 +1,5 @@
 tbl: dbgen
-       cd dbgen && ./dbgen -s 1
+       cd dbgen && ./dbgen -s 50
 dbgen:
        cd dbgen; make;
 load:

Then, when I examine the nohup.out output, I notice that the transactions keep failing:

ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857712
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857663
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857769
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857627
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857660
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857627
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857660
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857627
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857660
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857627
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857660
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 8028 (HY000) at line 1: Information schema is changed during the execution of the statement(for example, table definition may be updated by other DDL ran in parallel). If you see this error often, try increasing `tidb_max_delta_schema_count`. [try again later]
ERROR 8028 (HY000) at line 1: Information schema is changed during the execution of the statement(for example, table definition may be updated by other DDL ran in parallel). If you see this error often, try increasing `tidb_max_delta_schema_count`. [try again later]
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857627
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857660
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857739
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERRORERROR 2013 (HY000) at line 1 2013 (HY000): Lost connection to MySQL server during query
 at line 1: Lost connection to MySQL server during query
ERROR 8028 (HY000) at line 1: Information schema is changed during the execution of the statement(for example, table definition may be updated by other DDL ran in parallel). If you see this error often, try increasing `tidb_max_delta_schema_count`. [try again later]
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query
ERRORERROR 2013 (HY000) 2013 (HY000) at line 1 at line 1: Lost connection to MySQL server during query
: Lost connection to MySQL server during query
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query

And as such my queries (Q9) fail. How may I solve this problem?
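For reference, the sizes reported above all sit just over 100 MiB (100 * 1024 * 1024 = 104857600 bytes). A small sketch that pulls them out of a log, assuming the limit being hit is TiDB's default transaction size limit of 100 MiB (an assumption, not something the log states) and using a made-up function name `txn_overshoots`:

```python
import re

# Assumption: the limit is TiDB's default of 100 MiB per transaction.
TXN_LIMIT_BYTES = 100 * 1024 * 1024


def txn_overshoots(log_text):
    """Return how many bytes each 'Transaction is too large' error exceeds the limit by."""
    sizes = re.findall(r"Transaction is too large, size: (\d+)", log_text)
    return [int(size) - TXN_LIMIT_BYTES for size in sizes]


if __name__ == "__main__":
    sample = "ERROR 8004 (HY000) at line 1: Transaction is too large, size: 104857712"
    print(txn_overshoots(sample))
```

Every overshoot in the log above is under 200 bytes, which is consistent with each bulk load being cut off right at the limit.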

zanmato1984 commented 3 years ago

We hit that issue too. We used tiup to work around it:

tiup bench tpch prepare --db tpch_50 --sf 50 --threads 32 --analyze

Note that --analyze is necessary for the planner to produce a reasonable plan.

But it really takes a lot of time. Give it a day or two.

And may I ask whether you are going to use tidb, or some other engine that is yet to be integrated with cura?

ghost commented 3 years ago

I see what you mean. I'll try that. However, doesn't that mean that the data type is going to be DECIMAL? Would we still have to convert it to DOUBLE?

We're using TiDB as a reproducible case study of using the GPU to do database operations. Ultimately, the goal is to find different yet memory-intensive (as opposed to compute-intensive) workloads for the GPU, and CURA is a good example for us.

zanmato1984 commented 3 years ago

I see what you mean. I'll try that. However, doesn't that mean that the data type is going to be DECIMAL? Would we still have to convert it to DOUBLE?

Right, sorry, I forgot about that. We modified the schema file https://github.com/pingcap/tidb-bench/blob/master/tpch/dss.sql by replacing DECIMAL(*) with DOUBLE (and the database name TPCH with TPCH_50), then created the database and tables manually. Tiup skips table creation if the tables already exist, so the data goes into the DOUBLE columns without any trouble.
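A rough sketch of that schema edit, assuming the DECIMAL columns are written as DECIMAL(p,s) and the database name appears as a standalone word TPCH. The function name `to_double_schema` is invented for illustration, and the exact contents of dss.sql should be checked before relying on it.

```python
import re


def to_double_schema(sql: str, db_name: str = "TPCH_50") -> str:
    """Swap DECIMAL(...) column types for DOUBLE and rename the TPCH database."""
    # DECIMAL(15,2), DECIMAL (8, 2), decimal(15,2) ... all become DOUBLE.
    sql = re.sub(r"DECIMAL\s*\([^)]*\)", "DOUBLE", sql, flags=re.IGNORECASE)
    # \b keeps an already renamed TPCH_50 untouched, since "_" is a word character.
    return re.sub(r"\bTPCH\b", db_name, sql, flags=re.IGNORECASE)


if __name__ == "__main__":
    schema = "USE TPCH;\nCREATE TABLE PART (P_RETAILPRICE DECIMAL(15,2) NOT NULL);"
    print(to_double_schema(schema))
```

The output can then be fed to the mysql client to create the database and tables before running tiup bench tpch prepare.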

We're using TiDB as a reproducible case study of using the GPU to do database operations. Ultimately, the goal is to find different yet memory-intensive (as opposed to compute-intensive) workloads for the GPU, and CURA is a good example for us.

Got you. In the meantime we'll try fixing the bugs mentioned before so that you can run on a small data set.