cockroachdb / loadgen

CockroachDB load generators
Apache License 2.0

Can not load cockroachdb with loadgen-tpcc #176

Open arruw opened 6 years ago

arruw commented 6 years ago

Here are the steps I am executing:

1. Create Database "loadgentpc"

$ docker run --rm -it --network cockroachdb-network cockroachdb/cockroach sql --insecure --url postgresql://root@roach1:26257?sslmode=disable --execute "CREATE DATABASE loadgentpc;"
# Server version: CockroachDB CCL v1.1.6 (linux amd64, built 2018/03/12 17:58:05, go1.8.3) (same version as client)
# Cluster ID: 32c0b615-7d26-4169-a63c-ceb6ef56d6ff
CREATE DATABASE

2. Check if DB was created

$ docker run --rm -it --network cockroachdb-network cockroachdb/cockroach sql --insecure --url postgresql://root@roach1:26257?sslmode=disable --execute "SHOW DATABASES;"
# Server version: CockroachDB CCL v1.1.6 (linux amd64, built 2018/03/12 17:58:05, go1.8.3) (same version as client)
# Cluster ID: 32c0b615-7d26-4169-a63c-ceb6ef56d6ff
+--------------------+
|      Database      |
+--------------------+
| crdb_internal      |
| information_schema |
| loadgentpc         |
| pg_catalog         |
| system             |
+--------------------+
(5 rows)

3. Load DB with loadgen-tpcc

$ docker run --rm -it --network cockroachdb-network cockroachdb/loadgen-tpcc postgresql://root@roach1:26257/loadgentpc?sslmode=disable -drop -load -v -tolerate-errors
_time______opName__ops/s(inst)__ops/s(cum)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
   1s    newOrder          0.0         0.0      0.0      0.0      0.0      0.0
   2s    newOrder          0.0         0.0      0.0      0.0      0.0      0.0
   3s    newOrder          0.0         0.0      0.0      0.0      0.0      0.0
2018/03/24 16:06:58 error in payment: dial tcp 127.0.0.1:26257: getsockopt: connection refused

This step throws an error. I am not sure why it is trying to connect to localhost.

jordanlewis commented 6 years ago

@matjazmav I think this is probably an issue with the way you're invoking the tool. Can you try moving the connection string to the end of the command arguments and enclosing it in quotes? I haven't seen this problem before.
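
One possible explanation for why argument order matters, assuming the tool parses its arguments with Go's standard flag package (nothing in this thread confirms that): flag parsing stops at the first non-flag argument, so any flags placed after the URL are never seen. The sketch below is a hypothetical stand-in, not the loadgen-tpcc source; the parseArgs helper is made up and only the -load and -drop flag names are borrowed from the command above.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseArgs mimics a loader CLI built on Go's standard flag package.
// Hypothetical helper: the real tool's parsing code is not shown in this thread.
func parseArgs(args []string) (load, drop bool, positional []string, err error) {
	fs := flag.NewFlagSet("loadgen-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the demo output
	fs.BoolVar(&load, "load", false, "load the dataset before running")
	fs.BoolVar(&drop, "drop", false, "drop existing tables first")
	err = fs.Parse(args)
	return load, drop, fs.Args(), err
}

func main() {
	url := "postgresql://root@roach1:26257/loadgentpc?sslmode=disable"

	// URL first: parsing stops at the URL, so -drop and -load stay false
	// and all three arguments are left over as positionals.
	load, drop, rest, _ := parseArgs([]string{url, "-drop", "-load"})
	fmt.Println(load, drop, len(rest)) // prints: false false 3

	// Flags first, URL last: both flags are set and the URL is the only positional.
	load, drop, rest, _ = parseArgs([]string{"-drop", "-load", url})
	fmt.Println(load, drop, len(rest)) // prints: true true 1
}
```

This would be consistent with the output above showing newOrder statistics instead of load progress, as if -load had been ignored. Quoting the URL is a separate precaution, since characters such as ? and & are special to the shell.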

arruw commented 6 years ago

@jordanlewis Thanks, that was it.

It took about 1.5-2 hours to load the database, with one error at the end. Is this normal? It seems too slow to me. The generated database is only 204.5 MiB in size.

$ docker run --rm -it --network cockroachdb-network cockroachdb/loadgen-tpcc -drop -load -v -tolerate-errors "postgresql://root@proxy:26257/loadgentpcc?sslmode=disable"
Starting TPC-C load generator
connecting to db: postgresql://root@proxy:26257/loadgentpcc?sslmode=disable
Created 9 tables
Loaded 100000/100000 items
TPCCLoadItem      100000            256092.4 ns/op
Loading warehouse 1/1
Loaded 100000/100000 stocks
TPCCLoadStock     100000            902623.6 ns/op
Loading district 1/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           1582308.0 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         175797725.1 ns/op
Loading district 2/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2062199.8 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         186944014.4 ns/op
Loading district 3/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2458609.4 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         179290871.9 ns/op
Loading district 4/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           1918442.6 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         179237505.8 ns/op
Loading district 5/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2539176.7 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         179815692.5 ns/op
Loading district 6/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           1621781.0 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         175710451.3 ns/op
Loading district 7/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2135898.3 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         186140439.3 ns/op
Loading district 8/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           1848584.3 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         173663109.2 ns/op
Loading district 9/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2129434.3 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         193399276.9 ns/op
Loading district 10/10...
Loaded 3000/3000 customers
TPCCLoadCustomer            3000           2347925.5 ns/op
Loaded 3000/3000 orders
TPCCLoadOrder       3000         181038431.3 ns/op
Loaded warehouse in 5595.1s
panic: Couldn't exec create index customer_idx on customer (c_w_id, c_d_id, c_last, c_first): pq: duplicate: index "customer_idx" in the middle of being added, not yet public
goroutine 1 [running]:
main.loadSchema(0xc4200900a0, 0xc420100100)
        /go/src/github.com/cockroachdb/loadgen/tpcc/ddls.go:249 +0x497
main.main()
        /go/src/github.com/cockroachdb/loadgen/tpcc/main.go:128 +0x258
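
As a rough check on where the time goes, the order rows alone can be totaled from the log above. This is a sketch that assumes each ns/op figure is the average wall-clock time per row, which the log does not state explicitly; the function and variable names are made up for illustration.

```go
package main

import "fmt"

// orderNsPerOp holds the ten TPCCLoadOrder figures from the log above,
// one per district, in nanoseconds per operation.
var orderNsPerOp = []float64{
	175797725.1, 186944014.4, 179290871.9, 179237505.8, 179815692.5,
	175710451.3, 186140439.3, 173663109.2, 193399276.9, 181038431.3,
}

const (
	ordersPerDistrict = 3000   // "Loaded 3000/3000 orders"
	warehouseSeconds  = 5595.1 // "Loaded warehouse in 5595.1s"
)

// orderLoadSeconds sums ns/op * ops across districts and converts to seconds.
// Assumption: ns/op is an average per row, so ns/op * ops is the elapsed time.
func orderLoadSeconds(nsPerOp []float64, ops int) float64 {
	total := 0.0
	for _, ns := range nsPerOp {
		total += ns * float64(ops)
	}
	return total / 1e9
}

func main() {
	s := orderLoadSeconds(orderNsPerOp, ordersPerDistrict)
	fmt.Printf("orders: %.0f s of %.0f s (%.0f%%)\n",
		s, warehouseSeconds, 100*s/warehouseSeconds)
	// prints: orders: 5433 s of 5595 s (97%)
}
```

Under that assumption, nearly all of the warehouse load time is spent inserting orders, at roughly 180 ms per order.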

Here is a screenshot of the admin dashboard. SQL 99th percentile latency averages around 2s, and the 90th percentile is around 1s. I know this depends on hardware; in my case I am running 3 nodes, one per machine, in Docker Swarm. Each machine has 4 cores, 4 GB RAM, 8 GB swap, a non-SSD disk, and 100baseT/Full networking (old PCs). I was expecting better performance. Any idea how to boost this?

[image: admin dashboard screenshot]

jordanlewis commented 6 years ago

No, that's very slow. That loader program isn't designed to be high throughput on a cluster, though. It would be better to generate the data as a CSV and import using the higher efficiency distributed IMPORT command that cockroach supports. We've implemented something like this in the workload command inside the main cockroach repo, as well. This repo is being phased out as a result.

This is all under fairly active development, so YMMV, but check out cockroachdb/cockroach and run make bin/workload to get started. It has a decent CLI help menu, and it does support very fast fixture creation and loading using the (enterprise) distributed backup and restore features. We've loaded ~terabyte TPCC datasets onto ~30 node clusters in just a couple of hours using this tool.

We should be able to support a non-enterprise loading of data using just the distributed import feature as well, now that I think about it. I'll file an issue about that in our main repo.

Sorry for our mess around here - we're going to be publishing more information on our tpcc implementation and how to benchmark Cockroach with it within the next few weeks, on our blog. Keep an eye out.
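
For the CSV route mentioned above, generating the file is the easy part. Here is a minimal sketch for the item table, with several assumptions: the column order (i_id, i_im_id, i_name, i_price, i_data) is taken from the TPC-C specification rather than from this repo's schema, the values are deterministic filler rather than spec-compliant random data, and itemCSV is a made-up helper. The IMPORT statement itself is left out because its syntax depends on the CockroachDB version.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strconv"
)

// itemCSV writes n placeholder rows for a TPC-C style item table as CSV.
// Assumed column order: i_id, i_im_id, i_name, i_price, i_data.
// The values are deterministic filler, not spec-compliant random data.
func itemCSV(n int) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	for id := 1; id <= n; id++ {
		row := []string{
			strconv.Itoa(id),                                   // i_id
			strconv.Itoa(id%10000 + 1),                         // i_im_id
			fmt.Sprintf("item-%06d", id),                       // i_name
			strconv.FormatFloat(1+float64(id%100), 'f', 2, 64), // i_price
			"placeholder-data",                                 // i_data
		}
		if err := w.Write(row); err != nil {
			return "", err
		}
	}
	w.Flush() // push buffered rows into buf before reading it
	return buf.String(), w.Error()
}

func main() {
	out, err := itemCSV(3)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

A real generator would write to a file, or to storage the cluster nodes can read, instead of returning a string, and would follow the spec's rules for random values.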

arruw commented 6 years ago

@jordanlewis I think this IMPORT command is CRDB-specific, right? It won't work with a standard Postgres database. It would be nice to have one tool to compare Postgres, Citus, CRDB...

Do you know what is going on with this error?

...
TPCCLoadOrder       3000          85274279.7 ns/op
Loaded warehouse in 3557.2s
Created 10 indexes

Created 20 indexes

panic: Couldn't exec create index on stock(s_i_id): driver: bad connection

goroutine 1 [running]:
main.loadSchema(0xc42008a140, 0xc420000100)
        /home/matjazmav/go/src/github.com/cockroachdb/loadgen/tpcc/ddls.go:265 +0x564
main.main()
        /home/matjazmav/go/src/github.com/cockroachdb/loadgen/tpcc/main.go:183 +0x758

jordanlewis commented 6 years ago

The IMPORT is CRDB specific, yes, but TPCC isn't designed to test the speed of loading the data. It's designed to test the throughput of the TPCC transactions on a loaded dataset.

Do the server logs say anything? Looks like the server was turned off or the connection otherwise removed.

arruw commented 6 years ago

@jordanlewis I'm trying to build bin/workload with builder.sh, but afterwards my ./bin directory is still empty.

~/go/src/github.com/cockroachdb/cockroach$ ./build/builder.sh make bin/workload
GOPATH set to /go
Running make with -j4
github.com/cockroachdb/cockroach/vendor/github.com/tylertreat/hdrhistogram-writer
github.com/cockroachdb/cockroach/pkg/ccl/workloadccl
github.com/cockroachdb/cockroach/pkg/ccl/workloadccl/roachmartccl
github.com/cockroachdb/cockroach/pkg/workload/bank
github.com/cockroachdb/cockroach/pkg/workload/jsonload
github.com/cockroachdb/cockroach/pkg/workload/kv
github.com/cockroachdb/cockroach/pkg/workload/singlequery
github.com/cockroachdb/cockroach/pkg/workload/tpcc
github.com/cockroachdb/cockroach/pkg/workload/tpch
github.com/cockroachdb/cockroach/pkg/workload/ycsb
github.com/cockroachdb/cockroach/pkg/ccl/workloadccl/allccl
github.com/cockroachdb/cockroach/pkg/cmd/workload

~/go/src/github.com/cockroachdb/cockroach$ ls -al ./bin
total 8
drwxr-xr-x  2 root      root      4096 Apr 14 10:25 .
drwxrwxr-x 18 matjazmav matjazmav 4096 Apr 14 10:33 ..

What am I doing wrong?

jordanlewis commented 6 years ago

@matjazmav the trouble is that the builder image builds for a different architecture than your computer, so it probably appears in a directory that's bin-prefixed, like bin-<arch> or something like that. I forget the details.

If you build with the builder, by the way, you'll likely have to run workload with the builder as well.

arruw commented 6 years ago

@jordanlewis I found it in bin.docker_amd64 directory, thanks.

How can I build it for my computer's architecture? I did it like this: ./build/builder.sh build bin/workload TYPE=release-linux-gnu. And how can I run it with the builder?

I went through the issues and found that you are running nightly benchmarks. Are those publicly available, and where can I find them?

jordanlewis commented 6 years ago

@matjazmav to build it for your computer architecture, it's easiest to just not use the builder image, which is designed to be a containerized way to build a linux image. So you can just do make bin/workload on your computer after installing required dependencies.