Which host selection policy are you using (the host-selection-policy option)? Shard-awareness should work only with the default option, which is token-aware. In that case, the gocql driver should open exactly one connection for each shard and will ignore the connection-count parameter. See here for more details.
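For context, a minimal sketch of what a token-aware setup looks like with the gocql API. The host address is taken from the command below; this is illustrative, not scylla-bench's exact code:

package main

import "github.com/gocql/gocql"

func main() {
	// The import path stays github.com/gocql/gocql; the scylladb/gocql fork
	// is substituted via a replace directive in go.mod.
	cluster := gocql.NewCluster("172.31.10.9")
	// Token-aware routing wrapped around a round-robin fallback. With the
	// shard-aware fork, this lets the driver open one connection per shard.
	cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(gocql.RoundRobinHostPolicy())
	session, err := cluster.CreateSession()
	if err != nil {
		panic(err)
	}
	defer session.Close()
}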
I'm using the default.
The number of connections is not correctly determined with the default settings, nor with an explicit -host-selection-policy=token-aware:
go/bin/scylla-bench -workload uniform -mode read -partition-count 10000000 -nodes 172.31.10.9 -concurrency 1600 -duration 10m -clustering-row-count 1 -host-selection-policy=token-aware 2>&1 | tee -a scylla-bench-16-08-2021_11-21-07.log
The number of shards is 92, but I only see connections for the first 23 shards. So in total there are 23 connections, even though there should be 92.
I'm using the scylla-bench master branch.
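On the server side, one way to see how client connections are spread across shards is Scylla's system.clients virtual table (assuming a Scylla version that exposes it; shard_id is the relevant column):

cqlsh> SELECT address, port, shard_id FROM system.clients;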
How did you install scylla-bench? Like this?
go install github.com/scylladb/scylla-bench
I just checked, and this version is indeed not shard-aware. However, if you build scylla-bench from source, it should be shard-aware:
git clone https://github.com/scylladb/scylla-bench
cd scylla-bench/
go install .
Please check if the version built from source works better.
I'm using the first approach.
When using '-host-selection-policy=token-aware', the connection-count setting is not ignored, by the way. I set it to 92, and every shard has multiple connections (it seems 3 or 4 per shard).
I'll try out the other approach now.
I'm not exactly sure what is happening here, but I suspect that go install builds scylla-bench using the original gocql/gocql driver, not the scylladb/gocql fork. Only the scylladb/gocql fork supports shard-awareness; gocql/gocql does not.
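As a side note, a quick way to check which driver an installed scylla-bench binary was actually built against is the module information Go embeds into binaries (the binary path below is an assumption):

go version -m ~/go/bin/scylla-bench | grep gocql

If the fork is in use, the gocql line should show a replacement pointing at github.com/scylladb/gocql.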
It seems like a bug :) Let me verify first with the custom build approach if shard aware works properly.
I see there are just a few cross shard operations. So it seems to be shard aware.
I did some digging and it seems that my theory about go install not using our fork is correct. We are using a replace directive to substitute our fork in place of the upstream (which is a recommended practice according to our fork's readme). However, commands like go get and go install do not honor the replace directive - this is expected behavior: https://github.com/golang/go/issues/30354 - which resulted in you accidentally using the non-shard-aware driver.
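For reference, the kind of replace directive in question looks roughly like this in scylla-bench's go.mod (the version string here is illustrative, not the actual pinned version):

replace github.com/gocql/gocql => github.com/scylladb/gocql v1.4.3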
I don't think this issue should be classified as a bug in scylla-bench, because the problem was caused by the combination of the go tool's idiosyncrasies and the way our fork is supposed to be substituted. However, the proper way to install scylla-bench should be documented in the readme - right now, the recommended way is go get, which is not what we want! I'll send a PR with improved instructions.
If you can update the documentation for installing, then I'm fine with closing this issue.
I was wondering why the throughput numbers were so low compared to cassandra-stress (from Scylla) and why the number of connections was so much lower. So when installed via plain go install, scylla-bench isn't shard-aware: it doesn't open a connection to every shard and route traffic to the right shard.