scylladb / gocql

Package gocql implements a fast and robust ScyllaDB client for the Go programming language.
https://docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-go-driver.html
BSD 3-Clause "New" or "Revised" License
181 stars 57 forks source link

Extends gocql API for scylla shard aware info #164

Closed moguchev closed 3 months ago

moguchev commented 6 months ago

In our case, we encountered a problem when we need to insert and read more than 500 different keys (from different partitions) at a time in scyllaDB in one user request.

The insertion (BATCH request) still worked well, but reading queries (SELECT WHERE IN) left much to be desired.

We needed to extend the gocql API to be able to split keys/queries across hosts (or shards) on the side of our application and send queries to scyllaDB. This allowed us to reduce response time.

It might be useful to give this functionality. Well, or you can suggest a more elegant implementation than what I wrote :)

sylwiaszunejko commented 5 months ago

@moguchev Please squash your commits into set of independent changes. Correction commit affecting files changed in other PR commit should be squashed.

moguchev commented 5 months ago

@moguchev Please squash your commits into set of independent changes. Correction commit affecting files changed in other PR commit should be squashed.

done

sylwiaszunejko commented 5 months ago

@avelanarius Could you please review this PR to ensure everything is correct?

moguchev commented 5 months ago

I've changed go:build tags in unit tests and in integration tests. Also I don't know why ./integration.sh tablet fails. My function should not affect on it.

moguchev commented 5 months ago

Also it's better to use ShardAwareRoutingInfo with enhancement from #165 because HostName will have a lot of calls in app.

sylwiaszunejko commented 4 months ago

@avelanarius Could you please review this PR to ensure everything is correct?

@avelanarius ping

sylwiaszunejko commented 3 months ago

@dkropachev Could you look at this PR?

dkropachev commented 3 months ago

@moguchev , thanks for contribution, I see you put lots of effort into it.

@sylwiaszunejko, I think we should not merge it as it is, functionally it is correct, except not covering tablets feature. But there are major concerns:

  1. It breaks single responsibility principle in regarding of what holds/generates routing information, currently it is tokenAwareHostPolicy but this PR makes Session do it as well.
  2. Which leads to bad maintainability, for example if tablet feature logic have to be replicated in GetShardAwareRoutingInfo
  3. This code is also does lot's of assumptions regarding configuration, for instance partitioner, HostSelectionPolicy and ConnPicker

Overall case is very valueable, not only for this particular driver. Looking at the code, i beileve that we should extract and expose code that calculates routing information as it is done here, but on HostSelectionPolicy level and have an API for it on Session level, also to provide nice API to solve problem that is outlined in comments of GetShardAwareRoutingInfo, this API probably to be exposed on Session level as well

Update

Just FYI: rust-driver related issues: https://github.com/scylladb/scylla-rust-driver/issues/468, https://github.com/scylladb/scylla-rust-driver/pull/975, https://github.com/scylladb/scylla-rust-driver/pull/944

moguchev commented 3 months ago

Thanks! I hope you'll add this in API

dkropachev commented 3 months ago

Thanks! I hope you'll add this in API

Thanks for you contribution, it give us very good clue, I have created an issue based on this PR, we would be happy to see your recommendations/comments there.