apache / shardingsphere

Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
Apache License 2.0

Properties of VolumeBasedRangeShardingAlgorithm (VOLUME_RANGE) are not clear #20058

Open TeslaCN opened 2 years ago

TeslaCN commented 2 years ago

Question

My scenario is range sharding.

IDs are distributed in [1, 32].

Range sharding into 4 data sources:

- ds_0: [1, 8]
- ds_1: [9, 16]
- ds_2: [17, 24]
- ds_3: [25, 32]

I thought the configuration was:

    shardingAlgorithms:
      warehouse:
        type: VOLUME_RANGE
        props:
          range-lower: 1
          range-upper: 33
          sharding-volume: 8

But the configuration that actually produces these 4 data sources is the following:

    shardingAlgorithms:
      warehouse:
        type: VOLUME_RANGE
        props:
          range-lower: 9
          range-upper: 25
          sharding-volume: 8

The meaning of these properties, range-lower and range-upper in particular, may confuse users.

github-actions[bot] commented 1 year ago

Hello, this issue has not received a reply for several days. This issue is supposed to be closed.

TeslaCN commented 1 year ago

Since this issue has been inactive for a while, I'm closing it.

strongduanmu commented 1 year ago

@TeslaCN Thank you for your feedback. I will check this logic.

strongduanmu commented 1 year ago

Hi @TeslaCN, I took a look at the implementation of the VOLUME_RANGE algorithm. It is an auto sharding algorithm, and it is recommended for use in the auto sharding configuration, where users do not need to care about the underlying actual tables.

In addition, because a range sharding algorithm can receive values outside the configured boundary, the algorithm adds two extra shards to store them: one for values below range-lower and one for values at or above range-upper. If we use this algorithm as a standard sharding algorithm, it is indeed confusing, and I will think about how to optimize it.
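The boundary behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the ShardingSphere source: the function names `volume_range_shard` and `shard_count` are hypothetical, and the half-open interval semantics (lower bound inclusive, upper bound exclusive) are an assumption chosen because they reproduce the mapping reported earlier in this issue.

```python
def regular_shards(range_lower, range_upper, sharding_volume):
    """Number of shards between the bounds (ceiling division)."""
    return -(-(range_upper - range_lower) // sharding_volume)


def shard_count(range_lower, range_upper, sharding_volume):
    """Total shards: the regular ones plus two overflow shards (assumed model)."""
    return regular_shards(range_lower, range_upper, sharding_volume) + 2


def volume_range_shard(value, range_lower, range_upper, sharding_volume):
    """Return the shard index for value under the assumed semantics.

    Shard 0 holds values below range_lower, the last shard holds values
    at or above range_upper, and the shards in between each cover
    sharding_volume consecutive values.
    """
    if value < range_lower:
        return 0
    if value >= range_upper:
        return regular_shards(range_lower, range_upper, sharding_volume) + 1
    return 1 + (value - range_lower) // sharding_volume


if __name__ == "__main__":
    # Working configuration from the issue: 4 shards in total.
    print(shard_count(9, 25, 8))
    # First guess from the issue: 6 shards, two of them never hit by IDs in [1, 32].
    print(shard_count(1, 33, 8))
```

Under this model, range-lower: 9 and range-upper: 25 yield exactly four shards, with IDs 1 to 8 landing in the lower overflow shard and IDs 25 to 32 in the upper one, while range-lower: 1 and range-upper: 33 yield six shards, which would explain why the first configuration did not match four data sources.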

TeslaCN commented 1 year ago

Thank you @strongduanmu