pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.23k stars 5.84k forks source link

Disk-based `distinct` operation #14395

Open XuHuaiyu opened 4 years ago

XuHuaiyu commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe:

For SQL like "select count(distinct a) from t", TiDB uses an in-memory IntSet/ DecimalSet/ StringSet/ ... to check whether a value is distinct. The underlying implementation of XSet is a go map actually, we may need a disk-based map to avoid OOM. After supporting the disk-based map, the XSet should be spilled over to disk when mem-quota-query in tidb.toml is exceeded.

Describe the feature you'd like:

Describe alternatives you've considered:

Teachability, Documentation, Adoption, Migration Strategy:

SunRunAway commented 4 years ago

We should consider not using disk-base map because it has large cost during ramdom disk IO. Consider a method like grace hash join?

jianzhiyao commented 3 years ago

/assign

jianzhiyao commented 3 years ago

/assign

jianzhiyao commented 3 years ago

@XuHuaiyu Do you think LSM-Tree is a good idea for building disk-based map?

SunRunAway commented 3 years ago

@XuHuaiyu Do you think LSM-Tree is a good idea for building disk-based map?

I doubt no. the performance of any disk-based map is horrible because of random IO. We'd better consider another method.

jianzhiyao commented 3 years ago

@XuHuaiyu Do you think LSM-Tree is a good idea for building disk-based map?

I doubt no. the performance of any disk-based map is horrible because of random IO. We'd better consider another method.

Level DB is a key-value database based on LSM-tree. LSM-tree can turn random I/O into sequential I/O.

XuHuaiyu commented 3 years ago

@jianzhiyao @wshwsh12 has an idea, can you describe it here?

jianzhiyao commented 3 years ago

I have talked about this with @wshwsh12 ,but I think LSM-tree is a another good solution for this, because even rocksdb base on it.

XuHuaiyu commented 3 years ago

@jianzhiyao If we use an LSM-tree to store the hash table, we'll have to take a random I/O operation each time when we want to fetch a key-value pair from the hashtable. I doubt that the performance will be bad.

jianzhiyao commented 3 years ago

@jianzhiyao If we use an LSM-tree to store the hash table, we'll have to take a random I/O operation each time when we want to fetch a key-value pair from the hashtable. I doubt that the performance will be bad.

first, lsm-tree is used widely in many product enviroment. and for that issue we can only store the key in lsm-tree to improve performance. After all, we can make a benchmark testing for random keys withlsm-tree.

SunRunAway commented 3 years ago

using LSM tree to simulate temporary hash table is horrible for read, each read will come to random io or may scan the whole table.

jianzhiyao commented 3 years ago

using LSM tree to simulate temporary hash table is horrible for read, each read will come to random io or may scan the whole table.

Got it.