pingcap / tidb

TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://pingcap.com
Apache License 2.0
37.04k stars 5.82k forks source link

Support some analytical functions #19405

Open leiysky opened 4 years ago

leiysky commented 4 years ago

Feature Request

Is your feature request related to a problem? Please describe: In some analytic scenes, we cannot write SQL conveniently with funtions currently supported by TiDB.

For example, if you want to get a n-quantile, you have to write a SQL as follows, where x represents count_of_records / n:

SELECT value FROM t 
ORDER BY value
LIMIT x, 1 

While in Clickhouse, a function quantileExact is enough:

SELECT quantileExact(1/n)(value) FROM t

Describe the feature you'd like: Support following functions:

It had better make the functions pushed to TiFlash.

Describe alternatives you've considered: None.

Teachability, Documentation, Adoption, Migration Strategy: None.

zz-jason commented 4 years ago

The feature request looks good to me. I'm thinking about if there any other possible names for these functions. Do other databases have the same functions, what's their function name for these functions?

leiysky commented 4 years ago

In Spark SQL there is a approx_percentile function(https://spark.apache.org/docs/latest/api/sql/index.html#approx_percentile) to calculate quantile value.

There is a function ROUND(date) in Oracle(https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions136.htm), which is more generic but complicated and may break current syntax rules.

@zz-jason FYI.

zz-jason commented 4 years ago

Since we have supported APPROX_COUNT_DISTINCT(), how about using APPROX_PERCENTILE() to implement quantileExact() function in Clickhouse?

leiysky commented 4 years ago

@zz-jason That's fine.