Open hidehalo opened 3 years ago
@kennytm Shall we label it type/bug
?
I don't see any point disabling support of DISTINCT. We provide the correct result with DISTINCT, compatible with PostgreSQL.
create table a (a int);
insert into a values (1), (2), (2), (3), (3), (3);
select var_pop(all a), var_pop(distinct a), var_pop(a) from a;
-- both PostgreSQL and TiDB return 0.5555…, 0.6666…, 0.5555…
The additional DISTINCT keyword, although prohibited by ISO 9075*, is well supported in PostgreSQL (and CockroachDB), BigQuery, Snowflake and Microsoft Transact-SQL and IBM DB2. Meanwhile, MySQL (and MariaDB), Oracle, SingleStore (formerly MemSQL) followed ISO 9075 and disallowed DISTINCT. SQLite3 does not have the standard deviation/variance aggregate functions.
ISO/IEC 9075-2:201? §4.16.4: Neither DISTINCT nor ALL are allowed to be specified for VAR_POP, VAR_SAMP, STDDEV_POP, or STDDEV_SAMP; redundant duplicates are not removed when computing these functions.
Edit: Oracle's non-standard VARIANCE and STDDEV functions do allow DISTINCT, but the standard ones with _POP and _SAMP suffixes don't :shrug:.
I don't see any point disabling support of DISTINCT. We provide the correct result with DISTINCT, compatible with PostgreSQL.
create table a (a int); insert into a values (1), (2), (2), (3), (3), (3); select var_pop(all a), var_pop(distinct a), var_pop(a) from a; -- both PostgreSQL and TiDB return 0.5555…, 0.6666…, 0.5555…
The additional DISTINCT keyword, although prohibited by ISO 9075*, is well supported in PostgreSQL (and CockroachDB), BigQuery, Snowflake and Microsoft Transact-SQL and IBM DB2. Meanwhile, MySQL (and MariaDB), ~Oracle~, SingleStore (formerly MemSQL) followed ISO 9075 and disallowed DISTINCT. SQLite3 does not have the standard deviation/variance aggregate functions.
ISO/IEC 9075-2:201? §4.16.4: Neither DISTINCT nor ALL are allowed to be specified for VAR_POP, VAR_SAMP, STDDEV_POP, or STDDEV_SAMP; redundant duplicates are not removed when computing these functions.
Edit: Oracle's non-standard VARIANCE and STDDEV functions do allow DISTINCT, but the standard ones with _POP and _SAMP suffixes don't 🤷.
@kennytm Yes, I have notice that TiDB provide the correct result with DISTINCT
.
I am not familiar with those DB&standards, this issue come from https://github.com/pingcap/tidb/issues/19499, I think maybe it is not necessary to TiDB.
This change is about function support, I am not suitable for making decisions, maybe we should close those relative issues?
Bug Report
On MySQL 8, aggregate functions below do not support distinct options:
(ref https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html)
E.g:
Executed result of SQL
select std(distinct c) from t
in MySQL 8 would get error feedback like:but, it would execute successfully in TiDB.