cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.12k stars 3.81k forks source link

sql: automatic partial stats collection does not work as expected #134460

Open mgartner opened 6 days ago

mgartner commented 6 days ago

In the example below, I expect the second INSERT to trigger a partial stats collection, but it does not.

SET CLUSTER SETTING sql.stats.automatic_partial_collection.enabled = true;
SET CLUSTER SETTING sql.stats.automatic_partial_collection.min_stale_rows = 5;
SET CLUSTER SETTING sql.stats.automatic_collection.min_stale_rows = 100000000000;

CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT, 
  INDEX (a)
)
WITH (sql_stats_automatic_partial_collection_enabled = true);

INSERT INTO t
SELECT i, i FROM generate_series(1, 1000) AS g(i);

CREATE STATISTICS __auto__ FROM t;

SHOW STATISTICS FOR TABLE t;
--   statistics_name | column_names |           created            | row_count | distinct_count | null_count | avg_size | partial_predicate |    histogram_id     | full_histogram_id
-- ------------------+--------------+------------------------------+-----------+----------------+------------+----------+-------------------+---------------------+--------------------
--   __auto__        | {k}          | 2024-11-06 19:12:08.96167+00 |      1000 |           1000 |          0 |        3 | NULL              | 1018593047228645377 |                 0
--   __auto__        | {a}          | 2024-11-06 19:12:08.96167+00 |      1000 |           1000 |          0 |        3 | NULL              | 1018593047233822721 |                 0

SELECT pg_sleep(5);

INSERT INTO t
SELECT i, i FROM generate_series(1001, 2000) AS g(i);

SELECT pg_sleep(5);

SHOW STATISTICS FOR TABLE t;
--   statistics_name | column_names |           created            | row_count | distinct_count | null_count | avg_size | partial_predicate |    histogram_id     | full_histogram_id
-- ------------------+--------------+------------------------------+-----------+----------------+------------+----------+-------------------+---------------------+--------------------
--   __auto__        | {k}          | 2024-11-06 19:12:08.96167+00 |      1000 |           1000 |          0 |        3 | NULL              | 1018593047228645377 |                 0
--   __auto__        | {a}          | 2024-11-06 19:12:08.96167+00 |      1000 |           1000 |          0 |        3 | NULL              | 1018593047233822721 |                 0
-- (2 rows)

Jira issue: CRDB-44089

blathers-crl[bot] commented 6 days ago

Hi @mgartner, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.