cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

colexec: emit batches with large rows early #109309

Open cockroach-teamcity opened 1 year ago

cockroach-teamcity commented 1 year ago

Fix the vectorized engine to emit batches eagerly in case they have wide rows:

Some reproductions:

```sql
CREATE TABLE l (l_k INT, l_v STRING, INDEX (l_k) STORING (l_v));
CREATE TABLE r (r_k INT, r_v STRING, INDEX (r_k) STORING (r_v));
INSERT INTO l SELECT i // 100, repeat('l', 60000) FROM generate_series(0, 899) AS g(i);
INSERT INTO r SELECT i // 10, repeat('r', 60000) FROM generate_series(0, 99) AS g(i);
-- sizes in the next two queries can be tweaked for any target memory usage
INSERT INTO l SELECT 100000, repeat('l', 1000000) FROM generate_series(0, 99) AS g(i);
INSERT INTO r SELECT 100000, repeat('l', 1000000) FROM generate_series(0, 19) AS g(i);
ANALYZE l;
ANALYZE r;
EXPLAIN ANALYZE (VERBOSE) SELECT length(l_v), length(r_v) FROM l INNER MERGE JOIN r ON l_k = r_k;
```

```sql
CREATE TABLE t (s STRING);
INSERT INTO t SELECT repeat('a', 50) FROM generate_series(1, 1000);
ANALYZE t;
EXPLAIN ANALYZE SELECT concat_agg(t1.s ORDER BY t1.s DESC) FROM t AS t1, t AS t2;
```

Jira issue: CRDB-30864

yuzefovich commented 1 year ago

Node 1 OOMed:


fatal error: runtime: out of memory

goroutine 4294 [running]:
runtime.systemstack_switch()
    GOROOT/src/runtime/asm_amd64.s:459 fp=0xc004d271a8 sp=0xc004d271a0 pc=0x4cb9a0
runtime.(*mheap).alloc(0x57633e000?, 0x2bb19f?, 0x4?)
    GOROOT/src/runtime/mheap.go:912 +0x65 fp=0xc004d271f0 sp=0xc004d271a8 pc=0x488ee5
runtime.(*mcache).allocLarge(0xc001a9cd00?, 0x57633d400, 0x1)
    GOROOT/src/runtime/mcache.go:233 +0x85 fp=0xc004d27240 sp=0xc004d271f0 pc=0x4762c5
runtime.mallocgc(0x57633d400, 0x5663460, 0x1)
    GOROOT/src/runtime/malloc.go:1029 +0x57e fp=0xc004d272b8 sp=0xc004d27240 pc=0x46c51e
runtime.makeslice(0x469401?, 0xc00bd43600?, 0xc004d27340?)
    GOROOT/src/runtime/slice.go:103 +0x52 fp=0xc004d272e0 sp=0xc004d272b8 pc=0x4b1092
github.com/cockroachdb/cockroach/pkg/col/coldata.(*element).setNonInlined(0x60023e0?, {0xc06d06a000, 0x15d8cf5, 0x1?}, 0xc002e845c0)
    github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:152 +0x10c fp=0xc004d27350 sp=0xc004d272e0 pc=0x10082ac
github.com/cockroachdb/cockroach/pkg/col/coldata.(*element).copy(0xc0074c5a40?, {{0x0, 0x15d8cf5, 0x15d8cf5}, {0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, ...}, ...)
    github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:125 +0x6e fp=0xc004d27388 sp=0xc004d27350 pc=0x100812e
github.com/cockroachdb/cockroach/pkg/col/coldata.(*Bytes).Copy(...)
    github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:236
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoinerBase).buildFromLeftInput.func1()
    github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecjoin/crossjoiner.eg.go:850 +0x1bad fp=0xc004d27b08 sp=0xc004d27388 pc=0x2a4b26d
github.com/cockroachdb/cockroach/pkg/sql/colmem.(*Allocator).PerformOperation(0xc004d27b78?, {0xc008519500, 0x50, 0xc004cb1cc0?}, 0xc004d27b98?)
    github.com/cockroachdb/cockroach/pkg/sql/colmem/allocator.go:460 +0x49 fp=0xc004d27b60 sp=0xc004d27b08 pc=0x23d3229
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoinerBase).buildFromLeftInput(0xc006d5f1e0, {0xc002f24000?, 0x4fc56a?}, 0x0)
    github.com/cockroachdb/cockroach/bazel-out/k8-opt/bin/pkg/sql/colexec/colexecjoin/crossjoiner.eg.go:47 +0xc5 fp=0xc004d27bd0 sp=0xc004d27b60 pc=0x2a49665
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoiner).Next(0xc000c1c480)
    github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin/crossjoiner.go:118 +0x1c9 fp=0xc004d27c20 sp=0xc004d27bd0 pc=0x2a41ae9
...

msirek commented 1 year ago

On a local 3-node cluster on my macbook, I get an error instead of an OOM:

> UPDATE
    defaultdb.public.seed AS "tab%q2381"
SET
    _int8 = "tab%q2381"._int4
FROM
    defaultdb.public.seed AS tab2382,
    defaultdb.public.seed AS tab2383,
    defaultdb.public.seed AS tab2384,
    defaultdb.public.seed AS "t""ab\\U000E86972385"
WHERE
    true;
ERROR: cross-joiner-8-unlimited-11: memory budget exceeded: 11707112960 bytes requested, 95646205 currently allocated, 8589956096 bytes in budget
SQLSTATE: 53200

msirek commented 1 year ago

There are 2 issues here:

  1. Why is the cross join OOMing instead of hitting a memory budget?
  2. We should be testing more join combinations of UPDATE ... FROM than just cross joins or multi-table joins with a single predicate. Right now there's a 50-50 chance of picking all cross joins vs. generating a WHERE clause, and the WHERE clauses that are generated aren't likely to produce useful join predicates.

msirek commented 1 year ago

On a gceworker this times out:

> UPDATE
    defaultdb.public.seed AS "tab%q2381"
SET
    _int8 = "tab%q2381"._int4
FROM
    defaultdb.public.seed AS tab2382,
    defaultdb.public.seed AS tab2383,
    defaultdb.public.seed AS tab2384,
    defaultdb.public.seed AS "t""ab\\U000E86972385"
WHERE
    true;
ERROR: query execution canceled due to statement timeout
SQLSTATE: 57014

If a statement timeout is not set, the statement completes in 1300 seconds on a gceworker. Maybe a system with more CPU horsepower could allocate memory more quickly and hit the OOM. Not sure how the nightly teamcity test systems are configured.

msirek commented 1 year ago

For this issue, try running this on a cluster configured the same way as the nightly roachtests, which may be 4 nodes, 4 CPUs, 16 GiB (verify this).

msirek commented 1 year ago

OK, I was able to reproduce this, but only if the cluster nodes are created without a local SSD:

roachprod create $CLUSTER -n 4 --clouds=gce --gce-machine-type=n1-standard-4 --local-ssd=false
roachprod ssh $CLUSTER sudo apt-get install libresolv-wrapper
roachprod put $CLUSTER cockroach
roachprod start $CLUSTER

Setup and test script:

script.txt

msirek commented 1 year ago

I tried setting:

SET distsql_workmem = '1MiB';

for this test case. This keeps the OOM from happening. I'm curious why the default setting of 64MiB is not preventing the OOM. Maybe cross join is not accurately tracking memory usage.

msirek commented 1 year ago

Allocated space:

[Image: allocated space]

DrewKimball commented 1 year ago

The crossjoiner doesn't have a way to stop filling in the batch partway through if the rows are larger than expected. Decreasing the limit probably just decreases the number of rows allocated in the output batch enough to keep the memory usage smaller. We do have places where we check the memory usage each time a row is set for a batch (see SetAccountingHelper usages), so the action item here might be to do that for the crossjoiner as well.
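The per-row check described above can be sketched in Go as follows. This is a minimal illustration, not the crossjoiner or `SetAccountingHelper` code: `fillBatch`, `maxBatchRows`, and the budget parameter are hypothetical names introduced here, and the real operator works on columnar batches rather than a slice of rows.

```go
package main

import "fmt"

// maxBatchRows is a hypothetical constant mirroring the 1024-row batch
// capacity mentioned later in this thread.
const maxBatchRows = 1024

// fillBatch is a hypothetical stand-in for an operator's output loop. It
// copies rows into the output batch, updates the running footprint after
// every row, and stops as soon as the budget is reached or the batch is full.
// The row that reaches the budget is still included, so each call makes
// progress even when a single row is larger than the whole budget.
func fillBatch(rows [][]byte, budgetBytes int) (batch [][]byte, usedBytes int) {
	for _, row := range rows {
		if len(batch) == maxBatchRows {
			break
		}
		// Copy the row, as an operator building its own output would.
		batch = append(batch, append([]byte(nil), row...))
		usedBytes += len(row)
		if usedBytes >= budgetBytes {
			break // emit the batch early instead of filling all slots
		}
	}
	return batch, usedBytes
}

func main() {
	// 100 rows of 1 KiB each against a 64 KiB budget.
	rows := make([][]byte, 100)
	for i := range rows {
		rows[i] = make([]byte, 1<<10)
	}
	batch, used := fillBatch(rows, 64<<10)
	fmt.Printf("emitted %d rows using %d bytes\n", len(batch), used)
}
```

With these inputs the loop stops after 64 rows rather than copying all 100, which is the behavior the issue title asks for: the batch is emitted as soon as its footprint reaches the limit.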

msirek commented 1 year ago

Stack trace of OOM on new build:

goroutine 14785 [running]:
runtime.systemstack_switch()
        src/runtime/asm_amd64.s:463 fp=0xc0023153e8 sp=0xc0023153e0 pc=0x4bf720
runtime.(*mheap).alloc(0x5758d8000?, 0x2bac6c?, 0x12?)
        GOROOT/src/runtime/mheap.go:955 +0x65 fp=0xc002315430 sp=0xc0023153e8 pc=0x47a265
runtime.(*mcache).allocLarge(0x6004431219600443?, 0x5758d6c00, 0x1)
        GOROOT/src/runtime/mcache.go:234 +0x85 fp=0xc002315478 sp=0xc002315430 pc=0x468085
runtime.mallocgc(0x5758d6c00, 0x577bd20, 0x1)
        GOROOT/src/runtime/malloc.go:1053 +0x4fe fp=0xc0023154e0 sp=0xc002315478 pc=0x45ed9e
runtime.makeslice(0x1960044312196004?, 0x443121960044312?, 0x1219600443121960?)
        GOROOT/src/runtime/slice.go:103 +0x52 fp=0xc002315508 sp=0xc0023154e0 pc=0x4a26f2
github.com/cockroachdb/cockroach/pkg/col/coldata.(*element).setNonInlined(0x0?, {0xc02a6a8000, 0x15d635b, 0x0?}, 0xc0020f2880)
        github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:152 +0x10c fp=0xc002315578 sp=0xc002315508 pc=0x101e28c
github.com/cockroachdb/cockroach/pkg/col/coldata.(*element).copy(0xc00354e000?, {{0x0, 0x15d635b, 0x15d635b}, {0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, ...}, ...)
        github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:125 +0x6e fp=0xc0023155b0 sp=0xc002315578 pc=0x101e10e
github.com/cockroachdb/cockroach/pkg/col/coldata.(*Bytes).Copy(...)
        github.com/cockroachdb/cockroach/pkg/col/coldata/bytes.go:236
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoinerBase).buildFromLeftInput.func1()
        github.com/cockroachdb/cockroach/bazel-out/k8-fastbuild/bin/pkg/sql/colexec/colexecjoin/crossjoiner.eg.go:850 +0x1d6d fp=0xc002315db8 sp=0xc0023155b0 pc=0x2aab22d
github.com/cockroachdb/cockroach/pkg/sql/colmem.(*Allocator).PerformOperation(0xc002315e20?, {0xc003c2c000, 0x50, 0x2aa1f8b?}, 0xc002315e40)
        github.com/cockroachdb/cockroach/pkg/sql/colmem/allocator.go:460 +0x8f fp=0xc002315e08 sp=0xc002315db8 pc=0x24164af
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoinerBase).buildFromLeftInput(0xc004d10b40, {0xc004ad3c00?, 0x62?}, 0x0)
        github.com/cockroachdb/cockroach/bazel-out/k8-fastbuild/bin/pkg/sql/colexec/colexecjoin/crossjoiner.eg.go:47 +0xc5 fp=0xc002315e78 sp=0xc002315e08 pc=0x2aa9465
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin.(*crossJoiner).Next(0xc004997140)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecjoin/crossjoiner.go:118 +0x1d1 fp=0xc002315ec8 sp=0xc002315e78 pc=0x2aa1871
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next(...)
        github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:118
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next-fm()
        <autogenerated>:1 +0x33 fp=0xc002315ee8 sp=0xc002315ec8 pc=0x32f09f3
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0xc004b02b80?)
        github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:92 +0x62 fp=0xc002315f28 sp=0xc002315ee8 pc=0xf59042
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next(0xc004bea300)
        github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:126 +0x52 fp=0xc002315f70 sp=0xc002315f28 pc=0x32dfa72
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedStatsCollectorImpl).Next(0xc004ae9680?)
        <autogenerated>:1 +0x25 fp=0xc002315f88 sp=0xc002315f70 pc=0x32eb625
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*UnorderedDistinct).Next(0xc00329aa00)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/unordered_distinct.go:100 +0x35 fp=0xc002315fb8 sp=0xc002315f88 pc=0x2a1a0f5
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next.func1()
        github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk/disk_spiller.go:202 +0x2f fp=0xc002315fd8 sp=0xc002315fb8 pc=0x2c8baaf
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0xc00293c048?)
        github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:92 +0x62 fp=0xc002316018 sp=0xc002315fd8 pc=0xf59042
github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk.(*diskSpillerBase).Next(0xc004a295f0)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecdisk/disk_spiller.go:200 +0x72 fp=0xc0023160a8 sp=0xc002316018 pc=0x2c8b8b2
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).next(0xc00461ca80)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:247 +0x73 fp=0xc0023160d0 sp=0xc0023160a8 pc=0x2a0e893
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).nextAdapter(...)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:272
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).nextAdapter-fm()
        <autogenerated>:1 +0x2b fp=0xc0023160f0 sp=0xc0023160d0 pc=0x2aa112b
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0x2a5b065?)
        github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:92 +0x62 fp=0xc002316130 sp=0xc0023160f0 pc=0xf59042
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).Next(0xc00461ca80)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:278 +0x4c fp=0xc002316168 sp=0xc002316130 pc=0x2a0ea6c
github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*inputStatCollector).Next(0xc0062ef420)
        github.com/cockroachdb/cockroach/pkg/sql/rowexec/stats.go:69 +0x62 fp=0xc0023161d8 sp=0xc002316168 pc=0x26d9542
github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*noopProcessor).Next(0xc004d10fc0)
        github.com/cockroachdb/cockroach/pkg/sql/rowexec/noop.go:88 +0x46 fp=0xc002316210 sp=0xc0023161d8 pc=0x26c7e66
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Columnarizer).Next(0xc004a55200)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:239 +0x13c fp=0xc0023165a0 sp=0xc002316210 pc=0x2a0911c
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next(...)
        github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:118
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).next-fm()
        <autogenerated>:1 +0x33 fp=0xc0023165c0 sp=0xc0023165a0 pc=0x32f09f3
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0xc004b03100?)
        github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:92 +0x62 fp=0xc002316600 sp=0xc0023165c0 pc=0xf59042
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Next(0xc004bea500)
        github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:126 +0x52 fp=0xc002316648 sp=0xc002316600 pc=0x32dfa72
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedStatsCollectorImpl).Next(0xc0047f9b80?)
        <autogenerated>:1 +0x25 fp=0xc002316660 sp=0xc002316648 pc=0x32eb625
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).next(0xc00461cc40)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:247 +0x73 fp=0xc002316688 sp=0xc002316660 pc=0x2a0e893
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).nextAdapter(...)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:272
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).nextAdapter-fm()
        <autogenerated>:1 +0x2b fp=0xc0023166a8 sp=0xc002316688 pc=0x2aa112b
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0xc00315e500?)
        github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:92 +0x62 fp=0xc0023166e8 sp=0xc0023166a8 pc=0xf59042
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).Next(0xc00461cc40)
        github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:278 +0x4c fp=0xc002316720 sp=0xc0023166e8 pc=0x2a0ea6c
github.com/cockroachdb/cockroach/pkg/sql.(*rowSourceToPlanNode).Next(0xc004c2a000, {{0x79516b8?, 0xc0068f3cb0?}, 0xc004bba800?, 0xc00252b2d8?})
        github.com/cockroachdb/cockroach/pkg/sql/row_source_to_plan_node.go:79 +0x45 fp=0xc002316798 sp=0xc002316720 pc=0x38c0a85
github.com/cockroachdb/cockroach/pkg/sql.(*updateNode).BatchedNext(0xc004bba000, {{0x79516b8, 0xc0068f3cb0}, 0xc004bba800, 0xc00252b2d8})
...

msirek commented 1 year ago

Tested a fix for this. With the fix, the query now errors out with:

ERROR: cross-joiner-8-unlimited-11: memory budget exceeded: 2930879872 bytes requested, 3073233619 currently allocated, 3736197120 bytes in budget
SQLSTATE: 53200

cockroach-teamcity commented 1 year ago

roachtest.sqlsmith/setup=seed/setting=no-ddl failed with artifacts on master @ 76a8d1077d3d29bcf020a48972efd1f77aa892c2:

(sqlsmith.go:258).func3: error: pq: internal error: comparison overload not found (is, refcursor, unknown)
stmt:
WITH
    with953 ("coL6461", col6462)
        AS (SELECT oid(4263118396:::OID::OID)::OID AS "coL6461", e'Zlsr\x19qL\x119':::STRING AS col6462),
    with954 (col6463) AS (SELECT * FROM (VALUES (NULL), ('[}Wiilex~':::STRING:::NAME)) AS tab3206 (col6463)),
    "with 955" (col6464)
        AS (SELECT * FROM (VALUES ('':::REFCURSOR), (e'<\x1b9Mys\f':::REFCURSOR)) AS "t%qab3207" (col6464))
SELECT
    "ct""e_ref284".col6464 AS col6465
FROM
    "with 955" AS "ct""e_ref284"
GROUP BY
    "ct""e_ref284".col6464
HAVING
    every(false::BOOL)::BOOL
ORDER BY
    "ct""e_ref284".col6464 ASC NULLS LAST
LIMIT
    99:::INT8;
test artifacts and logs in: /artifacts/sqlsmith/setup=seed/setting=no-ddl/run_1

Parameters: ROACHTEST_arch=amd64 , ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0

Help

See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md)
See: [How To Investigate (internal)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
See: [Grafana](https://go.crdb.dev/roachtest-grafana/teamcity-12098424/sqlsmith-setup-seed-setting-no-ddl/1696746007191/1696746350336)

cockroach-teamcity commented 1 year ago

roachtest.sqlsmith/setup=seed/setting=no-ddl failed with artifacts on master @ 330500d0433bae42bc7c0d6842a427aab3b27f03:

(sqlsmith.go:258).func3: error: pq: internal error: comparison overload not found (is, refcursor, unknown)
stmt:
WITH
    "wi😤t h1284" (" col😽8829")
        AS (
            SELECT
                *
            FROM
                (
                    VALUES
                        (ARRAY[(-22913):::INT8,25064:::INT8,7937:::INT8,(-16897):::INT8]),
                        (ARRAY[(-18977):::INT8,1975:::INT8,12572:::INT8,(-24605):::INT8]),
                        (ARRAY[(-18389):::INT8,(-17197):::INT8,(-24321):::INT8,(-6604):::INT8]),
                        (ARRAY[(-27738):::INT8,(-31091):::INT8,29852:::INT8,31551:::INT8]),
                        (ARRAY[(-574):::INT8,11169:::INT8,26097:::INT8]),
                        (ARRAY[7428:::INT8])
                )
                    AS ẗ́ab4239 (" col😽8829")
        ),
    "w\fith1285" ("col  8830")
        AS (SELECT * FROM (VALUES (e'\x00':::REFCURSOR), (e'*\x193k8_pv(':::REFCURSOR)) AS "t a%pb4240" ("col   8830")),
    "wit😯h""1286" (col8831)
        AS (
            SELECT
                *
            FROM
                (
                    VALUES
                        (octet_length(NULL::STRING)::INT8), ((-758985217467297209):::INT8), (6563355101037440003:::INT8)
                )
                    AS "t ab4241" (col8831)
        )
SELECT
    e'Q\x06h^0R\x02':::REFCURSOR AS "{%vcol8832"
FROM
    "w\fith1285" AS cte_ref362
ORDER BY
    cte_ref362."col 8830",
    cte_ref362."col 8830" DESC NULLS FIRST,
    cte_ref362."col 8830" ASC,
    cte_ref362."col 8830" NULLS FIRST,
    cte_ref362."col 8830"
LIMIT
    98:::INT8;
test artifacts and logs in: /artifacts/sqlsmith/setup=seed/setting=no-ddl/run_1

Parameters: ROACHTEST_arch=amd64 , ROACHTEST_cloud=gce , ROACHTEST_cpu=4 , ROACHTEST_encrypted=false , ROACHTEST_ssd=0

Help

See: [roachtest README](https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md)
See: [How To Investigate (internal)](https://cockroachlabs.atlassian.net/l/c/SSSBr8c7)
See: [Grafana](https://go.crdb.dev/roachtest-grafana/teamcity-12106584/sqlsmith-setup-seed-setting-no-ddl/1696832288834/1696832852911)

yuzefovich commented 1 year ago

This only affects cases with very wide rows (on the order of tens of MBs, perhaps even single-digit MBs). We already perform memory accounting after the fact, but with wide rows (and especially when row sizes vary significantly; see https://github.com/cockroachlabs/support/issues/2625#issuecomment-1769847579 for a similar example), we might end up including up to 1024 wide rows in the output batch.
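The two stack traces earlier in this thread appear consistent with the 1024-row figure. In each, the size passed to `runtime.mallocgc` is exactly 1024 times the length of the byte slice passed to `setNonInlined`. The small Go check below verifies only that arithmetic; reading it as "one wide value sized for every row of a full batch" is an assumption, and `trace` and `matchesFullBatch` are names made up for this sketch.

```go
package main

import "fmt"

// batchRows is the maximum number of rows per batch mentioned in this thread.
const batchRows = 1024

// trace pairs two numbers copied from a stack trace in this thread: the
// length of the value being copied (second argument to setNonInlined) and
// the size requested from runtime.mallocgc.
type trace struct {
	name      string
	valueLen  int64
	allocSize int64
}

// matchesFullBatch reports whether the allocation size equals one value of
// valueLen bytes multiplied by the number of rows in a full batch.
func matchesFullBatch(t trace) bool {
	return t.valueLen*batchRows == t.allocSize
}

func main() {
	traces := []trace{
		{"first trace", 0x15d8cf5, 0x57633d400},
		{"second trace", 0x15d635b, 0x5758d6c00},
	}
	for _, t := range traces {
		fmt.Printf("%s: %d bytes x %d rows == %d bytes: %t\n",
			t.name, t.valueLen, batchRows, t.allocSize, matchesFullBatch(t))
	}
}
```

Both comparisons hold. The first one works out to a single allocation request of 23,457,944,576 bytes for a value of 22,908,149 bytes, which is more than the 16 GiB suggested earlier as the likely node size.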

Mark already started working on this over in #111668, and I think it'd take on the order of a day to push this over the finish line.

DrewKimball commented 1 year ago

See #113272 for another example.

DrewKimball commented 1 year ago

We'd like to look at this in the next month.

yuzefovich commented 1 year ago

Here is another example that we should fix:

CREATE TABLE t (s STRING);
INSERT INTO t SELECT repeat('a', 50) FROM generate_series(1, 1000);
ANALYZE t;
EXPLAIN ANALYZE SELECT concat_agg(t1.s ORDER BY t1.s DESC) FROM t AS t1, t AS t2;

yuzefovich commented 11 months ago

We plan to address this soonish, so it's intentionally P-2, but because the issue was opened a long time ago, it would show up as having breached the SLA, so I flipped it to P-3.