cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.1k stars 3.81k forks source link

server: the util/pretty code is incorrectly exposed through user-facing features #91197

Open knz opened 2 years ago

knz commented 2 years ago

Description

The util/pretty code was designed for use in one-off commands (like a command line program), not to be served over the network.

In particular:

This makes this code profoundly inadequate for inclusion in CRDB in a way that can be triggered by users through an API.

I've already identified the following places where this code is used:

Expected behavior

  1. The code in util/pretty is modified/enhanced to support memory monitors and recognize query cancellation. This is absolutely required at least for the use in prettify_statement() where the user is in control.
  2. the callers for stmt bundles / combined statements stats avoid the combination of parameters that result in super-quadratic behavior
  3. the callers also define a timeout on the length of the operation (via context cancellation)

Jira issue: CRDB-21145

knz commented 2 years ago

Additionally, this code in server/combined_statement_stats.go:

  query = fmt.Sprintf(
    `SELECT prettify_statement($1, %d, %d, %d)`,
    tree.ConsoleLineWidth, tree.PrettyNoAlign, tree.UpperCase)
  row, ⊙ = ie.QueryRowEx(⋄, "combined-stmts-details-format-query", ∅,
    sessiondata.InternalExecutorOverride{
      User: username.NodeUserName(),
    }, query, args…)

can be simplified to just a call to the Go function tree.Pretty() cc @maryliag

maryliag commented 2 years ago

@knz the example above can't use the tree.Pretty() because I don't actually have the statement in a node tree format at that point. The prior steps is retrieving the query from the database, which returns a string, so I would need to convert to NodeFormatter to be able to use that function. So to improve the usage on combined statements and statement UI, I will update the parameter of alignment from 2 to 1.

I'll leave the improvements of the function to the queries team

ajwerner commented 2 years ago

so I would need to convert to NodeFormatter to be able to use that function.

You ought to be able to use parser.ParseOne to turn a statement into a parser.Statement and AST is a tree.Statement which implements NodeFormatter.

ajwerner commented 2 years ago

Oh, I don't know what I'm talking about, prettyStatement does exactly the right thing.

https://github.com/cockroachdb/cockroach/blob/08b815bfb48f8eace90ee718f65da5def4ab6954/pkg/sql/sem/builtins/builtins.go#L10026-L10040

knz commented 2 years ago

if you know there's just 1 stmt, ParseOne is adequate

maryliag commented 2 years ago

Options:

Since I do want different parameters than the default (I want different value for the line width and I do want some alignment. and not the PrettyNoAlign), I will keep the SQL function, with the adjustments of the parameters

knz commented 2 years ago

(I want different value for the line width and I do want some alignment. and not the PrettyNoAlign

You do not need the SQL function for that. You can do this:

cfg := tree.DefaultPrettyCfg()
cfg.XX = YY
cfg.ZZ = WW
cfg.Pretty(<yourstmt>)

let's avoid the layers of indirections through SQL executor which are not needed here.

rytaft commented 1 year ago

Is there any urgent action needed from SQL Queries for this issue? Drew mentions that we may want to limit the number of rows returned

knz commented 1 year ago

i tagged sql queries to tweak the parameter used in stmt bundles. but maybe marylia already handled it?

maryliag commented 1 year ago

I changed the usages from PrettyAlignAndDeIndent to PrettyAlignOnly, since the issues describes that the bundle uses PrettyNoAlign and that it was fine, I didn't touch it

rytaft commented 1 year ago

Ok thanks! Removing SQL Queries for now, since it seems like statement bundles are ok

maryliag commented 1 year ago

From a SQL Observability perspective, the usage of the function was updated to not use the option that was causing the OOM issues.

I can see the label for sql-queries was added, so I'll remove sql-obs and leave the queries team to work in improvements to prevent the OOM occurring (by preventing users to choose the options that caused OOM, fixing the issue itself, or any other option the team sees fit).

mgartner commented 1 year ago

I think we should explore getting rid of the code that has pathological performance problems.

maryliag commented 1 year ago

This issue is still causing OOM for some users. On #99450 I removed all usages on Cluster Observability pages, but some pages still use values coming from Explain (@cockroachdb/sql-queries ) and Sessions (@cockroachdb/sql-sessions ) that use Pretty. So this issue should focus on removing those usages or improving how it works.

blathers-crl[bot] commented 1 year ago

Hi @mgartner, please add branch-* labels to identify which branch(es) this release-blocker affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

mgartner commented 1 year ago

Adding a release-blocker label because I think this should be discussed further before the next release.

rafiss commented 1 year ago

We can add a disallowed_import once we remove the usages. (And also add it some places right now.)

disallowed_imports_test(
    "tree",
    disallowed_list = [
        "//pkg/util/pretty",
    ],
    # or maybe this way?
    disallowed_prefixes = [
        "pkg/util/pretty",
    ],
)
michae2 commented 1 year ago
  • statement bundles (uses PrettyNoAlign, somewhat fine)

Even without the exponential blowup, it looks like this usage can cause stack overflows when collecting statement diagnostics bundles on queries with very large IN sets. For example, something like this can cause a stack overflow:

CREATE TABLE foo (id INT PRIMARY KEY, v STRING);
SELECT crdb_internal.request_statement_bundle('SELECT * FROM foo WHERE id IN (_, _, __more1000_plus__)', 0.0, '0', '0');
SELECT * FROM foo WHERE id IN (0, 1, 2, 3, ..., 999998, 999999);

Here's a demonstration using python:

# michae2@michae2-crl-mbp ~crdb % python <<EOF | cockroach demo
print("CREATE TABLE foo (id INT PRIMARY KEY, v STRING);")
print("SELECT crdb_internal.request_statement_bundle('SELECT * FROM foo WHERE id IN (_, _, __more1000_plus__)', 0.0, '0', '0');")
print("SELECT * FROM foo WHERE id IN (")
print(*range(1000000), sep=", ")
print(");")
EOF

CREATE TABLE

Time: 3ms

  crdb_internal.request_statement_bundle
------------------------------------------
                    t
(1 row)

Time: 4ms

runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc053780380 stack=[0xc053780000, 0xc073780000]
fatal error: stack overflow

runtime stack:
runtime.throw({0x105df1916?, 0x10a833da0?})
    GOROOT/src/runtime/panic.go:1047 +0x5d fp=0x70000f8f6d78 sp=0x70000f8f6d48 pc=0x10003dd1d
runtime.newstack()
    GOROOT/src/runtime/stack.go:1105 +0x5bd fp=0x70000f8f6f28 sp=0x70000f8f6d78 pc=0x10005877d
runtime.morestack()
    src/runtime/asm_amd64.s:574 +0x8b fp=0x70000f8f6f30 sp=0x70000f8f6f28 pc=0x100073a8b

goroutine 4062 [running]:
runtime.mapaccess2(0x1054271e0, 0xc0053fc540, 0xc0537803d0?)
    GOROOT/src/runtime/map.go:456 +0x217 fp=0xc053780390 sp=0xc053780388 pc=0x100014e17
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).iDoc(0xc07377d590, {0xe62b?, 0xf8?}, {0x1073e0a80?, 0xc04edd8c20?}, 0x10d900108?)
    github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:210 +0x72 fp=0xc053780400 sp=0xc053780390 pc=0x100f8fc52
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fa0)
    github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:132 +0x145 fp=0xc0537805e0 sp=0xc053780400 pc=0x100f8ee25
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fe0)
    github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:141 +0x2c5 fp=0xc0537807c0 sp=0xc0537805e0 pc=0x100f8efa5
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b1a000)
    github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:141 +0x2c5 fp=0xc0537809a0 sp=0xc0537807c0 pc=0x100f8efa5
github.com/cockroachdb/cockroach/pkg/util/pretty.(*beExec).be(0xc07377d590, {0xe62b?, 0xf8?}, 0xc013b19fc0)
    github.com/cockroachdb/cockroach/pkg/util/pretty/pretty.go:132 +0x188 fp=0xc053780b80 sp=0xc0537809a0 pc=0x100f8ee68
...
knz commented 1 year ago

The question however is how realistic that is in practice?

Unfortunately this came up in practice (see link above) so at least somewhat realistic.

knz commented 1 year ago

Making this better will be a non-trivial programming exercise. The power of the current framework comes directly from the conciseness of the rendering functions. Making this better by adding an extra mandatory argument (to count the call depth), or even worse by attempting to "flatten" the control flow by moving the recursion to a loop, would make the code unmaintainable and unextensible.

My recommendation here would be to use code generation. Implement the rules in some kind of DSL then auto-generate the recursive code with the extra counting data structure passed explicitly by the code generator.

mgartner commented 1 year ago

Making this better will be a non-trivial programming exercise.

I came up with #110374, which catches the case @michae2 described, but curious what you think.

mgartner commented 1 year ago

I've also created #110375 because I don't believe we strictly need to pretty-print statements in statement bundles.

knz commented 1 year ago

I came up with https://github.com/cockroachdb/cockroach/pull/110374, which catches the case @michae2 described, but curious what you think.

:+1: let's continue the convo over there.