sprintf panics when given malformed format strings and a non-zero arguments list

philipaconrad commented 2 years ago

Short description

sprintf now can panic when given malformed format strings. However, this is a difficult panic to invoke outside of direct use of the parser/evaluation functions in Golang.

Examples:

OPA version: edge (commit ~~50b79b9~~ 2fbc99c, several PRs after the v0.41.0 tag)

Example output that OPA returned:

runtime error: index out of range [0] with length 0

Steps To Reproduce

Add a new test case to the end of test/cases/testdata/sprintf/test-sprintf.yaml:

- data:
modules:
- |
package test

p = x {
  exampleval:= 0
  x = sprintf("%+v",[exampleval])
}
note: sprintf/array/regression 1
query: data.test.p = x
want_result:
- x: "0"

Observe the panic:


--- FAIL: TestRego (1.13s)
--- FAIL: TestRego/sprintf/array/regression_1 (0.00s)
panic: runtime error: index out of range [0] with length 0 [recovered]
panic: runtime error: index out of range [0] with length 0

goroutine 1124 [running]: testing.tRunner.func1.2({0xb39580, 0xc0007fcc78}) /usr/local/go/src/testing/testing.go:1389 +0x24e testing.tRunner.func1() /usr/local/go/src/testing/testing.go:1392 +0x39f panic({0xb39580, 0xc0007fcc78}) /usr/local/go/src/runtime/panic.go:838 +0x207 github.com/open-policy-agent/opa/topdown.builtinSprintf({0xcbd418?, 0xc000dfdfe0?}, {0xcbcf18?, 0xc000c2d140?}) /github-devel-folder/opa/topdown/strings.go:507 +0x41c github.com/open-policy-agent/opa/topdown.functionalWrapper2.func1({{0xcbc150, 0xc0000240c0}, {0xcbe1a0, 0xc0008114e0}, {0xcb77e0, 0xc00007e2d0}, 0xc00075f710, {0x0, 0x0}, 0x0, ...}, ...) /github-devel-folder/opa/topdown/builtins.go:146 +0x73 github.com/open-policy-agent/opa/topdown.evalBuiltin.eval({0xc00109b4a0, 0x10aaa00, {{0xcbc150, 0xc0000240c0}, {0xcbe1a0, 0xc0008114e0}, {0xcb77e0, 0xc00007e2d0}, 0xc00075f710, {0x0, ...}, ...}, ...}, ...) /github-devel-folder/opa/topdown/eval.go:1686 +0x33f github.com/open-policy-agent/opa/topdown.(eval).evalCall(0xc00109b4a0, {0xc0002e0ab0, 0x4, 0x6}, 0xc0002fcc60) /github-devel-folder/opa/topdown/eval.go:813 +0xa85 github.com/open-policy-agent/opa/topdown.(eval).evalStep(0xc00109b4a0, 0xc000d70d50) /github-devel-folder/opa/topdown/eval.go:358 +0x75c github.com/open-policy-agent/opa/topdown.(eval).evalExpr(0xc00109b4a0, 0xc0002fb7a0) /github-devel-folder/opa/topdown/eval.go:332 +0xec github.com/open-policy-agent/opa/topdown.(eval).next(...) /github-devel-folder/opa/topdown/eval.go:164 github.com/open-policy-agent/opa/topdown.(eval).evalExpr.func1(0xc00109b4a0) /github-devel-folder/opa/topdown/eval.go:333 +0x29 github.com/open-policy-agent/opa/topdown.(eval).evalStep.func1() /github-devel-folder/opa/topdown/eval.go:353 +0x39 github.com/open-policy-agent/opa/topdown.(*eval).biunifyValues(0xc00109b4a0, 0xc000b85e78, 0xc000b85e90, 0xc0002fcab0, 0xc0002fcab0, 0xc0002fcba0) ...



## Expected behavior

`sprintf` should not `panic` when confronted with malformed format strings; ideally, it should just print them as a normal string, and ignore the arguments list parts that don't have a 1-to-1 match with the format string.

## Bug investigation

Internally, this `panic` seems to be caused by an iteration over 2x arrays, where one array's size is dependent on the length of the array of arguments handed to `sprintf`, and the other array's length is determined by how many formatting args were detected via a regex. In the case of malformed format strings, there can be a mismatch in size between these two arrays, and the `sprintf` builtin explodes with a `panic` related to array-OOB accesses because one array is shorter than the other. ([Link to source location generating the `panic`](https://github.com/open-policy-agent/opa/blob/main/topdown/strings.go#L505-L511))

## Additional context
This bug was uncovered while I was running a backtest suite as part of verifying my changes while trying to fix #4672. The two issues are wholly unrelated, and this issue was only discovered because my feature branch tracked OPA edge, while the reference tests were tracking the `v0.41.0` tag. The sudden occurrence of mystery panics helped me find this bug.

This bug appears to have been introduced by PR #4620 (merged after the `v0.41.0` release), which substantially changed how `sprintf` wrangles its formatting arguments.

srenatus commented 2 years ago

    // Structs formatted with %v show field values in their default formats.
    // The %+v form shows the fields by name, while %#v formats the struct in
    // Go source format.
    person := struct {
        Name string
        Age  int
    }{"Kim", 22}
    fmt.Printf("%v %+v %#v\n", person, person, person)
    // Result: {Kim 22} {Name:Kim Age:22} struct { Name string; Age int }{Name:"Kim", Age:22}

(source)

Hmm %+v and %#v format verbs will always leak implementation details, and I'm not sure I know a single valid use. We should deal with them in some way, and keep backwards-compatibility to some extend. I think overriding %+v and %#v to %v would be reasonable. WDYT?

philipaconrad commented 2 years ago

I'm fine with that decision. It will require some reworking of how sprintf currently handles verbs, but that was going to have to happen to resolve the bug anyway. :sweat_smile:

@srenatus Do we have a formal list anywhere of what verbs / verb forms we want sprintf to handle? :thinking:

I'm thinking if we have an exact set of verbs we want to support, that could make this work somewhat easier, as we can then take verbs not on that list that are found during parsing, and rewrite them into escaped literals, so that fmt.Sprintf won't process them as verbs.

philipaconrad commented 2 years ago

Currently, it looks like our tests cover:

Verbs:

%d
%f
%s
%v

Verb forms:

%<verb> (ex: %s)
%<prefix><verb> (ex: %.1f)
%[<idx>]<verb> (ex: %[2]d)
%<prefix>[<idx>]<verb> (ex: %.3[1]f)

srenatus commented 2 years ago

This is great. We should definitely "own" the list of supported verbs instead of referring to the golang docs.

Ideally, we'd fall back to something for those we don't support so we don't break stuff for anyone... 🤔

philipaconrad commented 2 years ago

@srenatus I looked over the fmt docs, and here is my proposed support list/rationales for formatting verbs. Let me know what you think!

Design concerns

Of all the verbs booted from the list, I think %T (prints type-of-value) might be worth keeping for debugging purposes alone. :thinking:
Unicode is here to stay, but not all programming language environments have good support for it. That said, there could be a very strong argument for bringing unicode-related formatting directives back onto the list, since OPA deals heavily with processing strings that might often be unicode!

Proposed fallback behavior

Discard the %. (Encourages consistent golang-style %% escaping.)
Escape the "verb", so that the underlying fmt.Sprintf call treats the verb characters as a raw string. (Discourages trying to abuse the underlying implementation details.)

Proposed verb list

Rationale: The formatting directives (or types of directives) below are well-supported across several modern programming languages (e.g. Python, Ruby), or could readily be implemented from one of the other directives included in the list. We want to hit a useful, common-case subset of behavior, since we can always add more stuff to it later, but taking things away from the list will often hurt users. This may also help implementation support in other environments, such as WASM, by reducing the scope of the formatting support required.

Following the style from the Golang fmt docs:

General:

%v  the value in a default format
%T  the (Rego) type of the value
%%  a literal percent sign; consumes no value

Dropped Golang-specific directives. (~~%T~~, %#v)
- Added back the %T directive because of @srenatus 's suggestion.

Boolean: %t the word true or false

Integer:

%b  base 2
%d  base 10
%o  base 8
%O  base 8 with 0o prefix

Dropped character/unicode support verbs. (%c, %q, %U)
Dropped hexadecimal support verbs. (%x, %X)
- (See comment thread below for discussion.)

Floating-point and complex constituents:

%e  scientific notation, e.g. -1.234456e+78
%E  scientific notation, e.g. -1.234456E+78
%f  decimal point but no exponent, e.g. 123.456
%F  synonym for %f
%g  %e for large exponents, %f otherwise. Precision is discussed below.
%G  %E for large exponents, %F otherwise

Dropped unusual/not-commonly-supported verbs. (%b)
Dropped character/unicode support verbs. (%c, %q, %U)
Dropped hexadecimal support verbs. (%x, %X)
- (See comment thread below for discussion.)

String and slice of bytes (treated equivalently with these verbs):

%s  the uninterpreted bytes of the string or slice
%q  a double-quoted string safely escaped with Go syntax

Dropped Golang-specific verbs. (%q)
Dropped unicode-specific verbs. (%x, %X)

Slice:

Dropped entirely; implementation detail leakage. (%p)

Pointer:

Dropped entirely; implementation detail leakage. (%p)

Flags support

Keep existing flags behavior for all current flags. (+, -, #, ' ', 0)
Keep existing precision spec behavior. (Ex: %.3f)

Explicit argument indexes

Keep existing Golang behavior (emulating it if necessary).
- The "quoted literal from arg index" syntax (Ex: fmt.Sprintf("%[2]*.2f", 6, 12.2)) thankfully only admits integers in the quoted spot, so this won't be too awful to support.

Format errors

Keep existing Golang behavior, even though it leaks implementation details. (Probably too much work to work around; can be documented as "platform-dependent behavior".)

srenatus commented 2 years ago

Of all the verbs booted from the list, I think %T (prints type-of-value) might be worth keeping for debugging purposes alone.

It doesn't need to happen right away, but printing the rego type for %T would be nice.

srenatus commented 2 years ago

Looks like %T isn't as bad as I thought -- we're not leaking golang struct types -- but it's still got some things to fix:

call	output	expected
'sprintf("%T", [true])'	"string"	"boolean"
'sprintf("%T", [1])'	"int"	"number"
'sprintf("%T", [1.0])'	"float64"	"number"
'sprintf("%T", ["1"])'	"string"	✔️
'sprintf("%T", [[]])'	"string"	"array"
'sprintf("%T", [{}])'	"string"	"object"
'sprintf("%T", [set()])'	"string"	"set"

philipaconrad commented 2 years ago

@srenatus, @anderseknert I think I've just run into another case where #4797 has come back to bite us.

The %x/%X directives have different meaning for floats versus integers, and this is a problem because we freely switch between the two representations for Number types behind the scenes, but there are cases where we imply one representation or the other; such as the bitwise builtins (ints), the number builtins (ints, floats), and the units builtins (ints, floats).

I'm honestly not sure how to handle this collision case for our Number type. :slightly_frowning_face:

Here are some of the options I've considered though:

We could remove %x/%X entirely. (Removes the problem case, but also a very useful formatting option for bitwise operations. :disappointed: )
We could force one behavior or the other to be used for every instance of %x/%X (The "pick one" option. Still axes useful functionality, and may subvert user expectations if they plugged in a "float", but get an "integer" %x/%X formatting. :shrug: )
We could create new/variant formatting directives for each case. (Requires more advanced rewriting of formatting directives under-the-hood, and would create a maintenance burden into the future. :sweat: Potentially very gross, but I had to mention it.)

Thoughts?

srenatus commented 2 years ago

We could remove %x/%X entirely. (Removes the problem case, but also a very useful formatting option for bitwise operations. 😞 )

Those bitwise ops would be using integers, only, though? I'm wondering if "pick one" with int wouldn't be the best option here. But that might be my bias never having to use floats much.

anderseknert commented 2 years ago

Just some thoughts from the hammock here, but let’s not forget that options that might be very useful for bitwise operations, aren’t necessarily very useful for policy authoring in general. The number of policies in the wild using the bitwise functions ought to be somewhere close to a handful, and the number of policies also pretty-printing these closer to zero.

The fixes previously made to handling float/int presentation with sprintf however came from real problems reported by end-users. If we need to prioritize one over the other, it should be an easy choice.

philipaconrad commented 2 years ago

There's also an option 4, which is ... imitate what JS does when it's confronted with bitwise operations: convert to a 32-bit signed integer, and go (Ex: Bitwise OR docs).

This is, admittedly, kind of silly behavior, but it's what JS does. :man_shrugging:

philipaconrad commented 2 years ago

I think for now, if there are no objections, the best path forward is to strike %x/%X from the supported verbs list.

If someone complains about wanting this kind of formatting support, we can revisit this decision later, and try to do something reasonable. (Easier to drop now and add it back later, than to add something half-baked.)

Thank you for your inputs, @anderseknert and @srenatus!

philipaconrad commented 2 years ago

Integers do seem a little silly though without hexadecimal support, since we support other bases (2, 8, 10). :thinking:

srenatus commented 2 years ago

x := sprintf("%x", [100])
y := sprintf("%x", [100.0])

⬇️

{
    "x": "64",
    "y": "0x1.9p+06"
}

I'm an ignorant, I have no idea what y means. 😅

I think striking is fine, but it would be preferable if there was some advise to hand out to anyone needing it. Do we have other means to get the hexadecimal of an integer? 🤔

charlesdaniels commented 2 years ago

My $0.02 is that we should either drop support for integers entirely, or we should support every common radix (2, 8, 10, 16). I think simply converting floats to the nearest integer before feeding them to these formatting verbs is a reasonable approach, and is the behavior I would expect if I was an end user and had not read the docs.

If we don't think that's reasonable, we should drop %b, %d, %o, and %O to be consistent, then people can screw with %f to get the right number of decimal places for their use case.

A third option would be to just print the floats in the alternative radixes. FWIW, C++ defines a syntax for hexadecimal floats (source).

stale[bot] commented 2 years ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.

open-policy-agent / opa