roc-lang / roc

A fast, friendly, functional language.
https://roc-lang.org
Universal Permissive License v1.0
4.19k stars 299 forks source link

llvm alloca unitentional mutation bug #6139

Closed joshi-monster closed 2 months ago

joshi-monster commented 10 months ago

I also posted this on the chat, but I think actually making an issue sounds like a good idea :slightly_smiling_face:

I tried to make this example simpler by using lists instead of strings, but could not reproduce it with a lists version. So it seems to maybe also be related to the specifics of how strings are handled? Unfortunately I have have no idea about roc's internals to make a good guess what might be happening here, since I just started actually writing roc code.

prefixes = \str, soFar ->
    if Str.isEmpty str then
        soFar

    else
        next = str
            |> Str.graphemes
            |> List.dropFirst 1
            |> Str.joinWith ""

        prefixes next (List.append soFar str)

# this is the (erroneous) output roc produces
expect prefixes "abc" [] == ["abc", "c", ""]

# this is what I whould have expected
expect prefixes "abc" [] == ["abc", "bc", "c"] # fails

If it helps, I could repro this with todays nighly build and an old one I still had from 11/28. It also did not make a difference whether I used basic-cli 0.6.2 or 0.7.0.

bhansconnect commented 10 months ago

Some extra context from some preliminary debugging: https://roc.zulipchat.com/#narrow/stream/395097-compiler-development/topic/spooky.20action.20at.20a.20distance/near/405401898

joshi-monster commented 10 months ago

Maybe the example with "abc" was not ideal; When using longer strings (long enough to definitely not be small-string optimized), it always drops 2 characters after the first iteration, and then also has an empty string at the end.

So

# note the second and last element; all other strings are as expected
# the same as the previous, but with the first char dropped
expect prefixes "abc...xyz" [] == ["abc...xyz", "c...xyz", ..., "xyz", "yz", "z", ""]

Which seems to indicate that after the first iteration (probably because it copied it to be able to mutate), it inserts into the list first, but then in-place mutates the string?

bhansconnect commented 10 months ago

"abc...xyz"

This is still a small string. Has to be at least 24 characters to be a large string.

joshi-monster commented 10 months ago

The ... was meant to represent an arbitrary number or characters; sorry if that wasn't clear.

In practice, I used all lowercase letters, uppercase letters, and digits (so Str.countGraphemes == 62)

bhansconnect commented 10 months ago

Ah ok. makes sense. The large string case doesn't surprise me that it breaks. The small string case also breaking is what really scares me. Small strings are never shared or reference counted (just copied), yet will still see the sudden change in data. On top of that, first scans of the LLVM IR looked just fine. I probably need to step through the assembly and work backwards from that.

bhansconnect commented 10 months ago

Self contained repro with some handy dbgs sprinkled in. Just run with roc test Bug.roc

interface Bug
    exposes []
    imports []

prefixes = \str, soFar ->
    dbg (str, soFar)
    if Str.isEmpty str then
        soFar

    else
        graphemes = Str.graphemes str
        dbg (str, graphemes)
        remaining = List.dropFirst graphemes 1
        dbg (str, remaining)
        next = Str.joinWith remaining ""
        dbg (str, next)

        prefixes next (List.append soFar str)

# this is what I would have expected
expect
    out = prefixes "abc" []
    out == ["abc", "bc", "c"] # fails

Output:

────────────────────────────────────────────────────────────────────────────────
[/tmp/Bug.roc:5] (str, soFar) = ("abc", [])
[/tmp/Bug.roc:11] (str, graphemes) = ("abc", ["a", "b", "c"])
[/tmp/Bug.roc:13] (str, remaining) = ("abc", ["b", "c"])
[/tmp/Bug.roc:15] (str, next) = ("abc", "bc")
[/tmp/Bug.roc:5] (str, soFar) = ("bc", ["abc"])
[/tmp/Bug.roc:11] (str, graphemes) = ("bc", ["b", "c"])
[/tmp/Bug.roc:13] (str, remaining) = ("bc", ["c"])
[/tmp/Bug.roc:15] (str, next) = ("c", "c")
[/tmp/Bug.roc:5] (str, soFar) = ("c", ["abc", "c"])
[/tmp/Bug.roc:11] (str, graphemes) = ("c", ["c"])
[/tmp/Bug.roc:13] (str, remaining) = ("c", [])
[/tmp/Bug.roc:15] (str, next) = ("", "")
[/tmp/Bug.roc:5] (str, soFar) = ("", ["abc", "c", ""])
── EXPECT FAILED ──────────────────────────────────────────────── /tmp/Bug.roc ─

This expectation failed:

20│>  # this is what I whould have expected
21│>  expect
22│>      out = prefixes "abc" []
23│>      out == ["abc", "bc", "c"] # fails

When it failed, these variables had these values:

out : List Str
out = ["abc", "c", ""]

1 failed and 0 passed in 510 ms.

The failure is clearly seen here: [/tmp/Bug.roc:15] (str, next) = ("c", "c") Suddenly, after calling Str.joinWith remaining "" for the second time, str has lost a character.

bhansconnect commented 10 months ago

So far really not sure what is going on. The llvm ir that I have read looks fine. Debug prints from zig look fine as well. The string is small and being copied so it shouldn't be some sort of referencing issue. We are passing strings around by reference technically, so the bug could be related to that somehow and accidentally mutating it, but I'm not convinced.

Bug manifests after Str.joinWith, but I am not fully sure it is the cause....hard to root cause this.

Replaced Str.graphemes with Str.toUtf8 and some extra mapping and still hit the error, so that isn't the cause.

joshi-monster commented 10 months ago

Since you helped me get the debugger running, I thought I might as well run it on this :smile:

I've stared at the assembly and the IR for some time now, and I think I might have finally found something?

But I'm also really tired and I've looked at this for way too long. Maybe I'm missing something here.

The IR for me:
```llvm ; prefixes = \str soFar -> define internal fastcc %list.RocList @"#UserApp_prefixes_676ec9e417566a851359c2c6d5d5332f7d40742f8274a8672f3cad244846"(ptr %"24", %list.RocList %"25") { entry: %result_value = alloca %str.RocStr, align 8 %const_str_store = alloca %str.RocStr, align 8 br label %joinpointcont joinpointcont: ; preds = %else_block, %entry %joinpointarg = phi ptr [ %"24", %entry ], [ %result_value, %else_block ] ; this is str on entry, or the result of (... |> Str.joinWith "") %joinpointarg1 = phi %list.RocList [ %"25", %entry ], [ %call4, %else_block ] ; this is soFar on entry, or the result of (List.append soFar str) on recursion ; if Str.isEmpty str then %call = call fastcc i1 @Str_isEmpty_7f7e162ee4345c12acb2c8dddfd129c8c9ef562ecb31841cfff13d4789ffc2(ptr %joinpointarg) br i1 %call, label %then_block, label %else_block then_block: ; preds = %joinpointcont call fastcc void @"#Attr_#dec_1"(ptr %joinpointarg) ; recount-1 str OR $result_value ret %list.RocList %joinpointarg1 ; return soFar else_block: ; preds = %joinpointcont %call2 = call fastcc %list.RocList @Str_graphemes_cb411178cb7686889a4ee0e4b4c57e63975186dc9f1448b79e94c2721a21a2(ptr %joinpointarg) ; str |> Str.graphemes %call3 = call fastcc %list.RocList @List_dropFirst_54b3c6d264e7c557f2fe74ef29431163e9a30a2e4aca38b681d4b9ee62de031(%list.RocList %call2, i64 1) ; |> List.dropFirst 1 ; this block probably generates an empty string in %const_str_store? store ptr null, ptr %const_str_store, align 8 %const_str_store.repack5 = getelementptr inbounds %str.RocStr, ptr %const_str_store, i64 0, i32 1 store i64 0, ptr %const_str_store.repack5, align 8 %const_str_store.repack6 = getelementptr inbounds %str.RocStr, ptr %const_str_store, i64 0, i32 2 store i64 -9223372036854775808, ptr %const_str_store.repack6, align 8 call fastcc void @Str_joinWith_cabb163ea8b383114bab450f2ea4bdf6f97d5dc22e57b593db81e3bce47(%list.RocList %call3, ptr nonnull %const_str_store, ptr nonnull %result_value) ; |> Str.joinWith "" - here, the last argument is an out argument? call fastcc void @"#Attr_#dec_1"(ptr nonnull %const_str_store) ; refcount-1 "" call fastcc void @"#Attr_#dec_2"(%list.RocList %call3) ; refcount-- (tmp result of dropFirst) %call4 = call fastcc %list.RocList @List_append_e6845638e158b704aa8395d259110713932beb5d7a34137f5739ba7e3dd198(%list.RocList %joinpointarg1, ptr %joinpointarg) ; (List.append soFar str OR next) ?? br label %joinpointcont } ```
bhansconnect commented 9 months ago

So on the first iteration, it correctly appends str, but on every following iteration, it instead appends the result of the next instead!

When it recurses here prefixes next (List.append soFar str), it should be setting str to next. So I think that joinpoint behavior is correct.....

Oh, it is a double use of result value, that is definite the issue. So when we loop, we overwrite joinpointarg when writing to result_value

bhansconnect commented 9 months ago

Thanks so much for finding this 🙏

bhansconnect commented 9 months ago

Putting this in our own ir:

procedure : `Bug.prefixes` List Str
procedure = `Bug.prefixes` (`#Derived_gen.IdentId(29)`: Str, `#Derived_gen.IdentId(30)`: List Str):
    joinpoint `Bug.20` `Bug.str` `Bug.soFar`:
        inc `Bug.soFar`;
        inc `Bug.str`;
        let `Bug.30` : {Str, List Str} = Struct {`Bug.str`, `Bug.soFar`};
        let `Bug.9` : Str = CallByName `Inspect.toStr` `Bug.30`;
        dbg `Bug.9`;
        dec `Bug.9`;
        let `Bug.28` : Int1 = CallByName `Str.isEmpty` `Bug.str`;
        if `Bug.28` then
            dec `Bug.str`;
            ret `Bug.soFar`;
        else
            inc 3 `Bug.str`;
            let `Bug.graphemes` : List Str = CallByName `Str.graphemes` `Bug.str`;
            inc `Bug.graphemes`;
            let `Bug.27` : {Str, List Str} = Struct {`Bug.str`, `Bug.graphemes`};
            let `Bug.8` : Str = CallByName `Inspect.toStr` `Bug.27`;
            dbg `Bug.8`;
            dec `Bug.8`;
            let `Bug.26` : U64 = 1i64;
            let `Bug.remaining` : List Str = CallByName `List.dropFirst` `Bug.graphemes` `Bug.26`;
            inc `Bug.remaining`;
            let `Bug.25` : {Str, List Str} = Struct {`Bug.str`, `Bug.remaining`};
            let `Bug.7` : Str = CallByName `Inspect.toStr` `Bug.25`;
            dbg `Bug.7`;
            dec `Bug.7`;
            let `Bug.24` : Str = "";
            let `Bug.next` : Str = CallByName `Str.joinWith` `Bug.remaining` `Bug.24`;
            dec `Bug.24`;
            dec `Bug.remaining`;
            inc `Bug.next`;
            let `Bug.23` : {Str, Str} = Struct {`Bug.str`, `Bug.next`};
            let `Bug.6` : Str = CallByName `Inspect.toStr` `Bug.23`;
            dbg `Bug.6`;
            dec `Bug.6`;
            let `Bug.22` : List Str = CallByName `List.append` `Bug.soFar` `Bug.str`;
            jump `Bug.20` `Bug.next` `Bug.22`;
    in
    jump `Bug.20` `#Derived_gen.IdentId(29)` `#Derived_gen.IdentId(30)`;

This write letBug.next: Str = CallByNameStr.joinWith`Bug.remaining Bug.24;`

On the second iteration of this function and beyond is also writing to Bug.str. This jump Bug.20 Bug.next Bug.22; makes both Bug.str and Bug.next the exact same allocation.

So I think Bug.next can't writing to an alloca that is hoisted above the joinpoint. It needs to write to a local alloca then at the last minute right before the jump Bug.20 Bug.next Bug.22; , we need to copy over the local to the longer living alloca.

@folkertdev:

  1. Is this a good fix? Do you have any better ideas for a fix?
  2. How easy/hard do you think this fix would be? seems really high priority cause it can break any tail recursive functions with pointers.
bhansconnect commented 9 months ago

I think this line is the root of the bug: https://github.com/roc-lang/roc/blob/f795d0856a8a68747aa7bc975827bd02ff8d4610/crates/compiler/gen_llvm/src/llvm/scope.rs#L94

We need to wait and only write to these right as a jump is called. We can not write to them early. By inserting them into symbols here, they get written to too early.

bhansconnect commented 9 months ago

actually, that might not quite be the root issue. Might just need to add an extra copy to each arg here: https://github.com/roc-lang/roc/blob/f795d0856a8a68747aa7bc975827bd02ff8d4610/crates/compiler/gen_llvm/src/llvm/build.rs#L3416-L3418

but something around these in general.

folkertdev commented 9 months ago

in either case the alloca must be outside of the loop otherwise the stack grows on every iteration. I do think the problem is that the "next" variable is never realized in this case. We do have some logic somewhere that tries to elide copies from one alloca into another, maybe that is at play here?

timotree3 commented 6 months ago

Oof I just took the time to minimize a miscompilation I encountered in Advent of Code and I think it's a duplicate of this:

interface Issue6139
    exposes [buggy]
    imports []

expect
    buggyAnswer = buggy "A" []
    buggyAnswer == ["A", "B", "C", "D"]

buggy = \node, seen ->
    if List.contains seen node then
        seen
    else
        dbg node # node = "B"
        nextNode = stepNode node
        dbg node # node = "C"

        buggy nextNode (List.append seen node)

stepNode = \node ->
    when node is
        "A" -> "B"
        "B" -> "C"
        "C" -> "D"
        "D" -> "A"
        _ -> crash ""