nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.22k stars 1.46k forks source link

SIGSEGV with object variants and RTTI #23690

Closed alex65536 closed 2 weeks ago

alex65536 commented 3 weeks ago

Description

The simplest reproducer I have found by now is:

import std/sequtils

type
  SomeObj* = object of RootObj

  Item* = object
    case kind*: 0..1
    of 0:
      a*: int
      b*: SomeObj
    of 1:
      c*: string

  ItemExt* = object
    a*: Item
    b*: string

proc do1(x: int): seq[(string, Item)] =
  result = @[("zero", Item(kind: 1, c: "first"))]

proc do2(x: int, e: ItemExt): seq[(string, ItemExt)] =
  do1(x).map(proc(v: (string, Item)): auto = (v[0], ItemExt(a: v[1], b: e.b)))

echo do2(0, ItemExt(a: Item(kind: 1, c: "second"), b: "third"))

This code stably leads to segfault under my amd64 Linux machine (tested both on 2.0.4 and devel)

Nim Version

$ nim -v
Nim Compiler Version 2.1.1 [Linux: amd64]
Compiled at 2024-06-06
Copyright (c) 2006-2024 by Andreas Rumpf

git hash: 69d0b73d667c4be9383f29cda3f70e411995d9af
active boot switches: -d:release

Also reproducible on stable 2.0.4.

Current Output

Traceback (most recent call last)
/home/user/repro.nim(24) repro
/home/user/repro.nim(22) do2
/home/user/.choosenim/toolchains/nim-#devel/lib/system/alloc.nim(1068) dealloc
/home/user/.choosenim/toolchains/nim-#devel/lib/system/alloc.nim(967) rawDealloc
/home/user/.choosenim/toolchains/nim-#devel/lib/system/alloc.nim(770) addToSharedFreeListBigChunks
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Segmentation fault

Expected Output

@[("zero", (a: (kind: 1, c: "first"), b: "third"))]

Possible Solution

No response

Additional Information

Some extra debugging shows more details:

Debugging points out that eqcopy implementation is likely at fault. The compiled C code is as follows:

N_LIB_PRIVATE N_NIMCALL(void, eqcopy___repro_u284)(tyObject_ItemExt__D8Kw4o1swHAWDzvwwWvsFw* dest_p0, tyObject_ItemExt__D8Kw4o1swHAWDzvwwWvsFw* src_p1) {
        tyObject_Item__jnMhVK55Yw3PPq6Gy9aXzMQ colontmp_;
        colontmp_ = (*dest_p0).a;
        nimZeroMem((void*)(&(*dest_p0).a), sizeof(tyObject_Item__jnMhVK55Yw3PPq6Gy9aXzMQ));
        (*dest_p0).a = TM__ckn2SibXCdoGNfJOD0JenA_5;
        (*dest_p0).a.kind = (*src_p1).a.kind;
        switch ((*dest_p0).a.kind) {
        case ((NI)0):
        {
                (*dest_p0).a._kind_1.a = (*src_p1).a._kind_1.a;
        }
        break;
        case ((NI)1):
        {
                eqcopy___system_u2621((&(*dest_p0).a._kind_2.c), (*src_p1).a._kind_2.c);
        }
        break;
        default: __builtin_unreachable();
        }
        switch (colontmp_.kind) {
        case ((NI)0):
        {
        }
        break;
        case ((NI)1):
        {
                if (colontmp_._kind_2.c.p && !(colontmp_._kind_2.c.p->cap & NIM_STRLIT_FLAG)) {
 deallocShared(colontmp_._kind_2.c.p);
}
        }
        break;
        default: __builtin_unreachable();
        }
        eqcopy___system_u2621((&(*dest_p0).b), (*src_p1).b);
}

It does the following:

alex65536 commented 3 weeks ago

!nim c

github-actions[bot] commented 3 weeks ago
:penguin: Linux bisect by @alex65536 (contributor)
devel :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:24
  • Finished 2024-06-07T03:45:24
  • Duration

AST

```nim ```
stable :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:25
  • Finished 2024-06-07T03:45:25
  • Duration now

AST

```nim ```
2.0.4 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:25
  • Finished 2024-06-07T03:45:25
  • Duration now

AST

```nim ```
2.0.0 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:28
  • Finished 2024-06-07T03:45:28
  • Duration now

AST

```nim ```
1.6.20 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:31
  • Finished 2024-06-07T03:45:31
  • Duration now

AST

```nim ```
1.4.8 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:33
  • Finished 2024-06-07T03:45:33
  • Duration now

AST

```nim ```
1.2.18 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:35
  • Finished 2024-06-07T03:45:35
  • Duration

AST

```nim ```
1.0.10 :-1: FAIL

Output

``` ```

IR

Compiled filesize 0 bytes (0 bytes) ```cpp ```

Stats

  • Started 2024-06-07T03:45:37
  • Finished 2024-06-07T03:45:37
  • Duration now

AST

```nim ```
Stats
  • GCC 11.4.0
  • Clang 14.0.0
  • NodeJS 20.3
  • Created 2024-06-07T03:44:52Z
  • Comments 1
  • Commands nim c --run -d:nimDebug -d:nimDebugDlOpen -d:ssl -d:nimDisableCertificateValidation --forceBuild:on --colors:off --verbosity:0 --hints:off --lineTrace:off --nimcache:/home/runner/work/Nim/Nim --out:/home/runner/work/Nim/Nim/temp /home/runner/work/Nim/Nim/temp.nim

:robot: Bug found in 16 minutes bisecting 8 commits at 0 commits per second

alex65536 commented 3 weeks ago

!nim c

import std/sequtils

type
  SomeObj* = object of RootObj

  Item* = object
    case kind*: 0..1
    of 0:
      a*: int
      b*: SomeObj
    of 1:
      c*: string

  ItemExt* = object
    a*: Item
    b*: string

proc do1(x: int): seq[(string, Item)] =
  result = @[("zero", Item(kind: 1, c: "first"))]

proc do2(x: int, e: ItemExt): seq[(string, ItemExt)] =
  do1(x).map(proc(v: (string, Item)): auto = (v[0], ItemExt(a: v[1], b: e.b)))

echo do2(0, ItemExt(a: Item(kind: 1, c: "second"), b: "third"))
ringabout commented 3 weeks ago

@alex65536 wow, your insights are really helpful. I think it's mostly likely that copy needs to zero memery the union before assignement. I'm working on it. Thank you very much!

ringabout commented 3 weeks ago

It crashed with 1.6.16 with --mm:orc -d:useMalloc

alex65536 commented 3 weeks ago

Digging further into the compiler code, I noticed the following:

  1. Here is the algorithm for copy-assignment of object variants. The main take here is that we move the old object into the new location and call wasMoved for the old object.
  2. wasMoved is basically implemented here. Among other things, it calls resetLoc(), which in our case (code) generates nimZeroMem + default-initialization.
  3. Zeroing memory is OK, but re-intialization with the default value for this moved-out object is actually the source of this bug.

I am not proficient in the compiler internals at all, so everything below this text in this comment is just a wild guess from a person who has seen the compiler code today for the first time:

beef331 commented 3 weeks ago

how destructors of old values behave when they encounter zeroed memory

Destructors are supposed to be written in such a way that 0'd memory indicates "this was moved"

alex65536 commented 3 weeks ago

Destructors are supposed to be written in such a way that 0'd memory indicates "this was moved"

OK, thank you, then zeroing is a nice idea indeed :)

And what happens if a =copy() implementation hooked by user receives zeroed memory as a destination? Or is there anything that prevents such thing from happening?

beef331 commented 3 weeks ago

You call =destroy then copy the data over, there is no special logic for data that is 0'd =wasMoved should disarm =destroy.

Araq commented 3 weeks ago

Destructors are supposed to be written in such a way that 0'd memory indicates "this was moved"

Note that we're moving away from 0'd memory to the more abstract "=wasMoved/default state".