nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org

[arc] of operation segfaults for a ptr object containing traced reference #19205

Closed MaskRay closed 1 year ago

MaskRay commented 2 years ago

For an untraced (ptr) object containing a traced reference, the of operator may segfault under --gc:arc and --gc:orc, but not under --gc:refc.

I am not sure what the intended semantics are, but https://nim-lang.org/docs/manual.html#types-mixing-gc-ed-memory-with-nimptr seems to suggest that a ptr object with traced reference fields is fine.
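For context, a minimal sketch of the pattern that manual section is about: a traced field (a string) stored inside manually allocated memory and cleaned up by hand. The names Data and roundTrip are hypothetical, and using `=destroy` as the cleanup call is an assumption here rather than a quote from the manual.

```nim
type
  Data = object
    s: string                # traced field living inside untraced memory

proc roundTrip(text: string): string =
  ## Stores `text` in a manually allocated object, reads it back, then cleans up.
  var d = create(Data)       # zero-initialised, untraced allocation
  d.s = text                 # assigning a traced value through the ptr
  result = d.s               # copy the value out before the memory is released
  `=destroy`(d.s)            # assumption: release the string by hand before dealloc
  dealloc(d)

echo roundTrip("abc")
```

This involves no inheritance, so it does not exercise the of check that crashes in the example below.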

Example

type
  InputSectionBase* {.inheritable.} = object
    relocations*: seq[int]   # traced reference. string has a similar SIGSEGV.
  InputSection* = object of InputSectionBase

proc foo(sec: var InputSectionBase) = 
  if sec of InputSection:  # this line SIGSEGV.
    echo 42

var sec = create(InputSection)
sec[] = InputSection(relocations: newSeq[int]())
foo sec[]

Current Output

% nim c -r --gc:arc x.nim
...
Hint: /tmp/d/x  [Exec]
Traceback (most recent call last)
/tmp/d/x.nim(12)         x
/tmp/d/x.nim(7)          foo
/home/maskray/.choosenim/toolchains/nim-#devel/lib/system/arc.nim(233) isObj
SIGSEGV: Illegal storage access. (Attempt to read from nil?)
Error: execution of an external program failed: '/tmp/d/x '

SIGSEGV with both 1.6.0 (stable) and 1.7.1 (devel).

Expected Output

% nim c -r --gc:refc x.nim
...
Hint: /tmp/d/x  [Exec]
42

Additional Information

% nim -v
Nim Compiler Version 1.7.1 [Linux: amd64]
Compiled at 2021-11-25
Copyright (c) 2006-2021 by Andreas Rumpf

git hash: 0d0c249074d6a1041de16108dc247396efef5513
active boot switches: -d:release

lib/system/arc.nim(233) has

proc isObj(obj: PNimTypeV2, subclass: cstring): bool {.compilerRtl, inl.} =
  proc strstr(s, sub: cstring): cstring {.header: "<string.h>", importc.}

  result = strstr(obj.name, subclass) != nil   ### line 233, obj is nil

% nim -v
Nim Compiler Version 1.6.0 [Linux: amd64]
Compiled at 2021-10-19
Copyright (c) 2006-2021 by Andreas Rumpf

git hash: 727c6378d2464090564dbcd9bc8b9ac648467e38
active boot switches: -d:release

Araq commented 2 years ago

Instead of InputSectionBase* {.inheritable.} = object use InputSectionBase* = object of RootObj. This is not supported; the compiler should produce an error message for this.

ringabout commented 2 years ago

> Instead of InputSectionBase* {.inheritable.} = object use InputSectionBase* = object of RootObj. This is not supported; the compiler should produce an error message for this.

I changed the example:

type
  InputSectionBase* = object of RootObj
    relocations*: seq[int]   # traced reference. string has a similar SIGSEGV.
  InputSection* = object of InputSectionBase

proc foo(sec: var InputSectionBase) = 
  if sec of InputSection:  # this line SIGSEGV.
    echo 42

var sec = create(InputSection)
sec[] = InputSection(relocations: newSeq[int]())
foo sec[]

It produces the same error. IMO it is an ARC/ORC issue.

> nim-#devel\lib\system\arc.nim(233) isObj
SIGSEGV: Illegal storage access. (Attempt to read from nil?)

I'm still looking into it. My guess is that it is caused by the sink operation, which doesn't move/copy the type information of the inherited object.

Araq commented 2 years ago

> I'm still looking into it. My guess is that it is caused by the sink operation, which doesn't move/copy the type information of the inherited object.

Likely that is exactly what is going on.
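To help isolate that hypothesis, here is a contrast sketch in which the object is constructed directly into a variable, with no assignment through a ptr. The helper isInputSection is a hypothetical name, and how this variant behaves on the affected 1.6.0 and 1.7.1 builds has not been verified here. It only separates the of check itself from the create-then-assign path in the report.

```nim
type
  InputSectionBase = object of RootObj
    relocations: seq[int]
  InputSection = object of InputSectionBase

proc isInputSection(sec: var InputSectionBase): bool =
  ## Same runtime type check as in the report, wrapped so the result can be inspected.
  sec of InputSection

# Constructed in place: the object constructor is responsible for the type header.
var direct = InputSection(relocations: newSeq[int]())
# A plain base object, for the negative case.
var plainBase = InputSectionBase(relocations: newSeq[int]())

echo isInputSection(direct)
echo isInputSection(plainBase)
```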

khaledh commented 2 years ago

I'm running into a variation of this issue, which uses ref semantics for the base and derived types, and while using --os:any and -passl:"-nostdlib" (I'm writing a kernel, and providing a few routines that are required from libc). I can't reproduce this in a typical environment (i.e. not --os:any).

If I use untraced ptr instead of a traced ref, it works without issues. But I need to use ref since I'm trying to add a new type of stream: MemoryStream, similar to StringStream and FileStream. The issue is triggered if I try to use my MemoryStream through the base Stream methods, e.g. atEnd(s: Stream).

The following is a self-contained nim file that triggers a SIGSEGV upon trying to convert the base object to the derived object. Sorry for the long code, but it needs the supporting libc routines.

$ nim c -d:release --os:any --mm:arc -d:useMalloc -d:noSignalHandler --noMain \
    --passC:"-ffreestanding -fno-stack-protector -mno-red-zone -masm=intel" \
    --passL:"-nostdlib -e mymain" refsubtype.nim

$ ./refsubtype
fish: Job 1, './refsubtype' terminated by signal SIGSEGV (Address boundary error)

# refsubtype.nim

# compile and run with:
#   nim c -r -d:release --os:any --mm:arc -d:useMalloc -d:noSignalHandler --noMain \
#     --passC:"-ffreestanding -fno-stack-protector -mno-red-zone -masm=intel" \
#     --passL:"-nostdlib -e mymain" refsubtype.nim

# libc support

proc memset*(p: pointer, value: cint, size: csize_t): pointer {.exportc.} =
  let pp = cast[ptr UncheckedArray[byte]](p)
  let v = cast[byte](value)
  for i in 0..<size:
    pp[i] = v
  return p

proc memcpy*(dst: pointer, src: pointer, size: csize_t): pointer
    {.exportc, codegenDecl: "$# $#(void * restrict dst,  const void * restrict src,  size_t size)".} =
  let d = cast[ptr UncheckedArray[byte]](dst)
  let s = cast[ptr UncheckedArray[byte]](src)
  for i in 0..<size:
    d[i] = s[i]
  return dst

proc strstr*(str: cstring, substr: cstring): cstring
    {.exportc, codegenDecl: "$# $#(const char* str, const char* substr)".} =
  let s = cast[ptr UncheckedArray[byte]](str)
  let ss = cast[ptr UncheckedArray[byte]](substr)
  var i = 0
  while s[i] != 0:
    var j = 0
    while ss[j] != 0 and s[i + j] != 0 and ss[j] == s[i + j]:
      inc(j)
    if ss[j] == 0:
      return cast[cstring](addr s[i])
    inc(i)
  return nil

proc exit*(code: cint) {.exportc.} =
  asm """
    mov eax, 60
    syscall
    :
    :"D"(`code`)
  """

# malloc support

var
  heap: array[4096, byte]
  heapBumpPtr: int

proc malloc*(size: csize_t): pointer {.exportc.} =
  result = cast[pointer](heapBumpPtr)
  inc heapBumpPtr, size.int

proc calloc*(num: csize_t, size: csize_t): pointer {.exportc.} =
  result = malloc(size * num)

proc free*(p: pointer) {.exportc.} =
  discard

proc realloc*(p: pointer, new_size: csize_t): pointer {.exportc.} =
  result = malloc(new_size)
  discard memcpy(result, p, new_size)
  free(p)

# main program

proc mymain*() {.exportc.} =
  heapBumpPtr = cast[int](addr heap)

  type
    Base = ref BaseObj
    BaseObj = object of RootObj

    Derived = ref DerivedObj
    DerivedObj = object of BaseObj

  proc foo(b: Base) =
    var d = Derived(b)  # SIGSEGV

  var d: Derived
  new(d)
  foo(d)

  quit 0

$ nim -v
Nim Compiler Version 1.6.2 [Linux: amd64]
Compiled at 2021-12-17
Copyright (c) 2006-2021 by Andreas Rumpf

git hash: 9084d9bc02bcd983b81a4c76a05f27b9ce2707dd
active boot switches: -d:release

khaledh commented 2 years ago

@Araq If you have a few minutes can you elaborate on whether the above is a bug or not?

Araq commented 2 years ago

Takes more than a few minutes to answer this question.

ringabout commented 2 years ago

I ran it under gdb. It seems to be a bug.

strstr (substr=0x555555559340 "|compiler.test.DerivedObj|compiler.test.BaseObj|RootObj|", str=0x0)
    at /home/wind/nought/Nim/test.nim:37
37        while s[i] != 0:
(gdb) bt
#0  strstr (substr=0x555555559340 "|compiler.test.DerivedObj|compiler.test.BaseObj|RootObj|", str=0x0)
    at /home/wind/nought/Nim/test.nim:37
#1  isObj (obj=0x55555555c1a0 <NTIv2__DaWTLn80lskcdCjhrn47Sw_>, 
    subclass=0x555555559340 "|compiler.test.DerivedObj|compiler.test.BaseObj|RootObj|")
    at /home/wind/.choosenim/toolchains/nim-1.6.2/lib/system/arc.nim:233
#2  foo__test_109 (b=<optimized out>) at /home/wind/nought/Nim/test.nim:88
#3  mymain () at /home/wind/nought/Nim/test.nim:92

The type of d seems to be erased.
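The backtrace shows isObj handing the runtime type's name to strstr together with the expected class chain, and the name being nil (str=0x0). Below is a simplified model of that check, not the runtime code. TypeInfoSketch and isObjSketch are hypothetical names, the nil/empty guard is an addition the real isObj does not have, and the base chain string is assumed by analogy with the derived one in the backtrace.

```nim
import std/strutils

type
  TypeInfoSketch = object
    name: string   # hypothetical stand-in for the descriptor's class chain

proc isObjSketch(obj: ptr TypeInfoSketch; subclass: string): bool =
  ## Substring test on the class chain, guarded against missing type info.
  if obj == nil or obj.name.len == 0:
    return false   # the unguarded check in the backtrace crashes at this point
  result = obj.name.contains(subclass)

const
  derivedChain = "|compiler.test.DerivedObj|compiler.test.BaseObj|RootObj|"
  baseChain = "|compiler.test.BaseObj|RootObj|"   # assumed by analogy

var derivedInfo = TypeInfoSketch(name: derivedChain)
var baseInfo = TypeInfoSketch(name: baseChain)

echo isObjSketch(addr derivedInfo, derivedChain)  # derived object is a Derived
echo isObjSketch(addr baseInfo, derivedChain)     # base object is not a Derived
echo isObjSketch(addr derivedInfo, baseChain)     # derived object is also a Base
```

In this model an erased name simply makes every check fail, whereas the unguarded strstr call dereferences the nil pointer.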

Araq commented 1 year ago

This bug was fixed some time ago.