nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org
Other
16.64k stars 1.47k forks source link

What is the expected behaviour for a proc replacing its own closure? #23044

Open can-lehmann opened 12 months ago

can-lehmann commented 12 months ago

Description

I am currently debugging an issue in owlkettle and managed to reduce to this minimal example.

var p: proc()

proc createCallback(x: int): proc() =
  result = proc() =
    for it in 0..<100:
      p = createCallback(it)
    echo x

for it in 0..<10:
  p = createCallback(10)
  p()

Nim Version

$ nim --version
Nim Compiler Version 2.1.1 [Linux: amd64]
Compiled at 2023-10-14
Copyright (c) 2006-2023 by Andreas Rumpf

git hash: f5d70e7fa7195658e3200f71a1653e07fe81275a
active boot switches: -d:release

Current Output

Running it with nim compile -r test3 outputs:

99
99
99
99
99
99
99
99
99
99

Running it with nim compile -d:useMalloc -r test3 outputs:

10
99
99
99
99
99
99
99
99
99

Running it with nim compile --mm:boehm -r test3 outputs:

10
10
10
10
10
10
10
10
10
10

Running it with nim js test3.nim; node test3.js outputs:

10
10
10
10
10
10
10
10
10
10

Expected Output

I would personally expect it to be

10
10
10
10
10
10
10
10
10
10

Possible Solution

I assume that this happens because the closure gets deallocated while it is executed. If that is the case, the refcount would need to be increased by one upon entering a closure and decreased afterwards. I am not that familiar with ARC/ORC internals though, so this might be wrong.

shirleyquirk commented 12 months ago

i've played with this a bit, in nim and in c++, and i think it's undefined behaviour. whether the output should be 10,99 or undefined depends on so many subtleties

by that i mean:

proc closure(self: Environment) = 
  # here we destroy the global p and replace with a 'new' one.
  # including the environment. so, does 'self' point to the old env, or to the new one?
  # or, since we've destroyed the old env, which is the same pointed to by 'self'
  # is self now invalid, pointing to garbage?
  # garbage which, depending on the allocator could be an old value?

fwiw it's possible to get 10,99 or garbage output with versions of this written in c++ godbolt

in versions written in nim sometimes valgrind reports that the closure is accessing free'd memory sometimes, running valgrind changes the output!

can-lehmann commented 12 months ago

In your example, assuming that Environment is a reference counted, heap allocated object, everything would work fine though, as the reference count of self is incremented upon entering closure, which prevents a deallocation.

Edit: C++ behaves differently here, as (at least I assume that) its closures are value types.

can-lehmann commented 12 months ago

To further demonstrate why I think that this should not be UB, lets just manually increase the refcount of the closure by one upon entering the callback.

var p: proc()

proc createCallback(x: int): proc() =
  result = proc() =
    var r = p
    for it in 0..<100:
      p = createCallback(it)
    echo x
    echo r.repr

for it in 0..<10:
  p = createCallback(10)
  p()

Now everything works as expected. And valgrind does not report any errors anymore,

Araq commented 12 months ago

Looks like an ARC/ORC related optimizer bug to me. What does --mm:orc --cursorInference:off produce?

shirleyquirk commented 12 months ago

i agree that the environment of p has to be manually increfed

you need to do that to make the c++ version pass valgrind as well.

std::shared_ptr<std::function<void()>> p;
...
return [x]{
auto r = p;...}
shirleyquirk commented 12 months ago

-d:useMalloc --mm:arc --cursorInference:off also produces 10,99... and valgrind warns of a use after free

shirleyquirk commented 12 months ago

what happens is:

tmp = createCallback(10)
=sink(p,tmp)
p():
  void* colonenvP_ = ClE_0;
  tmp = createCallback(99)
  =sink(p,tmp) #destroys p.env, replaces with new one.  
  echo colonenvP_ #use after free
ringabout commented 11 months ago

Slightly simplified

var p: proc()

proc createCallback(x: int): proc() =
  proc bark() =
    p = createCallback(0)
    p = createCallback(1)
    echo x
  result = bark

proc foo =
  p = createCallback(10)
  p()

foo()
ringabout commented 11 months ago

Perhaps, we could keep a local copy of the envs in that function to ensure its aliveness until the exiting of the closure scopes like

var p: proc()

proc createCallback(x: int): proc() =
  proc bark() =
    let tmpEnv
    `=copy`(tmpEnv, env) # keeps envs alive
    p = createCallback(0)
    p = createCallback(1)
    echo tmpEnv.x
  result = bark

proc foo =
  p = createCallback(10)
  p()

foo()

Though "$1 = ($2) ClE_0;$n" is hardcoded in the backends, is there a good solution to clean it up?

Probably it's a good compromise to just inc/dec the rc of ClE_0

ringabout commented 11 months ago

I'm not sure of the good solution for this problem, but if you want to manually increase the rc

from system {.all.} import nimIncRef

proc incRef(x: proc () {.closure.}) =
  if x != nil:
    nimIncRef(rawEnv(x))

proc decRef(x: proc () {.closure.}) =
  var y {.cursor.} = x
  `=destroy`(y)

var p: proc()

proc createCallback(x: int): proc() =
  result = proc() =
    let tmp {.cursor.} = p
    incRef(tmp)
    for it in 0..<100:
      p = createCallback(it)
    echo x
    decRef(tmp)

for it in 0..<10:
  p = createCallback(10)
  p()
ringabout commented 11 months ago

Here is what I think of is happening:

var p: proc()

proc createCallback(x: int): proc() =
  proc bark() =
    # ClE_0.rc = 1; ClE_0.x = 10
    p = createCallback(0)
    # `=sink` or `=copy` decRef(ClE_0)
    # ClE_0.rc = 0; ClE_0.x = 10 => the memory is reclaimed; but not zeroed
    p = createCallback(999)
    # With default alloc: ClE_0 is corrupted by the newly created memory of `createCallback(1)`
    # but malloc is not affected in this case

    echo x
  result = bark

proc foo =
  p = createCallback(10)
  p()

foo()

refc: 10 orc: 999 orc -d:useMalloc: 10

Araq commented 6 months ago

Somewhere the codegen needs to introduce a temporary for the global variable to keep it alive but it's contrived code.