[GC/asm-generation] local variables go out of scope too early

GWRon commented 7 years ago

Coming from this thread: http://www.blitzmax.com/Community/posts.php?topic=107624#bottom

Brucey noted down that Blitzmax (v1.50) generates ASM-code which leads to a too early garbage collection of local variables.

SuperStrict
Framework Brl.Retro

Type TSample
    Method IsTrue:int()
        return True
    End Method

    Method Delete()
        'freeing a child - and potentially some "non-null"-data in there
        'would create a segfault
        print "TSample gets deleted - hope we do not reset/free potential child-elements now!"
    End Method
End Type

Function WasteSomeTime()
    For Local i:Int = 0 Until 10000
        local v:string = Rand(0,10000)
    Next
EndFunction

Local s:TSample = new TSample
if s.IsTrue()
    For local i:int = 0 to 3
        print i
        WasteSomeTime()
    Next
    print "TSample goes out of scope now..."
endif

Result:

./bmk makeapp -t console -quick -r -x "Testcodes/blitzmax_gcbug.bmx"
[ 90%] Processing:blitzmax_gcbug.bmx
[ 95%] Compiling:blitzmax_gcbug.bmx.console.release.linux.x86.s
flat assembler  version 1.68  (32768 kilobytes memory)
3 passes, 2069 bytes.
[100%] Linking:blitzmax_gcbug
Executing:blitzmax_gcbug
0
TSample gets deleted - hope we do not reset/free potential child-elements now!
1
2
3
TSample goes out of scope now...

If within Delete() children (created in the parent and only referenced there) would be reset/freed then within the "For-loop" any access to it would create a segfault.

Local variables used in an "if condition" should not get deleted before reachin the "endif".

GWRon commented 7 years ago

Here is an example exposing that it indeed leads to something not expected:

SuperStrict
Framework Brl.Retro

Type TParent
    Field c:TChild

    Method IsTrue:int()
        return True
    End Method

    Method GetChild:TChild()
        if not c then c = new TChild
        return c
    End Method

    Method Delete()
        'freeing the child - and potentially some "non-null"-data in there
        'would create a segfault
        print "TParent gets deleted - freeing the child now..."
        c.free()
        c = null
    End Method
End Type

Type TChild
    Field special:TSpecial = new TSpecial

    Method free()
        print "TChild gets freed"
        special = null
    End Method

    Method Delete()
        print "TChild gets deleted - free it"
        free()
    End Method
End Type

Type TSpecial
    Field i:int = 123
End Type

Function WasteSomeTime()
    For Local i:Int = 0 Until 10000
        local v:string = Rand(0,10000)
    Next
EndFunction

Local p:TParent = new TParent
if p.IsTrue()
    local c:TChild = p.GetChild()
    For local i:int = 0 to 3
        'after deletion it becomes "0" instead of "123"
        print "c.special.i = " + c.special.i
        WasteSomeTime()
    Next
    'this would keep it alive
    'p.IsTrue()

    print "TParent goes out of scope now..."
endif

Result (shortened):

c.special.i = 123
TParent gets deleted - freeing the child now...
TChild gets freed
c.special.i = 0
c.special.i = 0
c.special.i = 0
TParent goes out of scope now...

You can see that "special" gets modified inbetween.

chtisgit commented 7 years ago

Hi, I was reading about this and started debugging. It turns out, I didn't get the outputs that you posted. I'm on Linux and here these programs actually crash. So I checked out the stack trace with GDB. It looked weird but it crashed in MemExtend() because of a NULL pointer. I didn't look deeper into the GC or so.

After the fix, the program produces the expected output. Could you test the fix, too? I guess you are working on different OS.

GWRon commented 7 years ago

Ahh sorry, yes my output was already based on "maxmods/brl.mod". And I did not know that your suggestion was already included there:

void *bbMemExtend( void *mem,int size,int new_size ){
    void *p;
    p=bbMemAlloc( new_size );
    if (mem) {
        bbMemCopy( p,mem,size );
        bbMemFree( mem );
    }
    return p;
}

@ OS:

$ uname -a
Linux RonnyPC 3.19.0-32-generic #37~14.04.1-Ubuntu SMP Thu Oct 22 09:41:40 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -irc
Distributor ID: LinuxMint
Release:    17.3
Codename:   rosa

Edit: the expected output is the output of a "bugged" GC.

chtisgit commented 7 years ago

Oh okay, I thought we were getting different outputs because of different OSes. Unfortunately, I compiled in debug mode all the time... In release mode I'm now getting exactly your output. Back to gdb...

blitz-research commented 7 years ago

Only read the first post, but this is correct behaviour. After the 's.IsTrue()' expression is evaluated, 's' is technically dead as it is never used again, and may be GC'd at any time.

To test this, add something like 'Print s.IsTrue()' at/near the end of the program. This should keep 's' alive for longer, eg:

SuperStrict
Framework Brl.Retro

Type TSample
    Method IsTrue:Int()
        Return True
    End Method

    Method Delete()
        'freeing a child - and potentially some "non-null"-data in there
        'would create a segfault
        Print "TSample gets deleted - hope we do not reset/free potential child-elements now!"
    End Method
End Type

Function WasteSomeTime()
    For Local i:Int = 0 Until 10000
        Local v:String = Rand(0,10000)
    Next
EndFunction

Local s:TSample = New TSample
If s.IsTrue()   'Note: 's' still alive after this, ie: it's used again below.
    For Local i:Int = 0 To 3
        Print i
        WasteSomeTime()
    Next
EndIf
If s.IsTrue()   'Note: last use of 's' - can be GC'd any time in the future from now.
    For Local i:Int = 0 To 3
        Print i
        WasteSomeTime()
    Next
EndIf

GWRon commented 7 years ago

While I understand the logic behind it ("last use of something") I also would understand the logic to keep things if they are used in something "enclosing" (in this case if condition [...] endif).

Did not check if a variable can get collected in a while wend loop too - if not used (eg. it ends the app on incoming events within the while loop or breaks the while loop). But the behaviour (when keeping it until endif) would be more consistent then.

Is there an example why the current behaviour is "better" (means: are there things bugging out if you would keep the variable until the enclosing if endif ended?

Back to the sample: the problem with the GCing "too early" is that a child element could free things on "Delete()". Things which might still be in use. Developers might not not see a variable vanishing that early and are left puzzled then - like happening in the thread I linked in the OP.

blitz-research commented 7 years ago

If you want something to be 'deleted' at a particular, predictable time, add a Discard() method or something. Finalizers are not the way to do this. IMO, finalizers should be avoided altogether but no one likes that theory!

davecamp commented 7 years ago

This bug still exists and is very real. After the MemExtend fix this issue is still there. Using GWRon's example that ends with

Local p:TParent = New TParent
If p.IsTrue()
    Local c:TChild = p.GetChild()
    For Local i:Int = 0 To 3
        'after deletion it becomes "0" instead of "123"
        Print "c.special.i = " + c.special.i
        WasteSomeTime()
    Next
    'this would keep it alive
    'p.IsTrue()

    Print "TParent goes out of scope now..."
EndIf

In release builds I get...

c.special.i = 123
TParent gets deleted - freeing the child now...
TChild gets freed
c.special.i = 0
c.special.i = 0
c.special.i = 0
TParent goes out of scope now...

Which I see is wrong. At the very least 'c' and 'special' are still being accessed and need to kept alive. Am I missing somethng here? [edited]

blitz-research commented 7 years ago

The example is confusing, but I think what's happening is:

After 'Local c:TChild=p.getChild()' is called (actually, after getChild returns), p becomes 'dead' (it's never used again and isn't reachable via any other vars) and is therefore eligible for GC (note that this also frees up the register p was being stored in for reuse by the register allocator).
When GC does occur (sometime in the first call to waitSomeTime - GCCollect() works here too), p is indeed collected so the TParent.Delete() finalizer method is called. This calls c.free() which sets the 'special' field of 'c' to Null.
But this is the same 'c' that was returned by p.getChild, so the next time "Print c.special.i" is called, it should really cause a null object error. But we're in release mode so 'all bets are off'.
If I change 'special=Null' in TChild.free to 'special.i=456' it works as expected, ie: instead of c.special.i=0 in the output (instead of null object error!) you get c.special.i=456.

This is all 'expected behaviour' so there is no issue here, just I think a misunderstanding of how GC/finalizers work under the hood.

The cost of keeping all local variables 'live' until the end of block they were declared in (and why a block? kinda arbitrary but I assume it suits someone's use case...) would be high. There would be more register interference during register allocation and more register spillage to memory. Not gonna happen.

Note that all this has nothing to do with the bbMemExtend issue as far as I can see, no idea how the 2 got conflated. That was a 'c' spec issue (and I suspect a glibc bug).

davecamp commented 7 years ago

I totally understand, thank you for making it clearer as to what is happening. Note that removing the 'special = null' also results in what I'd expect to happen - ie the value of special.i is kept, so yes I agree that the code is performing exactly as it is written.

Thanks!

GWRon commented 7 years ago

Excuse the "confusing example. I just tried to replicate what the forum thread was about (the db module).

BTW: gccollect() did not collect it everytime...which is why in the thread that wastetime-function was added.

@ c nulled Yes I understand that it gets nulled and is not accessible any longer (or not guaranteed as it is "release mode"). And if I got you right the problem is that GC is freeing the parent as there is nothing "holding" it anymore. The delete() then dereferences "c" which is no problem..but the c.free() before the dereferencing is creating the trouble (by assuming it is safe now to clean up further things).

Thanks for the clarification and information about the "costs" to keep a local var alive until an "endif". It is still a bit "unclear" when things can get deleted by GC ...some users might think in terms of "blocks".

blitz-research commented 7 years ago

It is still a bit "unclear" when things can get deleted by GC

Strictly speaking, things can get GC'd if there are no 'live' references to them. A live reference is a reference (in memory or a cpu reg) that the compiler has determined may be used in future. Liveness of a reference changes from moment to moment, depending on what code has been executed and what code is yet to be executed. The compiler can only make a conservative estimate of liveness, eg: in the case of heap references (ie: fields), bmx assumes they always WILL be used in future so are always live, but a more sophisticated compiler could theoretically do better.

But this issue is really only due to the use of finalizers (ie: method Delete) and finalizers are evil. They may execute sooner (or later, or not at all) than you expect, depending on the whims of the above algorithm (which is affected by compiler implementation, optimizations etc), so are next to useless for anything deterministic. IMO, they should ONLY be used for 'just in case' cleanups (if at all) in which case you can completely ignore the question of when GC happens.

In reality, GC is only a memory management system, and it really only guarantees that memory will not be prematurely freed before you are finished with it (and finalizers are really just an ugly 'PreFree()' hack). Beyond that, there are no guarantees. Compiler implementation and optmizations have a major effect on how it works in practice, ie: when memory is actually alloced/freed, which makes it very hard to predict when finalizers will happen.

On Fri, Mar 10, 2017 at 11:26 AM, Ronny Otto notifications@github.com wrote:

Excuse the "confusing example. I just tried to replicate what the forum thread was about (the db module).

BTW: gccollect() did not collect it everytime...which is why in the thread that wastetime-function was added.

@ c nulled Yes I understand that it gets nulled and is not accessible any longer (or not guaranteed as it is "release mode"). And if I got you right the problem is that GC is freeing the parent as there is nothing "holding" it anymore. The delete() then dereferences "c" which is no problem..but the c.free() before the dereferencing is creating the trouble (by assuming it is safe now to clean up further things).

Thanks for the clarification and information about the "costs" to keep a local var alive until an "endif". It is still a bit "unclear" when things can get deleted by GC ...some users might think in terms of "blocks".

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/blitz-research/blitzmax/issues/8#issuecomment-285503726, or mute the thread https://github.com/notifications/unsubscribe-auth/ADU3QnhxXH46Ag7jruMPI0K15p-ZKYZTks5rkHyogaJpZM4Lq1XS .

blitz-research / blitzmax

[GC/asm-generation] local variables go out of scope too early #8