Raku / doc

🦋 Raku documentation
https://docs.raku.org/
Artistic License 2.0

Are special methods called when an object is destroyed? #2097

Open JJ opened 6 years ago

JJ commented 6 years ago

The problem

Asked by @briandfoy in this question on StackOverflow

DESTROY is mentioned here, in the metamodel. There seems to be a problem with indexing, too.

Suggestions

jnthn commented 6 years ago

The answer @lizmat wrote on the SO question is pretty much right (and I've no idea why it got a down vote).

The DESTROY method in Perl 6 is scheduled to run after an object has been deemed otherwise unreachable by the garbage collector. The garbage collector runs when it needs to and even then will, for efficiency reasons, often only consider part of the heap. At program termination time, there is no requirement to run the GC; since its job is to manage memory, exiting and letting the OS take back all the memory the process requested is far more efficient. Perl 6 does not commit to any particular GC algorithm; it only requires that circular data structures can be collected without problem. This allows it to run on various VMs, and gives those VMs the implementation freedom to adopt new, more modern GC algorithms over time, or to offer multiple GC algorithms which make different trade-offs.

Things you cannot rely on:

Things you can rely on:

All in all, DESTROY is useful as a fallback for clearing up resources that the program failed to release otherwise - and perhaps warning about that - but it's not a mechanism for resource management, because it cannot guarantee timeliness. LEAVE and END provide reliable mechanisms for that.
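As a minimal sketch of the timely alternatives mentioned above (the `work` sub and `@log` array are made up for illustration, not taken from the thread), a LEAVE phaser runs whenever its block is exited, including when the exit is caused by an exception, and an END phaser runs once at program termination:

```raku
my @log;

sub work() {
    # LEAVE fires whenever this block is exited, normally or via an exception.
    LEAVE @log.push: 'LEAVE ran';
    @log.push: 'working';
    die 'simulated failure';
}

try work();              # the exception is caught here; LEAVE has run by now
END say 'END ran';       # runs once, at program termination
say @log.join(', ');
```

Neither phaser depends on the garbage collector, which is why they suit cleanup that has to happen at a known point.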

lizmat commented 6 years ago

"The DESTROY method for a particular object instance will be invoked at most once, but never more "

I think this needs to be further refined to:

"The DESTROY method for a particular object instance will be invoked at most once by the garbage collection logic, but never more"

This is because if you, e.g., had a LEAVE block call .DESTROY on the object, it could get executed more than once:

my int $seen;
class A {
    method DESTROY { ++$seen }
}
A.new.DESTROY for ^50000;
say "DESTROY called $seen times"     # DESTROY called 71489 times

I'm not sure where the flag is kept that the DESTROY method has been called, but maybe we should codegen this to be set at execution of the DESTROY method itself, rather than by the garbage collection logic?

rafaelschipiura commented 6 years ago

But should the garbage collector really not call DESTROY if it was already called?

Or should DESTROY be written defensively in the cases it's called explicitly somewhere else?

jnthn commented 6 years ago

> I think this needs to be further refined to:
>
> "The DESTROY method for a particular object instance will be invoked at most once by the garbage collection logic, but never more"

Yes, that's a good idea, along with a note to discourage manually calling it.

> But should the garbage collector really not call DESTROY if it was already called?

Actually, the garbage collector has no idea about a DESTROY method; it just sticks the objects into a list, and something on the Perl 6 side does the invocation of DESTROY.

In general, though, it's pretty much a given that calling any method with an uppercase name explicitly is bad form. It also goes for BUILD, TWEAK, and MAIN: if you call them yourself, that has no impact on whether they will also be called by the thing that was meant to call them. The point of the naming scheme is that these methods will be called for you.

> Or should DESTROY be written defensively in the cases it's called explicitly somewhere else?

I think in many cases one will be doing that anyway, given the main use case is to perform cleanup logic that ideally would have been performed earlier on, in a more timely manner.

The main thing to point out here is probably that DESTROY is not the primary place to write your cleanup logic. Rather, write it in a user-facing method so it can be run in a timely way, and then invoke that method from DESTROY.
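A minimal sketch of that pattern, assuming a hypothetical `Handle` class whose `close` method stands in for the real cleanup (the `$released` counter exists only to make the behaviour observable):

```raku
my $released = 0;            # counts real releases, for illustration only

class Handle {
    has Bool $!closed = False;

    # User-facing cleanup: written defensively, so repeated calls are harmless.
    method close() {
        return if $!closed;
        $!closed = True;
        $released++;         # release the underlying resource here
    }

    # Fallback only: runs if and when the GC finds the object unreachable.
    method DESTROY() { self.close }
}

sub use-handle() {
    my $h = Handle.new;
    LEAVE $h.close;          # timely release when this block is left
    # ... work with $h ...
}

use-handle();
```

Because `close` guards against running twice, it does not matter whether the LEAVE phaser, an explicit call, or a later DESTROY gets there first.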

lizmat commented 6 years ago

I think LEAVE blocks are fine generally, but there are some cases where you want to return an object from a sub / method and want it cleaned up when the caller's scope is left, preferably without the caller's scope needing to do something. I came up with the following module, to be put in the ecosystem:

# The class that will finalize what got registered to it
class FINALIZER {
    has @.blocks;

    PROCESS::<$FINALIZER> = FINALIZER.new;
    END PROCESS::<$FINALIZER>.FINALIZE;

    method FINALIZE() {
        say "finalizing";
        $_() for @!blocks
    }
    method register(&a) {
        say "registering";
        ($*FINALIZER //= FINALIZER.new).blocks.push(&a);
    }
}

sub dbiconnect($string) {
    LEAVE say "leaving $string";
    my $dbh = $string;
    FINALIZER.register: { say "leaving registered with $dbh" }
    $dbh
}

LEAVE say "leaving program";
{
    LEAVE say "leaving outer";
    {
        # The trick here is that the dynamic variable $*FINALIZER is actually
        # created at compile time in this scope, while the method will only
        # be called when this scope is left (and only if something was
        # registered with it)
        LEAVE my $*FINALIZER.?FINALIZE;
        my $dbh = dbiconnect("frobnicate");
        say "doing stuff with $dbh";
    }
}

Now, the thing I think really needs some syntactic sugar, is the LEAVE my $*FINALIZER.?FINALIZE; part. I have found no way to export a LEAVE phaser to a block nicely, e.g. as part of a use FINALIZER statement. Or am I missing something?

lizmat commented 6 years ago

I've added FINALIZER to the Perl 6 ecosystem. As far as I'm concerned, this module supplies a very powerful alternative to the timely destruction of Perl 5.

JJ commented 6 years ago

#1606 is related to this one. I didn't realize until now.

JJ commented 6 years ago

The indexing thing has been fixed in 612f995bd45282b6692d15af03f1cd07e25c3108