JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

`with` for deterministic destruction #7721

Open klaufir opened 10 years ago

klaufir commented 10 years ago

Deterministic destruction

Deterministic destruction is a guarantee that some specified resources will be released when exiting from the enclosing block, even if we exit from the block by means of an exception.

Example

In Python, deterministic destruction is done using the with statement.

with open("myfile","w") as f:
    f.write("Hello")

Considering the code above we don't have to worry about closing a file explicitly. After the with block, the file will be closed. Even when something throws an exception inside the with block, the resources handled by the with statement will be released (in this case closed).

Other languages

Julia?

It is my firm belief that Julia also needs to have a feature supporting deterministic destruction. We have countless cases when a resource needs to be closed as soon as possible: serial ports, database connections, some external API handles, files, etc. In cases like these, deterministic destruction would mean cleaner code, no explicit close() / release() calls and no sandwiching the code in try .. catch .. finally blocks.

prcastro commented 10 years ago

The do syntax doesn't handle this?

jakebolewski commented 10 years ago

You can do the same things with Julia's do notation (anonymous function syntax) and with macros. try/finally also lends itself to deterministic destruction.
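For reference, a minimal sketch of the try/finally form mentioned here. The helper name `write_greeting` and the scratch file from `mktemp` are assumptions made only for illustration.

```julia
# Hypothetical helper: write to a file and always close it,
# even if `write` throws.
function write_greeting(path::AbstractString)
    io = open(path, "w")      # acquire before `try`, so `io` is visible in `finally`
    try
        write(io, "Hello")
    finally
        close(io)             # runs on normal exit and on exceptions
    end
    return io
end

path, tmp = mktemp()          # scratch file for the demo
close(tmp)
io = write_greeting(path)
```

The resource is acquired before the `try` because `try` and `finally` are separate scopes in Julia, so a variable first assigned inside `try` would not be visible in `finally`.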

JeffBezanson commented 10 years ago

I brought this up briefly in #4664. I think with and finalize would be good.

klaufir commented 10 years ago

I know it can be done in a custom way by hacking some macros, but it would be much more beneficial if we had this feature in combination with support from the standard library - as it is done in python. Having a standard way of doing it would make new libraries conform to this standard way. Then the users don't have to worry about the method name of the finalizer (Is it close(), release(), free()?).

timholy commented 10 years ago

@klaufir, it's not hacking some macros. In Julia your example is

open("myfile", "w") do file
    write(file, "Hello")
end

and the file is automatically closed for you. See? No macros :smile:.

But there are places where one might want to use finalizers in most circumstances yet be able to force them to run in specific situations. So there might be room for some new features, but it's not like there isn't already a way to do this kind of thing.

klaufir commented 10 years ago

Oh, I see there is a standard way already. Sorry for the noise.

StefanKarpinski commented 10 years ago

That is the standard idiom – writing code to ensure that some code is always called upon exit is still manual though. I've often wanted something like Go's defer or D's scope exit clauses. Note that these are more general in a way because you can ensure that any expression is executed on scope exit.

JeffBezanson commented 10 years ago

I think with would be better than what we do now, since different finalizeable types wouldn't need to re-implement the pattern used by open.

timholy commented 10 years ago

I agree that implementing that pattern usually requires a trip to the manual, and occasionally interferes with some other desirable call syntax. Ran into the latter with CUDArt.
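To make "the pattern used by open" concrete, here is a rough sketch of what a package has to write today for its own resource type. `Connection` and `open_connection` are hypothetical names, not part of any real library.

```julia
# Hypothetical resource type, used only to illustrate the pattern.
mutable struct Connection
    name::String
    closed::Bool
end

open_connection(name::AbstractString) = Connection(name, false)
Base.close(c::Connection) = (c.closed = true; nothing)

# The method each resource type currently re-implements by hand:
# the user's function comes first so that `do` syntax works.
function open_connection(f::Function, name::AbstractString)
    c = open_connection(name)
    try
        return f(c)
    finally
        close(c)              # always runs, however `f` exits
    end
end

seen = Connection[]
result = open_connection("db") do c
    push!(seen, c)
    !c.closed                 # still open while the block runs
end
```

The function-first method is the part that a `with`-style feature would make unnecessary: only the plain constructor and `close` would remain.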

awf commented 9 years ago

Just checking: the general feeling is that "do" doesn't do it, and "with" is still desirable?

StefanKarpinski commented 8 years ago

I'm convinced at this point that do syntax isn't sufficient, but I also think that Python's with syntax doesn't quite cut it either. What seems to be needed is automatic insertion of a finalize on the result of a function call when the appropriate scope exits. One problem with both do and with is that they require nesting/indentation. It's common to do a bunch of setup and then do the main work and then do all the corresponding teardown. Even using the with construct, we'd have to write something like this:

with open("input") as r
    with open("output", "w") as w
        # do actual work
    end
end

Another problem with both syntaxes is that they cannot be used inline, making them unhelpful, e.g. for the common problem of wanting to open a file only to pass it to a function and then close it when that call returns (see here for example).

I noticed that the syntax f(x...)! is available so I'm going to throw this out there as syntax for doing y = f(x...) and inserting finalize(y) in at the point where the returned value y goes out of scope. This would allow us to write the above examples as:

r = open("input")!
w = open("output", "w")!
# do actual work

Calls to finalize(w) and finalize(r) are inserted automatically when r and w go out of scope. Actually, it's more than that since the finalize calls are guaranteed no matter how the stack unwinds. You can also use this effectively without binding to a local variable, just passing it to a function call:

write(open("file","w")!, data)

Since the result of the open call goes out of scope after this line, this becomes something like this:

f = open("file","w")
try
    write(f, data)
finally
    finalize(f)
end

So that addresses https://github.com/JuliaLang/julia/pull/14546 in a systematic way without adding any new methods and can eliminate many of the functions littering https://github.com/JuliaLang/julia/issues/14608.
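As a rough illustration of what the two-resource example would have to expand to, here is a hand-written equivalent in current Julia. It uses `close` in place of `finalize`, and the helper name `copy_file` is an assumption for the sketch.

```julia
# Hand-written equivalent of the proposed (hypothetical) syntax
#     r = open(src)!
#     w = open(dst, "w")!
#     write(w, read(r))
function copy_file(src::AbstractString, dst::AbstractString)
    r = open(src)
    try
        w = open(dst, "w")
        try
            write(w, read(r))
        finally
            close(w)          # released first: reverse order of acquisition
        end
    finally
        close(r)
    end
end

src, io = mktemp(); write(io, "data"); close(io)   # scratch input file
dst, io2 = mktemp(); close(io2)                    # scratch output file
copy_file(src, dst)
```

Each additional resource adds one more level of nesting here, which is the indentation cost the proposal is trying to avoid.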

JeffBezanson commented 8 years ago

Wow, I kind of like that idea. It would be excellent if resource-backed objects only needed to implement close and one form of open, and we hardly ever used gc finalizers (or eliminated them entirely).

awf commented 8 years ago

Just for a second, can we imagine inverting the syntax: so v = ...! covers the unusual case where I do want to wait for GC to finalize v? How much existing code depends on that? Which bugs are worse: referencing prematurely-finalized objects, or leaking resources? The former are pretty easy to detect, at least at the cost of the finalizer explicitly putting the object into an invalid state.

JeffBezanson commented 8 years ago

See also #11207

StefanKarpinski commented 8 years ago

@awf you don't want to translate every function call into a try/catch with a corresponding finalize call.

kmsquire commented 8 years ago

Sounds somewhat like golang's defer (also mentioned by @quinnj in #11207), except that defer is more general (and more specific about what exactly is going to happen).

JeffBezanson commented 8 years ago

I think we can have defer and have ! be syntax for defer close(x) since that's what you need 99.9% of the time. The verbosity reduction is huge:

x = open(file); defer close(x)
write(x, data)

vs

write(open(file)!, data)

kmsquire commented 8 years ago

Nice. +1

tkelman commented 8 years ago

I like the idea, but not the syntax. Sigils for resource management start getting into Rust levels of "what is going on" for newcomers unfamiliar with what a trailing ! might mean. Can this be done with a macro or function/type wrapper? If we want deterministic destruction, the question of when it occurs should be easy to answer at a glance. It can be hard enough as is to explain how Julia's scoping works sometimes.

StefanKarpinski commented 8 years ago

@JeffBezanson: shouldn't we be having this call finalize(x) rather than close(x)? Or do you feel that close is general enough as a concept to be considered the generic name for "finalize me"? It feels kind of I/O-specific to me. Keep in mind that we can just define finalize(io::IO) = close(io) in Base and then all you ever need to define for any IO object is open and close.
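A small sketch of that dispatch idea, written without touching Base: `cleanup` is a hypothetical generic function standing in for whatever a `!` or `defer` feature would call on scope exit, and `Handle` is a made-up non-I/O resource.

```julia
# Hypothetical generic entry point for "release this now".
cleanup(x) = finalize(x)        # fallback: run any registered finalizers immediately
cleanup(io::IO) = close(io)     # I/O objects only need to define `close`

# Hypothetical non-I/O resource that relies on a finalizer.
mutable struct Handle
    released::Bool
    function Handle()
        h = new(false)
        finalizer(x -> (x.released = true), h)
        return h
    end
end

path, io = mktemp()             # scratch file for the demo
h = Handle()
cleanup(io)
cleanup(h)
```

With a fallback like this, I/O types would define only `open` and `close`, while other types could keep using finalizers and still be released early.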

eschnett commented 8 years ago

Next step is an RC{T} class that wraps objects of type T, adding reference counting. This would be quite useful if the resource might be stored in a data structure or might be passed to another thread, but we still want early finalization. I'm thinking e.g. of large arrays or matrices for numerical calculations.

Although the implementation will be quite different, I hope that the syntax to use it can be made similar to the one discussed here. A macro might work:

dothework(@fin open("file"))
dothework(@rc open("file"))

StefanKarpinski commented 8 years ago

@tkelman: Given the potential ubiquity of this, a very lightweight syntax is essential. Since you don't like the syntax, please propose alternatives. This cannot be done with a function since it interacts with the surrounding syntax; if we had defer then it could be done with a macro generating defer. I very much like that f(x)! almost looks like f(x) since that's what you would write if you just let GC finalize x. Inserting a macro call would make this a lot less pleasant.

JeffBezanson commented 8 years ago

I kind of like the idea of combining finalize and close into one function, but it's no big deal.

> Next step is an RC{T} class that wraps objects of type T, adding reference counting

-100. I would be amazed if there is any reasonable way to make that work. The right way to handle this case is to have the compiler insert speculative early-free calls.

tkelman commented 8 years ago

I don't mind the indentation of the do block form. I think readable and intuitive syntax should trump saving keystrokes especially for subtle sources of bugs like resource management. Defer with a macro would be easier to explain the rules for than a sigil handled at the parser level.

StefanKarpinski commented 8 years ago

I've watched a lot of people write code like this over and over:

f = open("file")
# do work
close(f)

I cannot get them to use the do-block form, even though I've explained many times that it's the right way to express this since it prevents unclosed file handles in case of exceptions. Of course, the file handles do eventually get closed on gc, so it's not dire, but in general, if people want to do something one way, it's better to make the thing they want to do work right rather than lecturing them about how some other way to do it is better. I doubt we'd have more luck with getting people to use the with form than the do block form. But I'm pretty optimistic that I could get the same people to just write f = open("file")! and omit the close(f) entirely. In fact, I think people would love this (it's way easier and requires less code), and I don't think that explaining what the ! means would be hard at all.

Longer term, this syntax would entirely eliminate the need for having do-block versions of all functions to do cleanup. That's a big win. But the real question is whether it's important enough to have its own syntax. I would argue that the pattern of doing setup, then doing work, then doing some cleanup (regardless of how the stack unwinds) is ubiquitous. It's also annoying to get right without syntactic support and as above, people usually just don't bother. So in my view, having syntactic support for doing this pattern correctly is a no-brainer. Whether the f(x)! syntax is the best one or not is another question, but I can't think of anything better. It makes some mnemonic sense too:

Makes sense to me. I suspect it will make sense to other people too and not be terribly hard to explain.

tkelman commented 8 years ago

f!(x) isn't syntax, it's a naming convention. I'd really prefer a named macro for this (@cleanup maybe?), otherwise I'm seeing myself having a hard time explaining to people who aren't familiar with manual resource management why exclamation points sometimes mean "modifies an input" and sometimes mean "cleans up when value goes out of scope." A single character is going to be easy to overlook when quickly reading or trying to debug library code.

Regardless of the syntax I do think it's worth prototyping the pieces of what the implementation would require.

edit: and now a prefix unary ! will start being used more frequently for logical / function negation, giving yet another meaning exclamation points might have

StefanKarpinski commented 8 years ago

> f!(x) isn't syntax, it's a naming convention.

Is it? Somehow I'd missed that.

StefanKarpinski commented 8 years ago

We can go ahead with the general defer part of this and then play with syntax for auto-finalization.

femtotrader commented 8 years ago

A nice usage of "with" can be found in the Python Yattag library: http://www.yattag.org/. A similar Julia library would be a great use case; see https://groups.google.com/forum/#!topic/julia-users/leNMURKreZo

StefanKarpinski commented 8 years ago

I think that use can already be handled as well or better with the existing do block syntax.

stustd commented 7 years ago

Why doesn't the following work?

redirect_stdout() do r
    show("Hello")
    # close(r[2])
    hello = readavailable(r[1])
    # close(r[1])
end

vtjnash commented 7 years ago

because (a) there's nothing "available" on the socket until you close it (b) you can't close it until you've read what's available on it

you'll probably also get a deadlock on show sometimes too, since it (the operating system) also can't write to the socket until you've started reading from it.

please use the discourse forum to ask questions, rather than hijacking issue threads on github

adambrewster commented 7 years ago

I went looking for the community's current position on finalizers and resource cleanup, and I found this.

The state seems to be that there are many options, and none of them are great. Package authors seem to solve the problem in many ways. I've seen:

A future language feature, with, might provide for finalizers to be called sooner. defer may also be added to allow the user to schedule finalizers at resource construction time. I don't see PRs for either of these solutions.

While this issue is a minor nuisance, it may reduce the quality of the code that is available in the package repository.

Here's my strawman proposal: https://github.com/adambrewster/Defer.jl/blob/master/src/Defer.jl.

Assistance from the compiler to generate the scopes automatically would be nice, but this gets somewhat close. It's also a similar syntax to what would be used when with or defer are implemented.

Thoughts?

StefanKarpinski commented 7 years ago

I think we should introduce this feature in 1.0 – getting it in 0.6 would have been nice, but there's only so much time. There's still a bit of controversy about the exact syntax, but we'll get there.

StefanKarpinski commented 7 years ago

A good example using my proposed syntax here is eachline:

for line in eachline("file.txt")!
    # do stuff with line
end # io object closed reliably on exit of for loop

Without the ! this works but the file remains open until it is GC'd. Using a do block or defer with a name is much more awkward:

open("file.txt") do io
    for line in eachline(io)!
        # do stuff with line
    end
end

io = open("file.txt") defer close(io)
for line in eachline(io)!
    # do stuff with line
end
# io closed when the enclosing scope is left

adambrewster commented 7 years ago

Happy to see this likely to make it into 1.0.

This version works without defining close(::EachLine):

@scope for line in eachline(@! open("file.txt"))
  # do stuff
end

Of course this might, too, depending on when the ! decides to close things.

for line in eachline(open("file.txt")!)
  # do stuff
end

yeesian commented 7 years ago

Going back to https://github.com/JuliaLang/julia/issues/7721#issuecomment-171345256, I have played around with the do block pattern, and it does get clunky very fast, see e.g. [1] or [2].

Also, it makes it difficult to step through the code at the REPL in an interactive session. But it does make it clear when the destroy()/close() methods will get called, which is useful in avoiding gotchas when handling complex interactions with remote resources.

defer statements improve the situation, but when there are lots of methods in the mix, it can be annoying to keep track of the specific methods to be deferred, and it'll be nice for a library to register them in advance for users.

StefanKarpinski commented 7 years ago

I think that having broad enough scopes and doing destruction in guaranteed reverse order should be sufficient. By broad enough scopes, I mean that we choose where a defer's scope ends carefully, so that it will be uncommon to want an object to outlive that scope. The biggest consideration should be looping behavior, which is the main reason you need to make sure a resource is finalized: if you open a file every time through a for or while loop, you will want to close it before the next loop iteration because if there are a lot of iterations, you'll need those resources. Of course, any resources should also be finalized before a function returns, partly because that just seems sensible, but also because the function may be called in a loop or recursively.
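The reverse-order and per-iteration behavior described here can be approximated in current Julia with a small cleanup stack. This is only a sketch: `with_defers` and `defer!` are hypothetical names, and the scope boundary is an explicit do block rather than something the compiler chooses.

```julia
# A scope owns a stack of cleanup closures and runs them last-in, first-out.
function with_defers(body::Function)
    stack = Function[]
    try
        return body(stack)
    finally
        while !isempty(stack)
            pop!(stack)()        # most recently deferred cleanup runs first
        end
    end
end

defer!(stack::Vector{Function}, cleanup::Function) = push!(stack, cleanup)

order = String[]
for i in 1:2                     # cleanups finish before the next iteration starts
    with_defers() do defers
        defer!(defers, () -> push!(order, "close r$i"))
        defer!(defers, () -> push!(order, "close w$i"))
        push!(order, "work $i")
    end
end
```

One simplification to be aware of: if a cleanup closure throws, the remaining ones in that scope are skipped, which a real language feature would presumably handle more carefully.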

rapus95 commented 6 years ago

This one looks very neat, so I'm curious why it's not tagged for 1.0 or even 0.7 in order to keep track of it? (I'm pretty sure even those great coders among you can't remember every issue they've ever read) ;)

KristofferC commented 6 years ago

Because it is a feature and thus not a release blocker. The milestones are not for arbitrary tagging of stuff to remember them.

StefanKarpinski commented 6 years ago

As @KristofferC said, this is a non-breaking change and can be implemented in any 1.x release.

davidavdav commented 5 years ago

Hello, I wonder what the status of the review of the open() do end alternatives is.

I am OK with the current use, but I often run into cases where it would be nice to have a different solution:

StefanKarpinski commented 5 years ago

It would be great to have this functionality soon if anyone feels like taking a crack at implementing it.

adambrewster commented 5 years ago

I have updated my attempt at https://github.com/adambrewster/Defer.jl to be compatible with julia v1.0.

It's not ideal to do this with a package, but it does present an opportunity to iterate a bit before committing to a language feature.

stemann commented 5 years ago

Regarding the slightly off-topic reference counting of e.g. large arrays with deterministic disposal, RC{T}, mentioned by @eschnett: this need also came up in our development recently in the context of multiple cooperating tasks, with one large-array-producing task and multiple consumers. We wrapped up the reference counting in a (quite basic) package: https://github.com/IHPSystems/ResourcePools.jl
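For readers unfamiliar with the idea, a reference-counted wrapper can be sketched roughly as follows. `RefCounted`, `retain!` and `release!` are hypothetical names chosen for this illustration; this is not the ResourcePools.jl API.

```julia
# Wraps a value and disposes of it when the last reference is released.
mutable struct RefCounted{T}
    value::T
    count::Int
    dispose::Function            # called once, when the count drops to zero
    guard::ReentrantLock         # protects `count` when tasks share the wrapper
end

RefCounted(value, dispose::Function) = RefCounted(value, 1, dispose, ReentrantLock())

function retain!(rc::RefCounted)
    lock(rc.guard) do
        rc.count > 0 || error("resource already disposed")
        rc.count += 1
    end
    return rc
end

function release!(rc::RefCounted)
    lock(rc.guard) do
        rc.count > 0 || error("resource already disposed")
        rc.count -= 1
        rc.count == 0 && rc.dispose(rc.value)
    end
    return nothing
end

disposed = Ref(false)
rc = RefCounted(zeros(1000), _ -> (disposed[] = true))   # producer holds one reference
retain!(rc)                      # a consumer takes a second reference
release!(rc)                     # producer is done
still_alive = !disposed[]
release!(rc)                     # last consumer is done, `dispose` runs
```

Every `retain!` has to be paired with exactly one `release!` by hand, which is the bookkeeping that scope-based syntax would ideally automate.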

aminya commented 4 years ago

Transferred from https://github.com/JuliaLang/julia/issues/35815:

For performance-critical and real-time applications such as Control Systems, Robotics, Automotive, Audio VST, etc, having a deterministic memory management approach is necessary.

I want to propose an optional memory management feature that allows using Julia without garbage collection. Users can disable the garbage collection and just use this system. Otherwise, GC can help in all of these cases. This should be an optional feature that is added on top of the current behavior and should be fully backward compatible.

This is just an initial idea, so let me know your suggestions.

Named Scope:

Var{scope_name}(definition) and Scope{scope_name}(code):

The variables defined using Var{scope_name}(definition) should only exist in that scope. For example,

function fun()

    Scope{:Foo} begin

        # this keeps x in the memory only inside the :Foo scope:
        x = Var{:Foo}( rand(3) )

    end
    # x will be deleted once the :Foo scope is finished

    return nothing
end

fun()

The scope name allows us to let a variable escape other scopes with different names.

function fun2()

    Scope{:Foo} begin

        Scope{:Bar} begin
            x = Var{:Foo}( rand(3) )
            y = Var{:Bar}( rand(4) )
        end
        # `x` will escape this scope
        # `y` will be removed

    end
    # `x` will be removed here

    return nothing
end

fun2()
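The intended lifetime rules can be approximated in current Julia to check one's understanding of them. The sketch below is an emulation under stated assumptions, not the proposed feature: `scope`, `var` and `Tracked` are made-up names, "removed" is modelled as running the object's finalizer early, and the global frame stack is only safe for a single task.

```julia
const FRAMES = Pair{Symbol,Vector{Any}}[]      # stack of currently open named scopes

function scope(body::Function, name::Symbol)
    push!(FRAMES, name => Any[])
    try
        return body()
    finally
        _, owned = pop!(FRAMES)
        foreach(finalize, reverse(owned))       # run finalizers now instead of at GC time
    end
end

function var(name::Symbol, x)
    # attach `x` to the innermost open scope with a matching name
    for i in length(FRAMES):-1:1
        if first(FRAMES[i]) == name
            push!(last(FRAMES[i]), x)
            return x
        end
    end
    error("no enclosing scope named $name")
end

# Test object whose finalizer records when it is "removed".
mutable struct Tracked
    label::String
    Tracked(label, events) = finalizer(t -> push!(events, t.label), new(label))
end

events = String[]
scope(:Foo) do
    scope(:Bar) do
        var(:Foo, Tracked("x", events))
        var(:Bar, Tracked("y", events))
    end
    push!(events, "left Bar")    # y is gone, x has escaped to :Foo
end
```

As in `fun2`, the object tagged `:Bar` is released when the inner scope ends, while the one tagged `:Foo` survives until the outer scope ends.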

One can use Scope{Any}, which means everything inside them will be removed regardless of the variables' scope names.

function fun3()

    Scope{Any} begin
        x = Var{:Foo}( rand(3) )
        y = Var{:Bar}( rand(4) )
    end
    # everything should be removed.

end

Scope{:Foo} and Var{:Foo} should be inside the same module. This means all the scoping information is gone outside a module. But this allows deferring the removal of a variable until the scope is called (either from the top level of the module or from inside another function of the module)

module ModuleA
function fun()
   x = Var{:Foo}( rand(3) )
   return x
end

Scope{:Foo} begin
    xout = fun()
end
# xout will be removed here
end # module ModuleA

Local variables:

When the scope is not specified, it should be considered as local and the variable should be removed once the program exits that local scope (unless returned). Such a variable should escape all the named scopes.

This syntax can be simplified by removing the need for using Var since this is obvious and we don't need an extra Var.

function fun4(z)

    # This keeps x and y in the memory only inside the `fun4` function

    # the simplified version
    y = rand(3)

    # or more explicit
    x = Var( rand(3) )      

    # x and y will be removed here.
    # z is from outside, so it should not be removed.
    return z
end

function fun4()

    Scope{:Foo} begin
        y = rand(3) 
    end
    # y will escape because it does not have a named scope  

    return y
end

You can think of the local scope of a function as Scope{:fun_local}.

So:

Generic programming

Because all of the scoping information is gone outside that module, the caller in another module should take responsibility for the memory management of the returns. So they should specify the scope (or delete things manually).

using SomePackage # exports fun

Scope{:Foo} begin
  # we specify the scope of the return of `fun()`
  myx = Var{:Foo}( fun() )
end

Global variables:

Variables that are defined using global should be deleted manually by the user. This is the only place where we need to call delete directly.

function fun()
    global x = rand(3)
    return x
end

xout = fun()
delete xout

Scope Propagation

We can decide between two options. With propagation addresses the issue when references are passed to other objects.

1) With propagation: scope propagates through operations unless it is explicitly overwritten by the user. The variable returned from a function applied to arguments with different scopes will have the outer scope. When one object is stored in another object with a different scope, the outer scope should be chosen for both. If one of them comes from outside, the other one should live as long as the outer one does.

function fun5()
    Scope{:Foo} begin
        a = Var{:Foo}( rand(3) )
        b = a       # b is also a `Var{:Foo}` now.
        c = a + 1   # c is a `Var{:fun5_local}` now (because 1 was local)

        Scope{:Bar} begin
            # two different scopes
            d = Var{:Bar}( rand(3) )
            e = d .+ a      # e is considered Var{:Foo} 
        end
        # d and e are removed here

    end
    # a, b are removed here
    return nothing
    # c is removed here
end

2) No propagation: scope does not propagate and should be explicitly specified for each new variable. The exception can be the raw assignment (`d=a`). Based on people's feedback, this option may not work in some situations.

function fun5()
    Scope{:Foo} begin
        a = Var{:Foo}( rand(3) )
        b = Var{:Foo}( a + 1 )  # b is a `Var{:Foo}`.
        c = a + 1               # c is not a `Var{:Foo}` (it is local).

        # assignment exception
        d = a                   # d is a `Var{:Foo}`
    end
    # a, b, d are removed here

    # c escapes
    return c
end

Alternative syntax

We can use a macro-like syntax instead, while I prefer the above one (seems cleaner).

for definition @var(scope_name, definition) and for scope @scope(scope_name, code):

@scope :Foo begin
    @var :Foo rand(3)
end

References

Inspired by https://en.cppreference.com/w/cpp/language/raii, https://doc.rust-lang.org/rust-by-example/scope/raii.html, and https://www.beeflang.org/docs/language-guide/memory/

tkf commented 4 years ago

This is interesting! How does it handle higher-order functions and recursions?

function f!(y, n = 3)
    n == 0 && return y
    Scope{:Foo} begin
        x = Var{:Foo}(rand(3))
        y .+= f!(x, n - 1)
    end
    return y  # is `y` alive here?
end

aminya commented 4 years ago

> This is interesting! How does it handle higher-order functions and recursions?

My general idea is that all of this should be decided in the compile-time (that is why I prefer no propagation). So the compiler should be able to detect the scopes and the variables that should be freed. These two cases are detectable by the rules that are set for the compiler.

function f!(y, n = 3)
    n == 0 && return y 
    # y is returned (also coming from outside), so it is not freed.
    # n is from outside so it is not freed

    Scope{:Foo} begin
        x = Var{:Foo}(rand(3))
        y = y .+ f!(x, n-1)     # the resulting y is local (no propagation)
    end
    # x is removed
    # y escapes

    # if we did not return y, it should have been freed
    return y
end

tkf commented 4 years ago

I was looking at "Named Scope" section (rather than "Scope Propagation"):

function fun()
   x = Var{:Foo}( rand(3) )
   return x
end

Scope{:Foo} begin
    xout = fun()
end
# xout will be removed here

From this example, it looks like the caller can specify Scope{:Foo}? Then what happens when the caller and callee are the same function (= recursion)?

Maybe I better ask:

Scope{:Foo} begin
    Scope{:Foo} begin
        ...
    end  # is this `end` a no-op?
end

aminya commented 4 years ago

> From this example, it looks like the caller can specify Scope{:Foo}? Then what happens when the caller and callee are the same function (= recursion)?

I don't think of recursion as a special case. For recursion (similar to all the other situations), each function call should be processed separately.

More generally: Each scope has some entry points and some exit points. Each scope will be treated independently.

Maybe I better ask:

Scope{:Foo} begin
    Scope{:Foo} begin
        ...
    end  # is this `end` a no-op?
end

If someone writes this directly:

Scope{:Foo} begin
    Scope{:Foo} begin
        ...
    end  
    # anything with :Foo scope is removed here
end
# nothing happens

If you want to unwrap the recursion, you should also include each function's local scope: Scope{:fun_local}

Scope{:fun_local} begin
    Scope{:Foo} begin
        ...
    end 
end