JuliaLang / julia

Redesign remote exception handling, collection and display #20277

Open amitmurthy opened 7 years ago

amitmurthy commented 7 years ago

Currently we have 3 different exception container types, i.e., types that wrap a basic exception.

  • RemoteException - remotecalls use this to differentiate between exceptions to be rethrown at the caller and those that are to be returned as data.
  • CapturedException - captures a backtrace for transport across process boundaries.
  • CompositeException - collects multiple exceptions caused by a single call, for example an @everywhere where one or more tasks may throw exceptions.

Multiple tasks are involved in a single remotecall.

Exceptions may be raised in any of these tasks. Currently we collect the stack trace, and possibly a different exception, from each task and throw them all to the caller.

In most cases, the current implementation produces Matryoshka-doll-style exceptions, nested inside the various exception container types.

For example, one of the test cases needs to extract the root exception with a statement like ex.captured.ex.exceptions[2].ex

This makes proper exception handling in the context of remote calls quite difficult.
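To make the nesting concrete, here is a minimal sketch that builds by hand a value with the same shape as the expression quoted above and then flattens it. The helper `root_exceptions` is hypothetical (it is not part of Base or Distributed), and the sketch assumes the field names `captured`, `ex` and `exceptions` that appear in that expression.

```julia
using Distributed: RemoteException

# Hypothetical helper, not an existing API: peel off the container types
# and return the underlying exceptions as a flat vector.
root_exceptions(ex) = Any[ex]
root_exceptions(ex::RemoteException) = root_exceptions(ex.captured)
root_exceptions(ex::CapturedException) = root_exceptions(ex.ex)
function root_exceptions(ex::CompositeException)
    roots = Any[]
    for inner in ex.exceptions
        append!(roots, root_exceptions(inner))
    end
    return roots
end

# Hand-built stand-in for what a failing remote call might deliver.
# The empty Any[] vectors stand in for already-processed backtraces.
nested = RemoteException(1, CapturedException(
    CompositeException(Any[
        CapturedException(ErrorException("Foo"), Any[]),
        CapturedException(ErrorException("Bar"), Any[]),
    ]),
    Any[]))

# The manual path from the test case and the flattened view agree.
manual = nested.captured.ex.exceptions[2].ex
roots = root_exceptions(nested)
```

A caller using such a helper would not need to know how many container layers a particular call path happened to add, which is the knowledge the manual path bakes in.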

What is required is design work on a sane propagation mechanism for remote exceptions, followed by its implementation.

StefanKarpinski commented 7 years ago

@amitmurthy, it seems like the distributed stuff will still need a fair bit of potentially-breaking design work post-1.0. Would you be willing to take a crack at extracting it from Base so that we can have it as a standard package, which would allow us to continue to iterate on it after 1.0 is released?

amitmurthy commented 7 years ago

I'll respond with a longer, detailed comment later, but the short answer is that I feel we should not move it out of Base just yet, at least not until we ship a set of standard packages along with the Base distribution.

StefanKarpinski commented 7 years ago

We're definitely going to be shipping a set of standard packages with Julia 1.0 – this would be one.

amitmurthy commented 7 years ago

Off the top of my head, these are the work items needed to keep the interface similar to what we currently have.

  1. Command line argument processing.
    • Move the -L, -p, --machinefile and --worker options to the Distributed package
      • We may need command line support in Base to
        • specify list of packages to be loaded on startup
        • specify and pass command line arguments to a module
      • Erlang/OTP has something similar to this. See http://erlang.org/doc/man/erl.html (options -Application Par Val and -run Mod [Func [Arg1, Arg2, ...]]).
  2. This will also trigger a reworking of the ClusterManager interface
  3. Use of unexported functions from Base
    • For each such instance
      • decide if it is worthwhile to export it or
      • check if there is an alternative or
      • use it with an explicit Base. qualifier
  4. Move SharedArray into its own package
  5. It would be good to have minimal support for standard packages before we start moving things out.
    • How are they organized, built and distributed?
  6. Testing
    • Tests would need to load Distributed explicitly
    • Separate testing infrastructure for standard packages

StefanKarpinski commented 7 years ago

Move the -L, -p, --machinefile and --worker options to the Distributed package

  • We may need command line support in Base to
    • specify list of packages to be loaded on startup
    • specify and pass command line arguments to a module

Relevant to this is https://github.com/JuliaLang/julia/issues/20293 since it would need support for a similar ability to load packages and pass them options from the command line.

JeffBezanson commented 6 years ago

No activity in 5 months and moved to stdlib, so removing from milestone.

samoconnor commented 6 years ago

I wonder if there is some way that the @sync API could rethrow all the exceptions instead of rethrowing a single CompositeException. This implies having multiple live exceptions "in the air" at the same time. In the absence of catch statements, the top-level handler (or the REPL) would log/print each error message in turn. Perhaps catch statements could simply be executed multiple times in the presence of multiple exceptions. e.g.

julia> @sync begin
           @async error("Foo")
           @async error("Bar")
       end
ERROR: Foo
error at ...

ERROR: Bar
error at ...

julia> try @sync begin
           @async error("Foo")
           @async error("Bar")
       end catch e
            @show e
            7
       end
e = ErrorException("Foo")
e = ErrorException("Bar")
(7, 7)

julia> try @sync begin
           @async error("Foo")
           @async error("Bar")
       end catch e
            @show e
            if e.msg == "Foo"
               rethrow(e)
            end
            7
       end
e = ErrorException("Foo")
ERROR: Foo

julia> try @sync begin
           @async error("Foo")
           @async error("Bar")
       end catch e
            @show e
            if e.msg == "Bar"
               rethrow(e)
            end
            7
       end
e = ErrorException("Foo")
e = ErrorException("Bar")
ERROR: Bar

The implementation could be to keep the CompositeException collection mechanism as is, but handle it specially in catch blocks.
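For comparison, the closest equivalent with the existing mechanism is to catch the CompositeException once and loop over it inside the catch block. The following is only a sketch of that manual pattern, not of the proposed semantics. The function name `handle_each` is made up for illustration, and the sketch assumes Julia 1.3 or later, where each collected entry is a TaskFailedException and the failed task keeps its exception in the internal `result` field.

```julia
# Sketch: handle every collected error by hand with the current behaviour.
function handle_each()
    seen = String[]
    result = try
        @sync begin
            @async error("Foo")
            @async error("Bar")
        end
    catch err
        # Anything other than the collected errors is passed on unchanged.
        err isa CompositeException || rethrow()
        for wrapped in err
            # Assumption: entries are TaskFailedException wrappers and the
            # failed Task stores its exception in the internal `result` field.
            e = wrapped isa TaskFailedException ? wrapped.task.result : wrapped
            push!(seen, e.msg)
        end
        7
    end
    return seen, result
end

seen, result = handle_each()
```

Here the catch body runs once and yields a single value, whereas under the proposal it would run once per exception and yield one value each time.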

samoconnor commented 6 years ago

Another option would be to have an optional on_error::Function argument to @sync, used with the do ... syntax, and called for each error in turn.

julia> @sync begin
           @async error("Foo")
           @async error("Bar")
       end do e
            @show e
            if e.msg == "Bar"
               rethrow(e)
            end
            7
       end
e = ErrorException("Foo")
e = ErrorException("Bar")
ERROR: Bar
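
As far as I can tell, the `@sync ... end do e` form above is not accepted by the parser today, so the idea can only be approximated with an ordinary function that takes the handler as a do block. The sketch below does that. `sync_on_error`, `unwrap` and `failing_pair` are hypothetical names, and unwrapping through the task's internal `result` field is an assumption about Julia 1.3 or later.

```julia
# Hypothetical helpers, not existing APIs.
unwrap(e) = e
unwrap(e::TaskFailedException) = e.task.result  # assumption: internal field

# Run `body`; if @sync collected errors, call `on_error` once per error
# and return the handler results.
function sync_on_error(on_error, body)
    try
        return body()
    catch err
        err isa CompositeException || rethrow()
        return map(e -> on_error(unwrap(e)), err.exceptions)
    end
end

# Two tasks that both fail, as in the examples above.
function failing_pair()
    @sync begin
        @async error("Foo")
        @async error("Bar")
    end
end

handled = String[]
results = sync_on_error(failing_pair) do e
    push!(handled, e.msg)
    7
end
```

A handler that throws would abort the remaining calls in this sketch, which roughly matches the ERROR: Bar outcome shown above when Bar is the last error handled.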