sfmatt closed this issue 10 years ago
My apologies for the false alarm. For some reason the opam-installed version of parmap was shadowed by an older version in which the problem was not yet corrected.
Also, regarding issue #18 (Fatal error: exception Failure("input_value_from_block: bad object")): it is caused by one of the child processes aborting abruptly. In my case, an unexpected nan value in a complex computation caused the process to abort without a warning or stack trace.
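One way to make such a failure visible is to check the result for nan inside the worker function and raise an ordinary exception, since parmap reports worker exceptions with an explicit message (as shown later in this thread). The following is only a sketch: check_not_nan and step are hypothetical names, and sqrt stands in for the real computation.

```ocaml
(* Hypothetical helper: fail loudly instead of letting a nan propagate. *)
let check_not_nan ~(where : string) (y : float) : float =
  match classify_float y with
  | FP_nan -> failwith (Printf.sprintf "%s: result is nan" where)
  | FP_normal | FP_subnormal | FP_zero | FP_infinite -> y

(* Stand-in for the real worker function; sqrt of a negative float is nan. *)
let step (x : float) : float = check_not_nan ~where:"step" (sqrt x)

let () =
  Printf.printf "%g\n" (step 4.0);
  match step (-1.0) with
  | y -> Printf.printf "%g\n" y
  | exception Failure msg -> Printf.printf "caught: %s\n" msg
```

With a guard like this, a bad input would surface as a Failure raised in the worker rather than as a silent abort.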
Dear Matt, thanks for auto-fixing this :-)
Let me remark here that, up to now, parmap does not gracefully handle the abnormal termination of one of the workers, as in #18.
It would require some work to make exceptions in the workers surface as exceptions in the main program. We have not done this yet, but contributions are welcome.
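Until something like that exists in the library, a user-side workaround can be sketched as follows: wrap the worker function so that an exception is returned as data, then re-raise it in the main program. The names guard and collect are hypothetical, and List.map stands in for the parallel map so that the sketch runs without parmap. The exception is converted to a string because marshalled exception values generally cannot be pattern-matched reliably after crossing a process boundary.

```ocaml
(* Hypothetical wrapper: turn any exception raised by [f] into an [Error]. *)
let guard (f : 'a -> 'b) (x : 'a) : ('b, string) result =
  try Ok (f x) with e -> Error (Printexc.to_string e)

(* In the main program: unwrap the results, raising on the first failure. *)
let collect (rs : ('b, string) result list) : 'b list =
  List.map
    (function Ok y -> y | Error msg -> failwith ("worker failed: " ^ msg))
    rs

let () =
  let f x = if x = 2 then failwith "FAILED" else x * 10 in
  (* List.map is a stand-in; with parmap one would pass [guard f] instead of [f]. *)
  let results = List.map (guard f) [1; 2; 3] in
  match collect results with
  | ys -> List.iter (Printf.printf "%d\n") ys
  | exception Failure msg -> print_endline msg
```

Note that this only covers exceptions. A worker that calls exit or is killed never returns from f, so guard cannot report it.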
Alas, Roberto, I'm a decent debugger but unfortunately a poor programmer. The best I can do to contribute is to give you two simple programs that reproduce the exceptions in the latest parmap version:
Fatal error: exception End_of_file:

    let l = [1;2;3]
    let f x = exit 0
    let l' = Parmap.(parmap ~ncores:2 ~chunksize:1 f (L l))
    let () = List.hd l' |> print_int

Fatal error: exception Failure("input_value_from_block: bad object"):

    let l = [1;2]
    let f x = exit 0
    let l' = Parmap.(parmap ~ncores:2 ~chunksize:1 f (L l))
    let () = List.hd l' |> print_int
As you can see, there are no exceptions involved in the workers, only an exit call. In both examples, if we replace let f x = exit 0 with let f x = failwith "FAILED", we get an explicit error message:

    [Parmap]: error at index j=0 in (0,0), chunksize=1 of a total of 1 got exception Failure("FAILED") on core 0
    [Parmap]: error at index j=0 in (1,1), chunksize=1 of a total of 1 got exception Failure("FAILED") on core 1
    [Parmap]: aborting due to exception on core 0: Failure("FAILED")
IMHO parmap already deals perfectly well with exceptions in the workers. Perhaps the same (or very similar) error messages could be used in the exit 0 scenario(s)?
Thank you again for parmap!
Matt
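To illustrate the exit 0 scenario in isolation, here is a minimal sketch of how a parent process can notice that a worker ended without delivering a result and report it explicitly. It is not parmap's implementation: it uses a plain pipe and the standard Unix and Marshal modules, run_in_child is a hypothetical name, and the program must be linked with the unix library (for example via ocamlfind with -package unix).

```ocaml
(* Run [f x] in a forked child and read the marshalled result from a pipe.
   Returns [Error msg] if the child ends without delivering a usable result. *)
let run_in_child (f : 'a -> 'b) (x : 'a) : ('b, string) result =
  let rd, wr = Unix.pipe () in
  (* Flush first so buffered output is not duplicated in the child. *)
  flush stdout;
  flush stderr;
  match Unix.fork () with
  | 0 ->
    (* Child: compute, write the result, and always terminate here. *)
    Unix.close rd;
    let code =
      try
        let oc = Unix.out_channel_of_descr wr in
        Marshal.to_channel oc (f x) [];
        close_out oc;
        0
      with _ -> 2
    in
    exit code
  | pid ->
    (* Parent: an empty or truncated pipe means the worker died early. *)
    Unix.close wr;
    let ic = Unix.in_channel_of_descr rd in
    let payload =
      try Some (Marshal.from_channel ic) with End_of_file | Failure _ -> None
    in
    close_in ic;
    let _, status = Unix.waitpid [] pid in
    (match payload, status with
     | Some y, Unix.WEXITED 0 -> Ok y
     | _, Unix.WEXITED c ->
       Error (Printf.sprintf "worker %d exited with code %d without a usable result" pid c)
     | _, Unix.WSIGNALED s ->
       Error (Printf.sprintf "worker %d was killed by signal %d" pid s)
     | _, Unix.WSTOPPED s ->
       Error (Printf.sprintf "worker %d was stopped by signal %d" pid s))

let () =
  (match run_in_child (fun x -> x * 2) 21 with
   | Ok v -> Printf.printf "ok: %d\n" v
   | Error e -> Printf.printf "error: %s\n" e);
  (* Same situation as the reproduction programs above: the worker calls exit 0. *)
  match run_in_child (fun () -> exit 0) () with
  | Ok () -> print_endline "ok"
  | Error e -> Printf.printf "error: %s\n" e
```

The point of the design is that End_of_file is caught where the result is read and combined with the child's exit status, so the caller gets a message that names the worker instead of a bare unmarshalling error.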
Hi Roberto,
Thanks a lot for parmap, which I was using successfully until recently. With v1.0-rc5, array_float_parmap raises the above exception for a source array of ~90K elements, even with ncores = 1. There does not seem to be any memory issue, as around half of the computer's memory is free when the exception is raised. This is on 64-bit Ubuntu 14.04, by the way.
Matt