tcoopman opened this issue 5 years ago
It happens to me too, but without any `exit`, just by making a process send a message to another process:

```reason
open Reactor.System;

Reactor.Node.(Policy.default() |> setup);

let printer = spawn((ctx, _) => `Become(0), 0);

let _ =
  spawn(
    (ctx, state) => {
      printer <- "hello";
      `Become(state);
    },
    0,
  );

Reactor.Node.run();
```
It also looks like there is some kind of race condition here. I'm just running the following code:
```reason
Fmt_tty.setup_std_outputs();
Logs.set_level(Some(Logs.Debug));
Logs.set_reporter(Logs_fmt.reporter());

open Reactor.System;

Reactor.Node.(Policy.default() |> setup);

let _ = spawn((ctx, state) => exit(), 0);

Reactor.Node.run();
```
There are times it exits successfully, and the tail looks like:
```
TestFrameworkApp.exe: [DEBUG] [26298] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26299] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26297] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26295] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26298] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26299] Handling tasks...
TestFrameworkApp.exe: [DEBUG] [26295] Handling tasks...
TestFrameworkApp.exe: [DEBUG] [26297] Handling tasks...
TestFrameworkApp.exe: [DEBUG] [26298] Handling tasks...
TestFrameworkApp.exe: [INFO] [26291] Node shutting down...
```
However, when it fails, it looks something like:
```
TestFrameworkApp.exe: [DEBUG] [26198] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26199] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26198] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26199] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26198] Handling tasks...
TestFrameworkApp.exe: [DEBUG] [26192] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26199] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26192] Receiving tasks...
TestFrameworkApp.exe: [DEBUG] [26199] Handling tasks...
TestFrameworkApp.exe: [DEBUG] [26192] Handling tasks...
TestFrameworkApp.exe: [INFO] [26199] Scheduler shutting down...
TestFrameworkApp.exe: [INFO] [26189] Node shutting down...
TestFrameworkApp.exe: [INFO] [26201] Beginning scheduler loop...
TestFrameworkApp.exe: [DEBUG] [26201] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26201] Receiving tasks...
TestFrameworkApp.exe: [ERROR] [26201] Uncaught exception in scheduler: Failure("Marshal.data_size: bad object")
TestFrameworkApp.exe: [INFO] [26200] Beginning scheduler loop...
TestFrameworkApp.exe: [DEBUG] [26200] Tasks queue has 0 tasks
TestFrameworkApp.exe: [DEBUG] [26200] Receiving tasks...
TestFrameworkApp.exe: [ERROR] [26200] Uncaught exception in scheduler: Failure("Marshal.data_size: bad object")
```
So it looks like after the `Node shutting down` message the scheduler keeps listening for messages; maybe this is the problem?

Maybe that's related to `Process.kill`? https://github.com/ostera/reactor/blob/9f3f7486e142150828ff20e1ec79e367dbaa8adb/src/platform/process.re#L65

I guess that sending a SIGKILL doesn't wait for the process to actually die. Maybe we should wait for that?
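Something along these lines is what I have in mind; just a minimal sketch with plain `Unix` calls, not reactor's actual `Process.kill`, and `pid` is only a placeholder for the scheduler's OS pid:

```reason
/* Sketch: send SIGKILL, then block until the child has actually been
   reaped, so the caller can't race ahead while the process is still dying.
   Assumes the killed process is a direct (fork'd) child, which I believe
   is the case for reactor's schedulers. */
let kill_and_wait = (pid: int) => {
  Unix.kill(pid, Sys.sigkill);
  let (_pid, _status) = Unix.waitpid([], pid);
  ();
};
```

Not sure if blocking is acceptable at the call site, but it would at least rule this hypothesis out.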
@Schniz good idea. I believe the bug may be related to the pipe used across processes being removed after one of them dies; the other one then can't read a full command from it and blows up.
I'm looking into fixes in this PR: https://github.com/ostera/reactor/pull/17
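For anyone who wants to poke at the failure mode outside of reactor, here's a minimal sketch of the kind of defensive read I have in mind; this is not reactor's actual read path, and `command` / `read_command` are made up for the example:

```reason
/* Sketch: read a marshalled value from a pipe whose writer may have died
   mid-write. Marshal.from_channel raises End_of_file when the pipe is
   closed and Failure(_) when the bytes aren't a valid marshalled value,
   so both cases are treated as "the peer is gone" instead of crashing. */
type command = string; /* placeholder for the real command type */

let read_command = (ic: in_channel): option(command) =>
  switch ((Marshal.from_channel(ic): command)) {
  | cmd => Some(cmd)
  | exception End_of_file => None
  | exception Failure(_) => None
  };
```

Wrapping the reader this way at least keeps a dead peer from taking the whole scheduler down with an uncaught exception.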
@tcoopman @Schniz could you help me out by testing the current master? I merged #17 and it's looking good on my end, but I want to verify whether this issue persists.
Sometimes at the end of a run (I guess after an `exit`) this message appears on the console: `Fatal error: exception Failure("Marshal.data_size: bad object")`. This doesn't happen every time.

OS: Linux/Archlinux.