trifork / erjang

A JVM-based Erlang VM
http://www.erjang.org
Apache License 2.0

Process state vulnerable to race conditions #75

Open eriksoe opened 10 years ago

eriksoe commented 10 years ago

As it stands, the process state is insufficiently thread-safe. We need a clear set of invariants, and we need to enforce them.

The set of invariants should apparently cover pstate, the exit hooks, and the pid-to-process binding.

One example of a race condition: simultaneous updates of EProc.pstate:

Interleaving: ...,B2,A1,B3 => P1 ends up having terminated normally but with pstate==EXIT_SIG.

Another example: simultaneous updates of EProc.pstate:

Interleaving: ...,A2,B3,A3 => P1 ends up having overlooked the exit signal.
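Both pstate interleavings are lost-update races: one thread overwrites a state the other thread has just written. A minimal sketch of one way to rule that out is to make every transition a compare-and-set from the expected previous state. This is not Erjang's actual EProc code; the class name `ProcStateSketch`, the three-value `State` enum, and both method names are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a process state field; not Erjang's EProc.
final class ProcStateSketch {
    enum State { RUNNING, EXIT_SIG, DONE }

    private final AtomicReference<State> pstate =
            new AtomicReference<>(State.RUNNING);

    /** Called by another thread delivering an exit signal.
     *  Returns false if the process has already left RUNNING. */
    boolean signalExit() {
        return pstate.compareAndSet(State.RUNNING, State.EXIT_SIG);
    }

    /** Called by the process itself on normal termination.
     *  Returns false if an exit signal was recorded first. */
    boolean finishNormally() {
        return pstate.compareAndSet(State.RUNNING, State.DONE);
    }

    State state() {
        return pstate.get();
    }
}
```

With this shape, exactly one of the two competing transitions wins, and the loser learns about it from the return value instead of silently overwriting the winner.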

Example involving exit hooks:

Interleaving: ...,A2,...,B1,...,A4,...,B3 => Exit hook is never called.
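The exit-hook case is a check-then-act race: a hook is registered after the exiting process has already collected the hooks it is going to run. A sketch of one possible invariant, "a hook is either in the list when the process exits, or is told the process is gone", is shown below. The class `ExitHookSketch` and its methods are hypothetical and not taken from Erjang; running a late hook immediately is also just one policy (reporting failure to the caller would be another).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Erjang's exit-hook implementation.
final class ExitHookSketch {
    private final Object lock = new Object();
    // Becomes null once the process has exited.
    private List<Runnable> hooks = new ArrayList<>();

    /** Registers a hook, or runs it at once if the process already exited. */
    void addExitHook(Runnable hook) {
        synchronized (lock) {
            if (hooks != null) {
                hooks.add(hook);
                return;
            }
        }
        hook.run(); // process already gone: run outside the lock
    }

    /** Marks the process as exited and runs each registered hook once. */
    void exit() {
        List<Runnable> toRun;
        synchronized (lock) {
            if (hooks == null) {
                return; // already exited
            }
            toRun = hooks;
            hooks = null; // later registrations see "exited"
        }
        for (Runnable h : toRun) {
            h.run();
        }
    }
}
```

The "has it exited?" check and the list update happen under the same lock, so no registration can slip in between the exit decision and the hook run.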

eriksoe commented 10 years ago

Demo of the exit-hook problem:

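%% Returns the ETS tables that leaked while running N give-away attempts.
flood(N) ->
    Before = ets:all(),
    flood_loop(N),
    timer:sleep(1000),  % give the spawned processes time to exit
    After = ets:all(),
    After -- Before.

flood_loop(0) -> ok;
flood_loop(N) when N>0 ->
    %% The spawned process exits immediately, so give_away races with its exit.
    Pid = spawn(fun() -> ok end),
    %Tab = ets:new(foo, [{heir, Pid, here_you_are}]),
    Tab = ets:new(foo, []),
    %% If Pid is already dead, give_away raises badarg and we delete the table ourselves.
    try ets:give_away(Tab, Pid, here_you_are)
    catch _:badarg -> ets:delete(Tab)
    end,
    flood_loop(N-1).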

flood(1000) returns [] on Erlang, as expected, but (often) a non-empty list on Erjang.

(As a bonus, the code triggers this race bug: java.lang.NullPointerException at erjang.EInternalPID.is_alive(EInternalPID.java), in the line return task != null && task.is_alive(); which fails when another thread sets task to null between the null check and the call.)
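The NullPointerException comes from reading the task field twice. A common remedy, sketched here under assumptions, is to read the field once into a local variable and test that. Only the is_alive() expression comes from the report above; the class `PidSketch`, the `Task` interface, the `done()` method, and the volatile modifier are illustrative and not the actual EInternalPID source.

```java
// Hypothetical sketch of the single-read idiom; not the real EInternalPID.
final class PidSketch {
    interface Task {
        boolean is_alive();
    }

    // May be set to null by another thread when the process is done.
    private volatile Task task;

    PidSketch(Task task) {
        this.task = task;
    }

    /** Clears the binding, as another thread might do concurrently. */
    void done() {
        task = null;
    }

    boolean is_alive() {
        Task t = task; // read the field exactly once
        return t != null && t.is_alive();
    }
}
```

Because the null check and the call use the same local snapshot, a concurrent done() can no longer turn the second read into null.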

eriksoe commented 10 years ago

Process state rework is now done - it is in branch 'process-lifecycle-consistency', for the time being, pending review.

krestenkrab commented 10 years ago

Is this the fix in #77?