uwiger / gproc

Extended process registry for Erlang
Apache License 2.0
1.07k stars 232 forks source link

Wait for process to exit and/or name to be available #53

Closed benbro closed 10 years ago

benbro commented 10 years ago

Hi,

In my server each user has a session pid with a unique gproc name - userid. Each userid should have a single session. When a user tries to create a new session, the old pid should be terminated and the.

I couldn't find an easy way to do it with the gproc API. Is it possible to add something like: gproc:add_local_name_when_available(Name) (maybe with a nicer function name:)

I tried to think of a way to do it, and it's hard to get it right. First we need to check if name is not taken. If it's taken, monitor the pid and wait until the name is free. We should make sure that there isn't a race condition and the old pid terminates before we start monitoring it. When we get a message that the name is available, register it and return. Again, there might be a race condition if another pid register the name.

Another complication is when several pids are trying to use the same name at the same time. Maybe a timeout can solve this.

Do you think it'll be useful?

Thanks

uwiger commented 10 years ago

Let me ponder this. It might be possible to add something to gproc, but in the meantime, you could at least try out the gproc:monitor() functionality. It will give you notifications when a name is unregistered. You can, of course also monitor the pid...

One way to deal with the race condition is to use gproc:reg_or_locate(), which either grabs the name or returns the pid of the process that got there first.

benbro commented 10 years ago

Using gproc:reg_or_locate/1 with monitor/2 works great. monitor is smart enough to prevent a race condition.

Pid = self(),
case gproc:reg_or_locate({n, l, {?MODULE, UserID}}) of
    {Pid, undefined} ->
        ok;
    {OldPid, undefined} ->
        monitor(process, OldPid),
        stop_session(OldPid),
        receive
            {'DOWN',_, process, OldPid, _} ->
                gproc:reg({n, l, {?MODULE, UserID}}),
                ok
        after
            1000 ->
                error
        end
end.

Do you want me to close this issue or leave it open?

Thanks

uwiger commented 10 years ago

I'm running some experiments. Let's leave it open to remind me. I'll close it later.

I seems to me that your code above still has some lurking race conditions in it, but perhaps I don't understand the full use case?

benbro commented 10 years ago

You mean that I might try to call gproc:reg({n, l, {?MODULE, UserID}}) before gproc unregistered the old pid name because both of us are getting the down message at the same time? If I'll use gproc:monitor/1 it will prevent it?

uwiger commented 10 years ago

No, but if two processes run the same code concurrently, both may tell the OldPid to stop (or one process stops OldPid, gets the name, and is immediately stopped by another); if both tell OldPid to stop, and receive the unreg, one will crash when calling gproc:reg().

This may not be an issue in your case, but your use case triggered an itch, so I'm experimenting. ;-)

uwiger commented 10 years ago

You might want to take a look at the branch uw-standby-monitors https://github.com/uwiger/gproc/tree/uw-standby-monitors

See specifically https://github.com/uwiger/gproc/blob/uw-standby-monitors/doc/gproc.md#monitor-2

benbro commented 10 years ago

The API looks great and solves the general case of multiple concurrent calls to the same name. I'll test it and report back.

Thanks

benbro commented 10 years ago

I'm getting an exception:

gproc:monitor({n, l, testname}, standby).
** {{case_clause,[]},
    [{gproc,handle_call,3,[{file,"src/gproc.erl"},{line,2080}]},
     {gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,585}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,239}]}]}

This is how I'm trying to test the new monitor with standby:

-module(monitor_test).

-export([spawn1/0, spawn2/0, spawn3/0, stop/1]).

spawn1() ->
    spawn(?MODULE, loop, [1]).

spawn2() ->
    spawn(?MODULE, loop, [2]).

spawn3() ->
    spawn(?MODULE, loop, [3]).

stop(Pid) ->
    Pid ! stop.

loop(Index) ->
    Key = {n, l, testname},
    gproc:monitor(Key, standby),
    receive
        stop ->
            io:format("index ~p stopped~n", [Index]),
            ok;
        Msg ->
            io:format("index ~p got ~p~n", [Index, Msg]),
            loop(Index)
    end.

Test:

Pid1 = spawn1().
Pid2 = spawn2().
Pid3 = spawn3().
stop(Pid1).
stop(Pid2).

Thanks

uwiger commented 10 years ago

Thanks for the test code. I pushed some changes, but have yet to add test cases in the local case.

benbro commented 10 years ago

I've updated the test code above. When calling stop(Pid1), Pid2 registers testname and both Pid2 and Pid3 receive the failover message as expected.

When calling stop(Pid2), Pid3 doesn't get a failover or any other message. Does the monitor of Pid3 gets removed when it receives the failover message? Do I need to call gproc:monitor(Key, standby) again when receiving a failover message?

uwiger commented 10 years ago

You shouldn't have to call gproc:monitor(Key, standby) again, as the monitors, value and other properties are all transfered.

One thing to think about concerning your test code is that it's fully asynchronous. It's possible that Pid3 doesn't even have time to start until Pid2 has both inherited the key and died.

benbro commented 10 years ago

I'm calling the functions from the terminal one by one so if everything is transferred, it should work. Please see the attached terminal commands and output:

1>Pid1 = monitor_test:spawn1().
index 1 got {gproc,{failover,<0.464.0>},#Ref<0.0.0.1765>,{n,l,testname}}
<0.464.0>
2> Pid2 = monitor_test:spawn2().
<0.471.0>
3> Pid3 = monitor_test:spawn3().
<0.473.0>
4> monitor_test:stop(Pid1).
index 1 stopped
index 3 got {gproc,{failover,<0.471.0>},#Ref<0.0.0.2060>,{n,l,testname}}
index 2 got {gproc,{failover,<0.471.0>},#Ref<0.0.0.2053>,{n,l,testname}}
stop
5> monitor_test:stop(Pid2).
index 2 stopped
stop
uwiger commented 10 years ago

Ah, I see what's going on!

In your test code, you call monitor() each time you loop, and it's the monitor() implementation that does the wrong thing when the owner of the name monitors itself!

I will fix that.

uwiger commented 10 years ago

A new push, now also with gproc:monitor(Key, follow), which keeps monitoring the key even if it is unregistered. I've added test cases to the local and distributed test suites.

benbro commented 10 years ago

Am I supposed to be able to call gproc:monitor(Key, follow) with the first Pid and replace standby in the test code above with follow? It gives me:

1> Pid1 = monitor_test:spawn1().
index 1 got {gproc,unreg,#Ref<0.0.0.1765>,{n,l,testname}}
<0.464.0>

I keep getting the unreg message in the loop.

uwiger commented 10 years ago

If you call gproc:monitor(Key, follow) before registering the name, you will always get an 'unreg' message first.

You should probably use standby for your use case. I added 'follow' partly because it was a handy thing to do in the process running the test. :)

benbro commented 10 years ago

This is an awesome feature.

Thanks