elixir-lang / gen_stage

Producer and consumer actors with back-pressure for Elixir
http://hexdocs.pm/gen_stage

Proposal for DynamicSupervisor #10

Closed. josevalim closed this issue 8 years ago.

josevalim commented 8 years ago

A DynamicSupervisor is a supervisor designed to supervise and manage many children dynamically.

It is a spawn-off of the :simple_one_for_one strategy found in the regular Supervisor.

We have a couple of goals in introducing a dynamic supervisor.

The first goal is to implement a DynamicSupervisor module with the same API and functionality as a :simple_one_for_one Supervisor. That is relatively straightforward to do, so we will focus on the remaining functionality for the rest of this proposal.

Shards

The DynamicSupervisor is going to provide automatic sharding. Imagine the following start_link call:

DynamicSupervisor.start_link(MySupervisor, args, [])

It will start a single supervisor with the specification defined by MySupervisor. By passing the :shards option, the DynamicSupervisor will instead start N supervisors (let's call them shards) under the parent supervisor, each with the specification defined by MySupervisor:

DynamicSupervisor.start_link(MySupervisor, args, [shards: 3])

In other words, a regular dynamic supervisor will look like:

      /-- child1
     /--- child2
[sup] --- ...
     \--- childy
      \-- childz

With shards, we have:

                           /-- child1
                          /--- child2
          /--------[shard] --- ...
         /                \--- childy
        /                  \-- childz
       /
      /                    /-- child1
     /                    /--- child2
[sup]--------------[shard] --- ...
     \                    \--- childy
      \                    \-- childz
       \
        \                  /-- child1
         \                /--- child2
          \--------[shard] --- ...
                          \--- childy
                           \-- childz

Those N shards will write to the same ETS table. The supervisor will redirect commands like start_child to one of the shards (probably by using a consistent hashing algorithm) while commands like which_children/1 and count_children/1 will read from the ETS table and return correct results.

The :shards option requires a positive integer or :schedulers as its value. If :schedulers is given, the number of shards started will match the number of schedulers online.
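
For illustration, here is a minimal sketch of how the option could be resolved and a shard picked; the module and function names are hypothetical, not part of the proposed API:

defmodule ShardingSketch do
  # Resolve the :shards option into a shard count.
  def resolve_shards(:schedulers), do: System.schedulers_online()
  def resolve_shards(n) when is_integer(n) and n > 0, do: n

  # Pick a shard for a key (e.g. the caller's pid) by hashing it.
  def shard_for(key, num_shards), do: :erlang.phash2(key, num_shards)
end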

Registry

The supervisor will also work as a registry when started with the :registry option:

DynamicSupervisor.start_link(MySupervisor, args, [registry: MySupervisor, name: MySupervisor])

Note: although not strictly required, we recommend the registry name be the same as the supervisor name.

Besides the start_child/2 function, start_child/3 will also be added, which allows a process to be started with a given id:

DynamicSupervisor.start_child(MySupervisor, "hello", args)

That will start a new child with an id of "hello". Registry lookups are done with a {:via, ..., ...} tuple:

location = {:via, DynamicSupervisor, {:id, MySupervisor, "hello"}}
GenServer.call(location, :perform_action)

Sharded registry

The registry and shards features can be used together, which means all shards will write to the same registry. Furthermore, the registry itself can be used to look up a particular shard:

location = {:via, DynamicSupervisor, {:shard, MySupervisor, 0}}
DynamicSupervisor.start_child(location, "hello", args)

This will start a child with id "hello" in the supervisor at shard 0, completely bypassing the main supervisor.

DynamicSupervisor as consumer

Finally, the DynamicSupervisor can be used as a consumer in a GenStage pipeline. In such cases, the supervisor will be able to send demand upstream and receive events. Every time an event is received, a child will be started by that supervisor. In order to provide such a feature, the supervisor's init/1 may return the same options as a GenStage init/1 would:

def init(arg) do
  # Asynchronously subscribe this supervisor to the producer
  GenStage.async_subscription(self(), SomeProducer)

  # Template for the child started per received event
  children = [
    worker(MyWorker, [])
  ]

  # Child specs plus the same demand options a GenStage init/1 accepts
  {:ok, children, max_demand: 100, min_demand: 50}
end

In the case of a sharded supervisor, the supervisor will work as a proxy to all shards. Every time the supervisor is asked to subscribe to a given producer, it will redirect the subscription request to all shards (and it will persist the subscription, forcing shards to resubscribe even after they crash and restart).

MSch commented 8 years ago

Those N shards will write to the same ETS table.

AFAIK a major point of sidejob is that each shard has its own ETS table, since ETS tables are always single-writer. If each shard writes to the same ETS table then won't they all be blocked waiting for the lock on the ETS table?

josevalim commented 8 years ago

That's a great point. We can have the parent supervisor own the tables and hand them to the children, each shard with its own. The downside is that operations like which_children and count_children would need to go through all ETS tables, but that's better than write contention on every start_child. The registry, of course, would still be shared. We would need to benchmark the two approaches nonetheless.

josevalim commented 8 years ago

@MSch are you using the overload feature from side_jobs? Should we support a :max_children option?

fishcakez commented 8 years ago

ETS supports multiple writers with {:write_concurrency, true}. This splits the write locks into partitions in the table. Rather I think the main trick with sidejob is that the first key (and table) is chosen by scheduler id to reduce lock contention. That's not to say we shouldn't benchmark to see which performs better.
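
For reference, partitioned write locks are requested when the table is created, for example:

# A public set table whose write locks are split into partitions
:ets.new(:children_table, [:set, :public, {:write_concurrency, true}])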

MSch commented 8 years ago

This splits the write locks into partitions in the table.

Right, I totally forgot about that. :/

But sidejob still went with individual ETS tables because it's not possible to control how the table is partitioned, so even though there will be multiple write locks two shards might still block on the same partition, see the commit message here: https://github.com/basho/sidejob/commit/467a7c070b458c13924c023ec2bd4e9237f2acdd

are you using the overload feature from side_jobs? Should we support a :max_children option?

Yes please, that is the primary reason why we are using sidejob. Having better scalability is a nice addition.


Thinking a bit more, why use an ETS table per shard instead of just keeping a map in each shard's state, if there's a second (shared) ETS table for registration and the per-shard ETS table is just used to keep track of the children? Wouldn't this be something that should be benchmarked?

Here's what sidejob does: Each shard keeps track of the child processes using a per-process :sets set, and only periodically publishes the current usage and whether it can accept new children to its public ETS table.
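
A rough sketch of that bookkeeping, with hypothetical names (this is not sidejob's actual code):

defmodule ShardSketch do
  use GenServer

  def init({table, shard}) do
    Process.send_after(self(), :publish, 100)
    {:ok, %{table: table, shard: shard, children: :sets.new()}}
  end

  # Usage is written to the public ETS table only on a timer,
  # keeping ETS writes off the start_child hot path.
  def handle_info(:publish, %{table: table, shard: shard} = state) do
    :ets.insert(table, {shard, :sets.size(state.children)})
    Process.send_after(self(), :publish, 100)
    {:noreply, state}
  end
end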

One more thing:

The supervisor will redirect commands like start_child to one of the shards (probably by using a consistent hashing algorithm)

At least if there's one shard per scheduler, I think it would be preferable to have a mapping from scheduler id to shard (that's what sidejob does) than to consistently hash the name under which the process should be registered. Then it's always possible to bypass the main supervisor, and only if the DynamicSupervisor also acts as a registry is there a shared bottleneck (the shared ETS table).
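
For illustration, that mapping is available directly from the runtime:

# Id of the scheduler the calling process is currently running on
shard = :erlang.system_info(:scheduler_id)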

Furthermore, how would DynamicSupervisor.start_child(MySupervisor, "hello", args) and GenServer's :name option interact?

josevalim commented 8 years ago

Thanks @MSch. I am not sure if side_jobs works like this but the whole idea of sharding is that you still get all Supervisor functions. The shards are an implementation detail. That said...

Thinking a bit more, why use an ETS table per shard instead of just keeping a map in each shard's state, if there's a second (shared) ETS table for registration and the per-shard ETS table is just used to keep track of the children? Wouldn't this be something that should be benchmarked?

We could try this as well. We need the shared table though so we can implement which_children and count_children without traversing all child supervisors.

At least if there's one shard per scheduler, I think it would be preferable to have a mapping from scheduler id to shard (that's what sidejob does) than to consistently hash the name under which the process should be registered.

You will be able to achieve it by using a registry with via:

shard = :erlang.phash2(self(), :erlang.system_info(:schedulers))
location = {:via, DynamicSupervisor, {:shard, MySupervisor, shard}}
DynamicSupervisor.start_child(location, args)

But because we still want DynamicSupervisor.start_child(pid, args) to work, my previous sentence describes the behaviour when you don't go straight to the shard.

sasa1977 commented 8 years ago

There are some interesting ideas here. I'm particularly happy to see that the possibility of a process registry is being discussed. IMO, this is something that should be provided out of the box. With that in mind, here are some comments and alternative proposals.

Regarding one vs. multiple ETS tables, I can see this swinging in both directions, depending on measurements. Assuming the table type is not ordered_set, I'd expect {:write_concurrency, true} to work reasonably well, at least on a smaller number of cores, where only a few processes can in fact work on the table at the same time, and the operations are quite fast. If the system does something else besides creating processes, I wouldn't expect writers on the same table to block each other all that frequently. On the other hand, IIRC ETS had some scaling issues with many cores. Perhaps that has been fixed, but if not, I feel using multiple tables might make more sense. Ultimately it should be measured, both for performance and memory usage, since there's some cost associated with ETS tables (768 words), and the default limit is quite small (1400).

All that being said, I wonder what's the purpose of supporting which_children and count_children for dynamic supervisors. The cases where I'd likely want to use that feature are in non-dynamic supervisors where I'd want to fetch a particular child of a particular supervisor. Thus, I wonder if that could be pushed to the registry. In that case, the dynamic supervisor remains extremely focused on creating children only. This should yield better performance for those who need to create a lot of anonymous children frequently, while those who need to label or enumerate their children can go through the registry (more on that in a bit).

Before moving on to the registry, one final question: how would the max_restarts and max_seconds settings work with shards? Would they apply to each shard separately, or to the whole group?

When it comes to the registry, I'm a bit confused that the proposal is immediately coupled to the supervisor. I'd expect to see a proposal for a standalone registry which is independent of supervisors. I have mixed feelings about the fact that an alias is provided to the start_child function. I'd rather leave this to the particular start_link function of the child process. That means a bit more typing in some cases, but it's also quite flexible and combines well with the existing idioms.

Unlike gproc, which uses one global table, I'd propose supporting multiple instances of the registry. That seems like it would give us a lot of freedom to organize processes based on specific needs. I'd expect in most cases one registry per app would suffice. For some cases multiple registries might be useful, for example to support ad-hoc sharding of registrations. Moreover, having the ability to create some registries deeper in the supervision tree allows us to control their lifecycle.

When it comes to features, the following subset of gproc's features is what I'd like to see:

  1. rich unique process registration
  2. non-unique registrations
  3. multiple unique registrations of a single process

In particular, point 2 might replace the need for supporting which_children and friends for dynamic supervisors. If I want to do that for a particular supervisor, I could create a dedicated registry (or a couple of them if I want sharding), and register children under non-unique aliases. That should allow us to fetch the children and their counts when we need to.

I'd also expect that different ETS table types would work better for unique vs non-unique registrations. IIRC, gproc uses ordered_set for its single table, presumably to speed up various elaborate reads (which I personally rarely use and wonder if they are really needed). But if the Elixir registry doesn't support such features, perhaps it would make more sense to have two tables, or two types of registries (configurable via an option), or maybe even two different modules (e.g. UniqueRegistry and NonUniqueRegistry).

I'm a bit confused about the sharded registry paragraph. Why would I even want to look up a shard, or start a child in a particular shard?

josevalim commented 8 years ago

Great questions Sasa!

All that being said, I wonder what's the purpose of supporting which_children and count_children for dynamic supervisors. The cases where I'd likely want to use that feature are in non-dynamic supervisors where I'd want to fetch a particular child of a particular supervisor.

Well, even if we debate whether we need exactly those functions, if at any point we need to fetch all child pids in the dynamic supervisor, it would already impose all of the requirements we are talking about, and I think being able to fetch all children is a basic feature to have. Another way to put it: I would rather design with this requirement in mind, because it is very likely we will need at least part of it in some basic form, than assume we won't need it and have to redesign the whole thing later on. It is also worth remembering that features like which_children are necessary for hot-code upgrades.

Regarding the registry, I am a bit torn about providing a separate registry from the supervisor. That's because they share a lot of responsibilities. Both need to monitor the child process. Both would need to provide the sharding ability if we want them to scale. In terms of overhead, I would actually think having two separate processes will be more demanding both in terms of memory and CPU. If we remove the count_children, which_children and registry features from the supervisor, we get to remove a single field from supervisor state, which is a counter field of restarting processes. Everything else would stay the same. However, if we implement the registry inside the supervisor, the overhead is extra writes to and reads from the ETS table. Using separate processes, both would need to monitor or link, both would need to receive DOWN/EXIT messages, both would need to match those messages against their state, and so on.

Regarding the registry needs, I would call "non-unique registrations" to be part of a "process group" and not of a "process registry". Truth be told, if we want to build our own process registries and process groups, I would rather focus on their scalability and distribution characteristics rather than discuss their scope at the node level. Erlang ships with local, global and pg2 and this is more about providing an alternative to local. For replacing the others, we should discuss using Phoenix.Tracker (which is a process group) and distributed hash tables (to replace global). Maybe all three (local, global and pg2) could be efficiently tackled by a single entity but that's a question that will take quite some time to answer.

I'm a bit confused about the sharded registry paragraph. Why would I even want to look up a shard, or start a child in a particular shard?

The idea is to bypass the "main supervisor" which otherwise would still be a bottleneck if all start_child calls go through a single process. By storing the shard in the registry, you can use consistent hashing to dispatch to specific shards and bypass this bottleneck. It is quite similar to what side_jobs and Phoenix.PubSub do.

Let's keep this discussion going, I still expect a lot of things to change, so now is the time. :)

fishcakez commented 8 years ago

Every registry that uses :via to register names can suffer from a race condition where the supervisor tries to restart the process but the name is still in the registry because the :DOWN message has not yet been sent or has not been handled. We could try to fix this in OTP by removing the whereis_name lookup before spawning a process. The race condition occurs because link exit signals are sent before monitor down signals.

The same whereis_name lookup prevents non-unique registrations with :via.

Using a supervisor as the registry has the flaw that while a child blocks inside init/1 the pid of the child is not known. Therefore lookups will fail until init/1 returns. A similar feature to gproc's await could get around this potentially.

If the supervisor hosts the registry, it is not a good idea to make blocking calls to the registry: if a child traps exits and calls the supervisor, the supervisor cannot shut the child down for the duration of the shutdown timeout, since both deadlock waiting on each other to handle a message.

sasa1977 commented 8 years ago

Every registry that uses :via to register names can suffer from a race condition where the supervisor tries to restart the process but the name is still in the registry because the :DOWN message has not yet been sent or has not been handled. We could try to fix this in OTP by removing the whereis_name lookup before spawning a process. The race condition occurs because link exit signals are sent before monitor down signals.

gproc uses :erlang.is_process_alive to circumvent that (see https://github.com/uwiger/gproc/blob/master/src/gproc.erl#L1661). Do you see anything wrong with that approach?
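
A sketch of such a guarded lookup, assuming a hypothetical public table with {key, pid} entries:

defmodule AliveLookup do
  # Ignore stale entries whose process has already died.
  # Process.alive?/1 only works for local pids.
  def whereis(table, key) do
    case :ets.lookup(table, key) do
      [{^key, pid}] -> if Process.alive?(pid), do: pid, else: :undefined
      [] -> :undefined
    end
  end
end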

fishcakez commented 8 years ago

Ah yeah. That avoids the race condition locally.

sasa1977 commented 8 years ago

Well, even if we debate whether we need exactly those functions, if at any point we need to fetch all child pids in the dynamic supervisor, it would already impose all of the requirements we are talking about, and I think being able to fetch all children is a basic feature to have. Another way to put it: I would rather design with this requirement in mind, because it is very likely we will need at least part of it in some basic form, than assume we won't need it and have to redesign the whole thing later on. It is also worth remembering that features like which_children are necessary for hot-code upgrades.

I understand some of these features are needed for internal operations, such as hot code upgrade, or killing all children for example. But I regard these as operations which are not likely to run frequently. The design based on ETS bookkeeping hints that it's envisioned many different clients will call these operations, possibly frequently. I wonder if this will really be the case for dynamic supervisors. By choosing to store something to ETS from multiple shards, you're potentially reducing the throughput of the frequently used operation (process creation) for something which is needed rarely (getting the list of all children).

Regarding the registry, I am a bit torn about providing a separate registry from the supervisor. That's because they share a lot of responsibilities.

They do, but I have doubts whether registry is needed only by the supervisor. A registry is a thing which registers processes, regardless of where they are created. Would a supervisor-registry even work with e.g. Phoenix Channels or Poolboy workers which are created and restarted directly by a custom GenServer? Also, would it work with non-dynamic supervisors?

In terms of overhead, I would actually think having two separate processes will be more demanding both in terms of memory and CPU.

If you're talking about the registry process not running in the supervisor process, I don't think that should be a problem. As I said, I expect one registry will suffice in most apps, with maybe a handful of them in highly loaded apps for sharding. I'd be quite surprised if, say, 100 or 1000 registry processes were needed.

Using separate processes, both would need to monitor or link, both would need to receive DOWN/EXIT messages, both would need to match those messages against their state, and so on.

They would, but they'd be doing two different things. One deals with fault-tolerance, the other with keeping the collection of registered processes. The registry itself basically needs to issue :ets.delete_object, presumably on a set table, so that should be quite fast. Also, I don't think the registry should link. I believe monitor should be used.
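
A sketch of that cleanup path, assuming hypothetical {key, pid} entries in the registry table:

defmodule RegistryCleanupSketch do
  use GenServer

  def init(table), do: {:ok, %{table: table}}

  # Monitor-based deregistration: drop every entry owned by the dead pid.
  def handle_info({:DOWN, _ref, :process, pid, _reason}, state) do
    :ets.match_delete(state.table, {:_, pid})
    {:noreply, state}
  end
end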

Regarding the registry needs, I would call "non-unique registrations" to be part of a "process group" and not of a "process registry".

Excellent point! I would still like to see it though, because it's a frequent use-case for me. I use it with gproc instead of GenEvent.

Truth be told, if we want to build our own process registries and process groups, I would rather focus on their scalability and distribution characteristics rather than discuss their scope at the node level.

IMO, a local registry is frequently needed, and it's quite a different thing from a distributed registry. I would certainly like to see various distributed registries with different guarantees on C vs A, but I don't think a distributed registry can substitute for the local one (or the other way around). I never used gproc for a distributed registry, just like I wouldn't use global for the local one.

The idea is to bypass the "main supervisor" which otherwise would still be a bottleneck if all start_child calls go through a single process. By storing the shard in the registry, you can use consistent hashing to dispatch to specific shards and bypass this bottleneck.

I was expecting this would be done in the function. Couldn't start_child bypass the main supervisor process and go directly to some shard?

josevalim commented 8 years ago

By choosing to store something to ETS from multiple shards, you're potentially reducing the throughput of the frequently used operation (process creation) for something which is needed rarely (getting the list of all children).

That's just one of the possible implementations. We need to consider all of the features we want with all possible implementations; it is too early to rule features out based on one of them.

They do, but I have doubts whether registry is needed only by the supervisor. A registry is a thing which registers processes, regardless of where they are created. Would a supervisor-registry even work with e.g. Phoenix Channels or Poolboy workers which are created and restarted directly by a custom GenServer? Also, would it work with non-dynamic supervisors?

Oh, it is definitely needed by more than a supervisor but I am not interested in them. :) For example, Phoenix Channels need a different kind of registry than a local one (hence Phoenix.Presence).

I think there is a mismatch on what the two of us are expecting from the registry. They both seem to be arbitrary, so I am not saying one of us is particularly right. I am currently focused on supporting only a small (but common) set of use cases. Partly because I don't think Elixir should solve all of them and partly because solving all of them is a hard task.

In terms of overhead, I would actually think having two separate processes will be more demanding both in terms of memory and CPU.

What I meant is that having a supervisor+registry in a single process is going to be more performant than having two separate processes. It is also one less abstraction to set up. Again, not a deciding factor, but just one of the many things to consider.

IMO, a local registry is frequently needed, and it's quite a different thing from a distributed registry.

Not necessarily. For example, I could have a Registry module that works as a regular, local registry. However, once a flag is given, we specify it should synchronize with similarly named registries on other nodes. I could easily see Phoenix.Tracker providing those two modes of operation. So depending on the implementation, local or distributed could be a configurable aspect on top of the same API (especially if it is AP). In the long term, I am more interested in such options, because it means we can go from local to distributed in a straightforward and well-defined way.

Maybe it means we should start with Registry and just make it local for now. But with the knowledge we have right now, I don't think we can design today a local registry that would scale to distributed later on.

I was expecting this would be done in the function. Couldn't start_child bypass the main supervisor process and go directly to some shard?

To do that, we would need to either name each of the shards (i.e. a registry) or store the shard names in an ETS table (still another registry). So I would rather do it explicitly via the registry, especially because you don't need a name or a registry to use the dynamic supervisor.

sasa1977 commented 8 years ago

I think there is a mismatch on what the two of us are expecting from the registry. They both seem to be arbitrary, so I am not saying one of us is particularly right. I am currently focused on supporting only a small (but common) set of use cases. Partly because I don't think Elixir should solve all of them and partly because solving all of them is a hard task.

Let's focus on this and keep the rest aside for the moment. My understanding of a local registry core boils down to:

What do you have in mind?

josevalim commented 8 years ago

@sasa1977 did I answer those questions after our IRC convo yesterday? :)

sasa1977 commented 8 years ago

I think so, yes, but I wanted to see it here to make sure we don't talk past each other, and also to keep this thread understandable without needing to cross-reference the IRC history.

josevalim commented 8 years ago

A summary is that this feature plans to expose the IDs we already keep in the supervisor, making it straightforward to solve use cases where a supervisor needs to access a sibling, without requiring the local registry. However, for the dynamic supervisor, those IDs end up being optional, which makes it resemble a registry even more, while at the same time unifying both "static" and "dynamic" supervisor APIs (so both support IDs).

sasa1977 commented 8 years ago

Right, so you want to access children of a dynamic supervisor via ID, so you can for example terminate a particular child, or send it a message, right?

First, to gain consistent unification between supervisors, I assume all children of all supervisors should in fact have IDs. Otherwise, you still have a special case where children of a dynamic supervisor are not identifiable unless an explicit opt-in is provided. Perhaps that's fine though.

More importantly, a dynamic supervisor is likely to be busier than a static one, which usually starts a couple of predefined children and rarely does something else. So I think querying a dynamic supervisor process about its children is not a good default choice. Putting aside possible deadlock issues, that approach would not scale well. A busy supervisor will have a hard time keeping up with requests for creating children, queries about particular children, and dealing with terminated children. Separating these concerns should make the system more efficient.

There's little doubt in my mind that an ETS table is the best approach to store and query these children. It would work much faster and keep the related tasks out of the supervisor process. Assuming an ETS table is used, the distinction between querying a supervisor about a child and querying a general-purpose registry about a registered process becomes quite small.

In your proposal, we're querying with:

location = {:via, DynamicSupervisor, {:id, supervisor_registry_name, child_id}}
GenServer.call(location, :perform_action)

Using a generic registry it would be:

location = {:via, LocalRegistry, {registry_name, child_alias}}
GenServer.call(location, :perform_action)

Where child_alias is equivalent to {supervisor_registry_name, child_id} - a combination of the "scope id" and the child id (which is unique in that scope).

That seems quite similar, and technically boils down to pretty much the same thing. Registration is stored in some public ETS table which can later be queried by multiple clients.

The benefit of the generic registry is that we can register any process. In contrast, if I understand your proposal, I can only register children of the dynamic supervisor.

Another benefit of the generic registry is that as its client I have more freedom. Just because a child is started somewhere deep in the hierarchy doesn't mean I need a separate ETS table. I can just have one top-level registry if that performs well (which IME it usually does). I can also create multiple registries if I want to divide the work in some special cases.

So to summarize, I see the following benefits with a general-purpose registry:

  1. It allows us to register any process locally.
  2. It gives us complete freedom to divide the registration work.
  3. It doesn't introduce additional work for a possibly busy dynamic supervisor.
  4. It doesn't change the existing conventions, where the child module decides about the process alias.

Just my 2 cents :-)

josevalim commented 8 years ago

That's a perfect summary. I pretty much agree with everything on it. One question: what are the cons of a general-purpose registry compared to the supervisor in this case? :)

sasa1977 commented 8 years ago

The most important one I can think of is what's been brought up by @fishcakez. Basically, a separate registry works separately from a supervisor, so inconsistent views are possible (e.g. a process is dead, but not yet deregistered). As I've mentioned, gproc seems to handle it properly, but it needs to check that the child is alive on every lookup.

I'm not at all concerned about resource usage though. As I said, I don't expect more than a few registries per app in the most extreme cases, so the overhead shouldn't be big, and throughput might in fact be better since the registry and supervisor run concurrently.

Nothing else comes to mind atm. Perhaps @fishcakez has some thoughts?

josevalim commented 8 years ago

@sasa1977 One of the pros for the supervisor-registry I have in mind is that baking it into the supervisor may also be conceptually simpler than setting up another process, especially for "static" supervisors. This is something you mentioned in your comment; I just wanted to put it here explicitly. :)

fishcakez commented 8 years ago

When using a Supervisor-based registry, error messages sent to Logger (via :error_logger) will not contain the name of the process, whereas using :via will mean the name is logged with the error instead of the pid.

Conceptually simpler has some merit because the registry will need to be placed in the appropriate place in the supervision tree. Otherwise, if the registry process exits and the registry is lost, registered processes won't be registered anymore, but will think they are, allowing duplication and other bugs.

Not only is it conceptually simpler, but the message passing is much simpler. A Supervisor blocks during a start_link call, and registration occurs during the start_link call. Therefore a busy supervisor is blocked by a register call to a :via registry, whereas the :EXIT/:DOWN messages can be handled concurrently. I am unsure what the effect will be, but splitting the process might not help performance.
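
To make the blocking concrete: with a :via name, the register_name call happens inside GenServer.start_link/3, while the supervisor starting the child waits for that call to return (names reused from the hypothetical LocalRegistry example above):

# The starting supervisor is blocked until LocalRegistry finishes
# the register_name call performed inside start_link.
name = {:via, LocalRegistry, {registry_name, child_alias}}
GenServer.start_link(MyWorker, [], name: name)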

sasa1977 commented 8 years ago

Conceptually simpler has some merit because the registry will need to be placed in the appropriate place in the supervision tree. Otherwise, if the registry process exits and the registry is lost, registered processes won't be registered anymore, but will think they are, allowing duplication and other bugs.

That is a valid point. A registry is not likely to crash due to a bug, since I expect it will do little besides removing entries on :DOWN messages. That said, it needs to be placed in a proper place in the supervision tree.

Again, most likely I'd use a single registry per app and start it first thing in the top-level supervisor. I think this issue could be resolved with documentation.

Therefore a busy supervisor is blocked by a register call to a :via registry.

This is indeed true, and registration with a via tuple or manual registration from init/1 will block the supervisor. However, as you say, :DOWN messages are handled concurrently, so deregistration happens separately from the supervisor handling :EXIT messages. In times of increased terminations, this relieves some pressure from the supervisor process.

josevalim commented 8 years ago

Thanks for the discussion folks. We won't bundle a registry inside a supervisor but it is still a pity we can't rely on the supervisor IDs in order to find siblings.

josevalim commented 8 years ago

A new proposal without sharding and without a registry was added to #12.

ghost commented 6 years ago

This looks similar to https://github.com/jquadrin/ahab

josevalim commented 6 years ago

@jquadrin glad to know we have reached similar designs even without collaborating; this means the chances we are both right are higher. :)

josevalim commented 6 years ago

Seeing this pull request actually sent me down memory lane. :)

Here are some nice tidbits for those who are not familiar with the story behind GenStage:

  1. Before we proposed GenStage, we were working on a solution called GenRouter. The first time I talked publicly about it was in April 2015, at ElixirConf EU.

  2. Throughout the year, we developed the idea and built some prototypes. We outlined all design criteria in a document. The final version of the document dates back to January 2016 and it already referenced an implementation of a TCPAcceptor and a scalable Supervisor.

  3. A smaller document, which I think was made available to everyone at the time, also discussed GenRouter and GenRouter.Supervisor back in September 2015.

  4. Meanwhile, we figured out GenRouter was not a good design because source (producer), sink (consumer) and dispatcher were all separate processes. GenStage merged all of them into a single abstraction and it was born as a prototype still in 2015.

  5. This issue in particular is an adaptation of the earlier GenRouter.Supervisor ideas, now modified to fit GenStage. IIRC the main inspiration was Basho's side_jobs.

  6. In January 2017 we decided that GenStage would not be part of core. The DynamicSupervisor in GenStage was renamed to ConsumerSupervisor, which is its current name, and a DynamicSupervisor was added to Elixir v1.6, released January 2018.

It is also nice to see that back here we were already looking for solutions for dynamic lookup of processes, and that became the Registry added to Elixir v1.4, back in January 2017.

Note the earlier proposals also mentioned a TCPAcceptor built on top of GenRouter/GenStage but one was never implemented on top of those abstractions. It would be interesting to see how it works out in case someone decides to try it out. :)

EDIT: I was reviewing the slides for ElixirConf 2016 and we even had GenBroker (!!!) at some point. It later became the dispatcher that is embedded directly inside GenStage.