Erlang/OTP
http://erlang.org
Apache License 2.0

ERL-860: erlang:register/2 only accepts atoms as keys #3832

Open OTP-Maintainer opened 5 years ago

OTP-Maintainer commented 5 years ago

Original reporter: essen
Affected version: Not Specified
Component: kernel
Migrated from: https://bugs.erlang.org/browse/ERL-860


Old issue but still biting us to this day. It is currently more difficult than it should be to register names for processes started dynamically.

* The obvious solution to this problem is using list_to_atom, which is dangerous because the atom table has a fixed size limit and atoms are never garbage collected.
* The typical solution is to use a third-party library or write something yourself. Not great for what should be a core feature.
* The atom table limit could be improved by implementing EEP 20 but I am unclear as to whether GC atoms can be used as keys in the process registry (can we easily put GC-able terms in there?).
* Alternatively extending erlang:register/2 to also accept integers would allow us to generate names dynamically (for example using the result of erlang:phash2 as the key, or via counters). This doesn't sound too difficult to implement but would probably look a little weird in tools.
* Otherwise, there is the old structured terms solution, which involves copying the terms and will necessarily impact performance.
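
To make the first and fourth points concrete, here is a minimal sketch of how things behave today. The module and function names (`register_demo`, `try_register/2`, `atom_workaround/2`, `atom_usage/0`) are made up for illustration; only `register/2`, `list_to_atom/1`, `integer_to_list/1` and `erlang:system_info/1` are existing BIFs.

```erlang
-module(register_demo).
-export([try_register/2, atom_workaround/2, atom_usage/0]).

%% Attempt to register Pid under Key. register/2 raises badarg for any
%% key that is not an atom (and also for 'undefined', for a name that is
%% already taken, and for a process that is dead or already registered).
try_register(Key, Pid) ->
    try register(Key, Pid) of
        true -> ok
    catch
        error:badarg -> {error, badarg}
    end.

%% The list_to_atom workaround: derive an atom from a prefix and an
%% integer id. Every distinct id creates an atom that is never reclaimed.
atom_workaround(Prefix, Id) when is_list(Prefix), is_integer(Id) ->
    list_to_atom(Prefix ++ "_" ++ integer_to_list(Id)).

%% Current number of atoms and the configured limit (OTP 20 or later).
atom_usage() ->
    {erlang:system_info(atom_count), erlang:system_info(atom_limit)}.
```

With a live `Pid`, `register_demo:try_register({worker, 42}, Pid)` returns `{error, badarg}`, and so does `register_demo:try_register(erlang:phash2({worker, 42}), Pid)`, since integers are rejected as well. Passing `register_demo:atom_workaround("worker", 42)` as the key succeeds, at the cost of one more permanent atom.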

That said, while I can read in section 4.1 of the gproc paper that performance is a concern, I wonder in what scenario it is. Since we can't really create names dynamically, we end up with many systems that have a very small number of registered processes. The performance concerns must be for a few specific use cases. If true, perhaps the best solution would be to have two registry storages, one for atoms and one for structured terms. We already test that the term is an atom[1], so this would add zero overhead in that case; only the structured key case would pay any extra cost.

I am also wondering whether there is a benefit to using the current registry data structure instead of reusing the ets code (at least in the case where structured keys would be allowed). The ets code has seen a lot more performance improvements than erlang:register/2, including read/write concurrency, so maybe it would be an improvement to switch to an internal, always-existing ets table.
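
A user-space approximation of the two ideas above (atoms keep the existing path, structured keys go to an ets table) might look like the sketch below. It is not how the emulator would implement it: the module name `term_registry` and the table name are assumptions, the `yes`/`no` return values simply mirror `global:register_name/2`, and there is no cleanup when a registered process dies, which a real registry would need.

```erlang
-module(term_registry).
-export([init/0, register_name/2, whereis_name/1]).

-define(TAB, term_registry_tab).

%% Create the table once, from a long-lived process that will own it.
init() ->
    ?TAB = ets:new(?TAB, [named_table, public, set,
                          {read_concurrency, true},
                          {write_concurrency, true}]),
    ok.

%% Atom names take the existing register/2 path, so they pay nothing extra.
register_name(Name, Pid) when is_atom(Name), is_pid(Pid) ->
    try register(Name, Pid) of
        true -> yes
    catch
        error:badarg -> no
    end;
%% Any other term except a pid goes to the ets table.
register_name(Name, Pid) when is_pid(Pid), not is_pid(Name) ->
    %% insert_new/2 is atomic, so only one caller can claim a given key.
    case ets:insert_new(?TAB, {Name, Pid}) of
        true -> yes;
        false -> no
    end.

whereis_name(Name) when is_atom(Name) ->
    whereis(Name);
whereis_name(Name) ->
    case ets:lookup(?TAB, Name) of
        [{Name, Pid}] -> Pid;   % may be a dead pid, see the note above
        [] -> undefined
    end.
```

After `term_registry:init()`, a call such as `term_registry:register_name({worker, 1}, self())` returns `yes` the first time and `no` on a second attempt with the same key.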

If erlang:register/2 starts allowing any term (because it's easier to not have restrictions, for example), it will at the very least have to reject pid() because that would make the ! syntax non-obvious (though ! could also just be restricted to atoms as well).

Opening first and foremost because I did not see it in the tickets, though I could be interested in implementing something depending on the complexity of the preferred solution or guidance opportunities.

[1] https://github.com/erlang/otp/blob/master/erts/emulator/beam/register.c#L178

OTP-Maintainer commented 5 years ago

kjnilsson said:

This would be very useful for systems such as Ra (https://github.com/rabbitmq/ra), where Ra servers running on different nodes need a persistent addressable identity to be reachable without further synchronisation (such as replicating a name-to-pid lookup table). Ra allows you to dynamically create clusters of processes and currently needs to create atoms dynamically to do so.
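
For readers unfamiliar with this kind of addressing, the sketch below shows the general mechanism: a process registered under an atom can be reached by name, or by a `{RegName, Node}` tuple, without the sender ever learning its pid. This is not Ra's code; the module `addr_demo` and the name `addr_demo_server` are invented for the example, and it uses the local node so that it runs without distribution.

```erlang
-module(addr_demo).
-export([run/0]).

%% Register a process under an atom, then reach it by name and by
%% {Name, Node} without passing its pid around.
run() ->
    Parent = self(),
    Pid = spawn(fun() -> loop(Parent) end),
    true = register(addr_demo_server, Pid),
    %% Local registered name.
    addr_demo_server ! ping,
    ok = wait_pong(),
    %% {RegName, Node}: the same form works for a remote node.
    {addr_demo_server, node()} ! ping,
    ok = wait_pong(),
    addr_demo_server ! stop,
    ok.

loop(Parent) ->
    receive
        ping -> Parent ! pong, loop(Parent);
        stop -> ok
    end.

wait_pong() ->
    receive pong -> ok after 1000 -> timeout end.
```

Because the registered name has to be an atom, a system that creates such identities at run time has to create atoms at run time too, which is the limitation described above.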