Closed mzealey closed 1 month ago
As described in https://www.erlang.org/doc/system/distributed.html#nodes
A node is an executing Erlang runtime system that has been given a name, using the command-line flag -name (long names) or -sname (short names). The format of the node name is an atom name@host. name is the name given by the user. host is the full host name if long names are used, or the first part of the host name if short names are used.
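The name@host format quoted above can be illustrated with a small shell sketch. The helper `describe_node` is hypothetical and not part of OTP or ejabberd; it treats a name without a dot as a short name, which is the same dot check ejabberdctl.template applies to ERLANG_NODE.

```shell
#!/bin/sh
# Hypothetical helper: split a node name into its user and host parts and
# report which erl flag would fit. A name without a dot is treated as a
# short name (assumption mirroring the dot check in ejabberdctl.template).
describe_node() {
    node=$1
    name=${node%%@*}              # text before the first "@"
    case $node in
        *@*) host=${node#*@} ;;   # text after the first "@"
        *)   host= ;;             # no host part was given
    esac
    case $node in
        *.*) flag=-name ;;        # dotted: long names
        *)   flag=-sname ;;       # no dot: short names
    esac
    printf '%s %s %s\n' "$flag" "$name" "${host:-<none>}"
}

describe_node ejabberd@localhost         # prints: -sname ejabberd localhost
describe_node ejabberd@xmpp.example.org  # prints: -name ejabberd xmpp.example.org
describe_node ejabberd                   # prints: -sname ejabberd <none>
```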
A smaller reproduction example:
In summary, some OTP functions allow providing only the user part of the node name (for example the -name and -sname command-line arguments), but other functions require the full node name (for example the shell r command and the net_kernel:connect_node function).
no hostname possibility is allowed, and actually it's very useful to us:
As the ejabberdctl.cfg documentation, the Erlang documentation, and this experiment show, the Erlang node name always contains the host part. Some tools allow providing only the user part and then add the host part themselves.
In other words, even if you configure only ERLANG_NODE=ejabberd, the actual node name is ejabberd@machinename, and the actual node name must be provided when calling net_kernel:connect_node.
The obvious solution would be to check in ejabberdctl whether ERLANG_NODE has only the user part and, in that case, add the host part, so that all use cases work correctly.
no hostname possibility is allowed, and actually it's very useful to us
Is it useful because it lets you use the same configuration file on several machines that have different machine names? In that case, the obvious solution should work for you too, right?
Example patch:
diff --git a/ejabberdctl.template b/ejabberdctl.template
index 83ec7e1bd..21be6430f 100755
--- a/ejabberdctl.template
+++ b/ejabberdctl.template
@@ -66,6 +66,7 @@ done
# shellcheck source=ejabberdctl.cfg.example
[ -f "$EJABBERDCTL_CONFIG_PATH" ] && . "$EJABBERDCTL_CONFIG_PATH"
[ -n "$ERLANG_NODE_ARG" ] && ERLANG_NODE="$ERLANG_NODE_ARG"
+[ "$ERLANG_NODE" = "${ERLANG_NODE%@*}" ] && ERLANG_NODE="$ERLANG_NODE@$(hostname -s)"
[ "$ERLANG_NODE" = "${ERLANG_NODE%.*}" ] && S="-s"
: "${SPOOL_DIR:="{{spool_dir}}"}"
: "${EJABBERD_LOG_PATH:="$LOGS_DIR/ejabberd.log"}"
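The parameter expansion used in the added line can be tried in isolation. This is a minimal sketch, assuming a hypothetical helper `qualify_node` that is not part of ejabberdctl; it accepts the host as an optional second argument so the result does not depend on the local machine.

```shell
#!/bin/sh
# qualify_node NODE [HOST]: hypothetical helper, not part of ejabberdctl.
# Appends @HOST when NODE has no host part; otherwise prints NODE unchanged.
qualify_node() {
    node=$1
    host=${2:-$(hostname -s)}   # fall back to the short host name
    # ${node%@*} strips the shortest suffix matching "@*".
    # If nothing was stripped, the name had no "@host" part.
    if [ "$node" = "${node%@*}" ]; then
        node="$node@$host"
    fi
    printf '%s\n' "$node"
}

qualify_node ejabberd myhost                   # prints ejabberd@myhost
qualify_node ejabberd@localhost myhost         # prints ejabberd@localhost
qualify_node ejabberd@node.example.org myhost  # prints ejabberd@node.example.org
```

A name that already contains a host part passes through untouched, so the later long-name versus short-name check still sees the configured host.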
Yes exactly, this sounds like it would work. However, because I don't know the ejabberdctl script in much depth, I'm not sure whether modifying $ERLANG_NODE in this way would produce other issues later on (i.e. whether the change should be scoped to a new variable used specifically for net_kernel:connect_node) or whether it is OK to do globally.
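The scoped alternative could look roughly like the sketch below: ERLANG_NODE stays as configured, and only the value passed to the connect call gets a host part. The function `connect_expr` and the variable `connect_node` are hypothetical names for illustration, not taken from ejabberdctl.

```shell
#!/bin/sh
# Hypothetical sketch: ERLANG_NODE is left as configured; only the value
# handed to net_kernel:connect_node/1 is qualified with a host part.
connect_expr() {
    erlang_node=$1
    host=$2
    case $erlang_node in
        *@*) connect_node=$erlang_node ;;          # already name@host
        *)   connect_node="$erlang_node@$host" ;;  # add the host part
    esac
    # Single quotes make the node name a quoted Erlang atom.
    printf "net_kernel:connect_node('%s')\n" "$connect_node"
}

ERLANG_NODE=ejabberd
connect_expr "$ERLANG_NODE" myhost
# prints net_kernel:connect_node('ejabberd@myhost')
echo "$ERLANG_NODE"
# prints ejabberd (unchanged)
```

The trade-off is that every place needing the full node name has to use the derived value, whereas the global change fixes them all at once.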
Commit fa12301e085562962fc865e72ad9361ba41fcb7d added the new -sname undefined way of starting up the ejabberdctl module in OTP23+, however it appears to be causing us issues.
Running latest master on OTP26, the ejabberdctl script cannot connect to the node if ERLANG_NODE=ejabberd, however it works if ERLANG_NODE=ejabberd@ejabberd. This is being run in a local container where the hostname is set to ejabberd.
It appears that in the failing case, -eval "net_kernel:connect_node('ejabberd')" is being run and failing silently.
A minimal reproducible test case has the following working:
But the following (which is roughly what happens when ERLANG_NODE=ejabberd) fails:
The docs for the file clearly say that this no hostname possibility is allowed, and actually it's very useful to us: