processone / ejabberd

Robust, Ubiquitous and Massively Scalable Messaging Platform (XMPP, MQTT, SIP Server)
https://www.process-one.net/en/ejabberd/

Setting INET_DIST_INTERFACE parameter in ejabberdctl.cfg makes ejabberd crash #4066

Closed adriaardila closed 10 months ago

adriaardila commented 12 months ago

Environment

Configuration: ejabberdctl.cfg

...
#.
#' INET_DIST_INTERFACE: IP address where this Erlang node listens other nodes
#
# This communication is used by ejabberdctl command line tool,
# and in a cluster of several ejabberd nodes.
#
# Default: 0.0.0.0
#
INET_DIST_INTERFACE=10.110.0.2
...

Errors from error.log/crash.log

[root@localhost ~]# ejabberdctl status
init terminating in do_boot ({cannot get bootfile,/opt/ejabberd-23.04/bin/start.boot})

Crash dump is being written to: erl_crash.dump...done
{"could not start kernel pid",application_controller,"{bad_environment_value,\"{\\"init\"}"}
=ERROR REPORT==== 19-Jul-2023::08:52:20.851165 ===
application_controller: unterminated string starting with "init": {"init

could not start kernel pid (application_controller) ({bad_environment_value,"{\"init"})

Crash dump is being written to: /opt/ejabberd/logs/erl_crash_20230719-085220.dump...done

Bug description

I was trying to set up clustering the same way I have in previous versions. When I set the INET_DIST_INTERFACE parameter in my ejabberdctl.cfg, ejabberd crashes: I can't start it or run any command through ejabberdctl (systemctl start ejabberd doesn't work either). The value of INET_DIST_INTERFACE doesn't matter; even if I uncomment the parameter and leave the default value (0.0.0.0), it still crashes. If I comment it out again, everything works. The error is always the same (bad_environment_value "init"). I'm not sure whether this is a version-specific bug, something I'm doing wrong, or a problem with my distro.

Thanks for your help.

badlop commented 12 months ago

init terminating in do_boot ({cannot get bootfile,/opt/ejabberd-23.04/bin/start.boot})

Nice find! Problem confirmed.

It happens with all the ejabberd installers since 22.05, which are built with a new method. This method compiles ejabberd, generates a release directory with "make rel", removes some unnecessary files, applies some customizations to ejabberdctl, and packages the result into .run, .deb and .rpm installers.

To reproduce the problem:

  1. Install ejabberd with any recent installer (run, deb or rpm)
  2. In ejabberdctl.cfg enable the INET_DIST_INTERFACE option and set any value
  3. Run ejabberd-23.04/bin/ejabberdctl, no need to provide any command
  4. Instead of showing the help, it crashes
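
Step 2 above can be scripted. The following is only a rough sketch: `set_dist_interface` is a hypothetical helper name, and the location of ejabberdctl.cfg depends on your installation, so it is passed in as an argument. It assumes the file already contains an `INET_DIST_INTERFACE=` line, commented or not; if it does not, nothing is changed.

```shell
# Hypothetical helper: enable INET_DIST_INTERFACE in an ejabberdctl.cfg file.
#   $1 = path to ejabberdctl.cfg (depends on your installation)
#   $2 = IP address to set
set_dist_interface() {
    cfg=$1
    addr=$2
    # Keep a backup so the change is easy to revert.
    cp "$cfg" "$cfg.bak" || return 1
    # Rewrite the option line, whether or not it is commented out.
    # The descriptive comment line ("#' INET_DIST_INTERFACE: ...") has no "="
    # directly after the name, so it is left untouched.
    sed "s/^#*[[:space:]]*INET_DIST_INTERFACE=.*/INET_DIST_INTERFACE=$addr/" \
        "$cfg.bak" > "$cfg"
}
```

After calling, for example, `set_dist_interface /path/to/ejabberdctl.cfg 10.110.0.2`, running `ejabberd-23.04/bin/ejabberdctl` without arguments (step 3) should show the crash instead of the help.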

To narrow down where the problem is, I compiled ejabberd from source, generated the release with "make rel", unpacked it, configured INET_DIST_INTERFACE, and ejabberdctl works perfectly. So some file or customization applied in the installers seems to cause this problem.

Quick workaround so you can start using ejabberd right now: edit the file ejabberd-23.04/bin/ejabberdctl. Line 80 says:

    INET_DIST_INTERFACE2=$("$ERL" -noshell ...

Add $ERLANG_OPTS right after "$ERL". That variable contains the arguments that tell Erlang where to find the bootfile. For example:

    INET_DIST_INTERFACE2=$("$ERL" $ERLANG_OPTS -noshell ...
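
If you would rather not edit the script by hand, the same change can be sketched with sed. This is an untested-in-production illustration, not an official fix: `patch_ejabberdctl` is a hypothetical helper name, and it assumes the line starts with `INET_DIST_INTERFACE2=$("$ERL" ` as shown above. It matches on that prefix instead of relying on the line number, and leaves the rest of the line untouched.

```shell
# Hypothetical helper: insert $ERLANG_OPTS after "$ERL" on the
# INET_DIST_INTERFACE2 line of an ejabberdctl script.
#   $1 = path to the ejabberdctl script
patch_ejabberdctl() {
    script=$1
    # Do nothing if the line already contains $ERLANG_OPTS after "$ERL".
    if grep -q '^[[:space:]]*INET_DIST_INTERFACE2=\$("\$ERL" \$ERLANG_OPTS ' "$script"; then
        return 0
    fi
    # Keep a backup, then write the patched content back into the original
    # file so its permissions (executable bit) are preserved.
    cp "$script" "$script.bak" || return 1
    sed 's/^\([[:space:]]*INET_DIST_INTERFACE2=\$("\$ERL"\) /\1 $ERLANG_OPTS /' \
        "$script.bak" > "$script"
}
```

Usage would be `patch_ejabberdctl ejabberd-23.04/bin/ejabberdctl`; the original script is kept as `ejabberdctl.bak` in the same directory.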

For a definitive solution, I wonder: should that line be changed in all the ejabberdctl scripts, or should the installers be modified to include whatever file is missing?

skrleo commented 10 months ago

INET_DIST_INTERFACE2=$("$ERL" $ERLANG_OPTS -noshell ...

I ran into this problem too. After applying your change, a different error appeared:

~ $ bin/ejabberdctl reload_config
init terminating in do_boot (cannot expand $RELEASE_LIB in bootfile)

Crash dump is being written to: erl_crash.dump...done
{"could not start kernel pid",application_controller,"{bad_environment_value,\"{\\"init\"}"}
=ERROR REPORT==== 15-Sep-2023::01:03:31.349530 ===
application_controller: unterminated string starting with "init": {"init

could not start kernel pid (application_controller) ({bad_environment_value,"{\"init"})

Crash dump is being written to: /home/ejabberd/logs/erl_crash_20230915-010329.dump...done