It looks as if there is still a running Erlang node called 'boot', which is why
the command to get a node's name from Erlang fails. Try another node name, e.g.
erl -noinput -name bootxy -eval "io:format(\"~p~n\", [node()]), halt()."
(You should probably also stop the boot node, e.g. ./bin/scalarisctl boot stop)
scalarisctl is supposed to return with no output. To get an interactive shell,
call it like this:
./bin/scalarisctl -i boot start
Nonetheless, we currently only support nodes with FQDNs, because problems arise
when a node is known by different names, e.g. locally the node is
boot@localhost and remotely it is boot@xx.zz. Connections from other nodes via
distributed Erlang only work if the two nodes share the same information about
themselves, i.e. each node is known by the same name on both sides.
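A quick way to see whether a machine would run into this is to check the name it reports for itself. The following is only a rough sketch: the helper name is_fqdn is made up for illustration, and "contains a dot and is not a localhost name" is a heuristic, not the check Scalaris itself performs.

```shell
#!/bin/sh
# Hypothetical helper (not part of scalarisctl): succeed only if the given
# host name looks fully qualified. Heuristic: it contains a dot and is not
# a localhost name.
is_fqdn() {
    case "$1" in
        localhost|localhost.*) return 1 ;;  # loopback names differ from the remote view
        *.*) return 0 ;;                    # contains a dot: treat as fully qualified
        *) return 1 ;;                      # bare short name
    esac
}

# Ask the system for its long name; fall back to the plain host name.
host=$(hostname -f 2>/dev/null || hostname)
if is_fqdn "$host"; then
    echo "ok: $host looks like an FQDN"
else
    echo "warning: $host is not fully qualified; check /etc/hosts"
fi
```

If the warning appears, the node would register under a name that remote nodes cannot use to reach it.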
Original comment by nico.kru...@googlemail.com
on 18 Aug 2010 at 2:48
It would be very, very helpful if 'scalarisctl boot start' gave some feedback,
even when there are no problems. Simply printing something like
"Server 'boot' started successfully" would help enormously. And when there is a
problem, including a server that is already running, it would help just as much
if the software said so.
Note that I do not want an interactive shell, I simply want some indication
that my command has succeeded or failed. If it is important in some cases to
have an absolutely silent startup, I suggest you add support for a '-q' flag
that switches off the messages.
Regarding the FQDN issue, to my amazement there actually was a problem on my
Ubuntu machine. I've managed to fix it by explicitly setting a domain name in
/etc/hosts, but it would be very helpful if the Scalaris FAQ explained the
issues much more clearly than it does now, and perhaps even suggested some fixes.
It would also help if the scalarisctl checkinstallation command were a bit more
talkative. I still have no idea what the magic incantation 'erl
-noinput -name boot -eval "io:format(\"~p~n\", [node()]), halt()."' is supposed
to do, but perhaps the checkinstallation command could run it by default, and
give a clearer diagnostic message instead of the (for me) totally unreadable
output I quoted above.
Original comment by Kees.van...@gmail.com
on 19 Aug 2010 at 11:10
Unfortunately, Erlang doesn't provide us with a proper exit code when we execute
it in "detached" mode, i.e. non-interactively. Also, even the existence of a
running Scalaris node (boot, or ordinary node) does not tell you whether its
processes are actually working: an exception could be killing them, after which
they would be restarted by one of the supervisors.
The only safe ways to check whether a node is up and running are to check the
log files (which I have recently improved in rev1013), to run an interactive
shell (you will see the same output as in the log file), or to check the boot
node's web interface, but even there diagnosing such rogue nodes is not easy.
Original comment by nico.kru...@googlemail.com
on 19 Aug 2010 at 5:10
You are right, the following output is not helping at all:
However, when I do that I get:
---------------------
reeuwijk@babylon:~/lab/scalaris-front$ erl -noinput -name boot -eval "io:format(\"~p~n\", [node()]), halt()."
{error_logger,{{2010,8,18},{16,25,37}},"Protocol: ~p: register error: ~p~n",["inet_tcp",{{badmatch,{error,duplicate_name}},[{inet_tcp_dist,listen,1},{net_kernel,start_protos,4},{net_kernel,start_protos,3},{net_kernel,init_node,2},{net_kernel,init,1},{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}]}
{error_logger,{{2010,8,18},{16,25,37}},crash_report,[[{initial_call,{net_kernel,init,['Argument__1']}},{pid,<0.20.0>},{registered_name,[]},{error_info,{exit,{error,badarg},[{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[net_sup,kernel_sup,<0.9.0>]},{messages,[]},{links,[#Port<0.64>,<0.17.0>]},{dictionary,[{longnames,true}]},{trap_exit,true},{status,running},{heap_size,377},{stack_size,24},{reductions,453}],[]]}
{error_logger,{{2010,8,18},{16,25,37}},supervisor_report,[{supervisor,{local,net_sup}},{errorContext,start_error},{reason,{'EXIT',nodistribution}},{offender,[{pid,undefined},{name,net_kernel},{mfa,{net_kernel,start_link,[[boot,longnames]]}},{restart_type,permanent},{shutdown,2000},{child_type,worker}]}]}
{error_logger,{{2010,8,18},{16,25,37}},supervisor_report,[{supervisor,{local,kernel_sup}},{errorContext,start_error},{reason,shutdown},{offender,[{pid,undefined},{name,net_sup},{mfa,{erl_distribution,start_link,[]}},{restart_type,permanent},{shutdown,infinity},{child_type,supervisor}]}]}
{error_logger,{{2010,8,18},{16,25,37}},std_info,[{application,kernel},{exited,{shutdown,{kernel,start,[normal,[]]}}},{type,permanent}]}
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}}"}
Crash dump was written to: erl_crash.dump
Kernel pid terminated (application_controller) ({application_start_failure,kernel,{shutdown,{kernel,start,[normal,[]]}}})
---------------------
However, this output is created even before any code of ours is run. The output
comes from the Erlang runtime environment. I will add a test to
checkinstallation to see whether you are already running a Scalaris node.
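Until such a test exists, the duplicate_name case can be detected by hand by asking the Erlang port mapper which node names are registered. The sketch below is not the check that went into checkinstallation. The helper name node_registered is hypothetical, and the sketch assumes that epmd -names lists each node as "name <node> at port <port>".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the `epmd -names` listing read from stdin
# contains a node with the given short name.
# Assumption: each registered node appears as "name <node> at port <port>".
node_registered() {
    grep -q "^name $1 at port "
}

# If epmd is not running (or not installed), the listing is empty and the
# else branch is taken.
if epmd -names 2>/dev/null | node_registered boot; then
    echo "A node named 'boot' is already registered with epmd; stop it or pick another name."
else
    echo "No node named 'boot' is registered with epmd."
fi
```

A registered name only shows that the node's Erlang VM is alive, not that the Scalaris processes inside it are healthy.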
Original comment by schu...@gmail.com
on 20 Aug 2010 at 7:31
With the client wrapper script java-api/scalaris using an Erlang-provided node
name, this should be fixed for good, at least as long as there is an erl
executable on the same node. In all other setups, the Java API should not (and
cannot) connect to localhost, since a Scalaris node cannot start without Erlang.
I also adapted the FAQs a bit.
See changes in r1078 - r1083.
Original comment by nico.kru...@googlemail.com
on 1 Sep 2010 at 3:30
Original issue reported on code.google.com by
Kees.van...@gmail.com
on 18 Aug 2010 at 2:34