Closed sinban04 closed 1 year ago
I have no idea why the error above occurred, but I suppose the service manager port collides.
user@ubuntu-22.04:~/injung/riak-riak-3.2.0$ sudo ./rel/riak2/bin/riak console
Exec: /home/user/injung/riak-riak-3.2.0/rel/riak2/erts-13.2.2.2/bin/erlexec -boot /home/user/injung/riak-riak-3.2.0/rel/riak2/releases/3.2.0/start -mode embedded -boot_var SYSTEM_LIB_DIR /home/user/injung/riak-riak-3.2.0/rel/riak2/lib -config /home/user/injung/riak-riak-3.2.0/rel/riak2/sys.config -args_file /home/user/injung/riak-riak-3.2.0/rel/riak2/vm.args -- console
Root: /home/user/injung/riak-riak-3.2.0/rel/riak2
/home/user/injung/riak-riak-3.2.0/rel/riak2
!!!!
!!!! WARNING: ulimit -n is 1024; 65536 is the recommended minimum.
!!!!
Erlang/OTP 25 [erts-13.2.2.2] [source] [64-bit] [smp:40:40] [ds:40:40:10] [async-threads:64] [jit:ns]
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_repl,{{shutdown,{failed_to_start_child,riak_core_cluster_mgr_sup,{shutdown,{failed_to_start_child,riak_core_service_mgr,{{badmatch,{error,eaddrinuse}},[{riak_core_service_mgr,start_dispatcher,3,[{file,\"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl\"},{line,513}]},{riak_core_service_mgr,init,1,[{file,\"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl\"},{line,174}]},{gen_server,init_it,2,[{file,\"gen_server.erl\"},{line,851}]},{gen_server,init_it,6,[{file,\"gen_server.erl\"},{line,814}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,240}]}]}}}}},{riak_repl_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,riak_repl,{{shutdown,{failed_to_start_child,riak_core_cluster_mgr_sup,{shutdown,{failed_to_start_child,riak_core_service_mgr,{{badmatch,{error,eaddrinuse}},[{riak_core_service_mgr,start_dispatcher,3,[{file,"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl"},{line,513}]},{riak_core_service_mgr,init,1,[{file,"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl"},{line,174}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,851}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,814}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}}},{riak_repl_app,start,[normal,[]]}}})
mdc.cluster_manager = 127.0.0.1:11014
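An eaddrinuse failure means the address the process tried to bind was already taken. One way to check a candidate port before starting a node is to try binding it yourself. The following is a minimal sketch, not part of Riak; the function name port_in_use is made up for illustration.

```python
import errno
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if binding a TCP socket to host:port fails with EADDRINUSE.

    Hypothetical helper, not a Riak tool. Any other bind error
    (for example a permission error on a low port) is re-raised.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False
```

For example, port_in_use("127.0.0.1", 11014) tests the cluster manager address from the line above. Run it while the first node is up and before starting the second.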
What error do you get? I see in your previous comment you had ulimit issues. There is guidance on fixing this in the docs.
@martinsumner Thank you for your reply :) The ulimit issue was just a warning, so it's not a big deal. What I'm running into is this:
When I start the Riak nodes one after another, the second one dies with these logs:
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_repl,{{shutdown,{failed_to_start_child,riak_core_cluster_mgr_sup,{shutdown,{failed_to_start_child,riak_core_service_mgr,{{badmatch,{error,eaddrinuse}},[{riak_core_service_mgr,start_dispatcher,3,[{file,\"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl\"},{line,513}]},{riak_core_service_mgr,init,1,[{file,\"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl\"},{line,174}]},{gen_server,init_it,2,[{file,\"gen_server.erl\"},{line,851}]},{gen_server,init_it,6,[{file,\"gen_server.erl\"},{line,814}]},{proc_lib,init_p_do_apply,3,[{file,\"proc_lib.erl\"},{line,240}]}]}}}}},{riak_repl_app,start,[normal,[]]}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,riak_repl,{{shutdown,{failed_to_start_child,riak_core_cluster_mgr_sup,{shutdown,{failed_to_start_child,riak_core_service_mgr,{{badmatch,{error,eaddrinuse}},[{riak_core_service_mgr,start_dispatcher,3,[{file,"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl"},{line,513}]},{riak_core_service_mgr,init,1,[{file,"/home/user/injung/riak-riak-3.2.0/_build/default/lib/riak_repl/src/riak_core_service_mgr.erl"},{line,174}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,851}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,814}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,240}]}]}}}}},{riak_repl_app,start,[normal,[]]}}})
I followed the guide "Running Multiple Nodes on One Host" and copied rel/ to rel1, rel2, rel3.
After changing the port settings, I tried to run them one by one:
$ sudo ./rel1/riak/bin/riak console
$ sudo ./rel2/riak/bin/riak console
$ sudo ./rel3/riak/bin/riak console
When I run rel2, it returns the kernel error.
What else should I change besides the settings in the guide? The guide says:
Set your handoff port and your Protocol Buffers or HTTP port (depending on which interface you are using) to different values on each node. For example:
# For Protocol Buffers:
listener.protobuf.internal = 127.0.0.1:8187
# For HTTP:
listener.http.internal = 127.0.0.1:8198
# For either interface:
handoff.port = 8199
The eaddrinuse error seems to indicate there is something else which needs a new port. I'm busy at the moment, but will try and look at what this might be later. I suspect the mdc.cluster_manager (which is part of the riak_repl application which is complaining in the log) needs a different port - have you changed that as well?
@martinsumner
I'm busy at the moment, but will try and look at what this might be later.
Oh sure. It's not urgent, so take your time. I saw comments in another GitHub issue saying that this open-source project is not maintained by Riak anymore and is maintained by a few developers.
At first I changed the nodename and the three port settings:
nodename = riak1@127.0.0.1
...
# For Protocol Buffers:
listener.protobuf.internal = 127.0.0.1:8187
# For HTTP:
listener.http.internal = 127.0.0.1:8198
# For either interface:
handoff.port = 8199
Then I tried changing mdc.cluster_manager too (https://github.com/basho/riak/issues/1142#issuecomment-1677251576), but I still have no idea what is wrong.
BTW, when I vimdiff dev/dev1/riak/etc/riak.conf and dev/dev2/riak/etc/riak.conf,
nodename = dev1@127.0.0.1
...
listener.protobuf.internal = 127.0.0.1:10017
...
listener.http.internal = 127.0.0.1:10018
...
mdc.cluster_manager = 127.0.0.1:10016
there were 4 changes
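The dev1 values shown above suggest a regular pattern: each node gets its own block of ports. Below is a rough sketch of generating such per-node settings. The function names, the stride of 10 between nodes, and the handoff.port offset are assumptions chosen only so that node 1 reproduces the dev1 values quoted here; this is not how the Riak build itself does it.

```python
def node_settings(index: int, base_port: int = 10000, stride: int = 10) -> dict:
    """Build per-node riak.conf values for a 1-based node index.

    Hypothetical scheme: node N owns ports starting at base_port + N * stride.
    With the defaults, node 1 gets 10016/10017/10018 as in the dev1 diff.
    The handoff.port offset (+9) is an assumption; it is not in the diff.
    """
    first = base_port + index * stride
    return {
        "nodename": f"dev{index}@127.0.0.1",
        "mdc.cluster_manager": f"127.0.0.1:{first + 6}",
        "listener.protobuf.internal": f"127.0.0.1:{first + 7}",
        "listener.http.internal": f"127.0.0.1:{first + 8}",
        "handoff.port": str(first + 9),
    }


def render(settings: dict) -> str:
    """Format settings as 'key = value' lines for pasting into riak.conf."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())
```

Calling render(node_settings(2)) would produce the lines for a second node, with every port shifted by the stride so that nothing overlaps with node 1.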
Double-check that you don't have an advanced.config file in etc/. It may be that the setting of cluster_mgr is being overridden there ... I can't remember if we still produce the advanced.config in 3.2. If you do, just remove advanced.config and try again.
@martinsumner
Thank you for your advice.
I followed it and removed advanced.config from the etc directory.
The eaddrinuse error seems to have disappeared, but now there is another problem with some schema file:
{"Kernel pid terminated",application_controller,"{application_start_failure,riak_core,{bad_return,{{riak_core_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,schema_files_not_found}},[{riak_core_cli_registry,load_schema,0,[{file,\"/home/cc/riak320/riak-riak-3.2.0/_build/default/lib/riak_core/src/riak_core_cli_registry.erl\"},{line,53}]},{riak_core_app,start,2,[{file,\"/home/cc/riak320/riak-riak-3.2.0/_build/default/lib/riak_core/src/riak_core_app.erl\"},{line,106}]},{application_master,start_it_old,4,[{file,\"application_master.erl\"},{line,293}]}]}}}}}"}
Kernel pid terminated (application_controller) ({application_start_failure,riak_core,{bad_return,{{riak_core_app,start,[normal,[]]},{'EXIT',{{badmatch,{error,schema_files_not_found}},[{riak_core_cli_registry,load_schema,0,[{file,"/home/cc/riak320/riak-riak-3.2.0/_build/default/lib/riak_core/src/riak_core_cli_registry.erl"},{line,53}]},{riak_core_app,start,2,[{file,"/home/cc/riak320/riak-riak-3.2.0/_build/default/lib/riak_core/src/riak_core_app.erl"},{line,106}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,293}]}]}}}}})
FYI, there is a config file at rel/vars.config.
Could it be related?
%% -*- mode: erlang;erlang-indent-level: 4;indent-tabs-mode: nil -*-
%% ex: ft=erlang ts=4 sw=4 et
{rel_vsn, "{{release_version}}"}.
%% Platform-specific installation paths
{platform_base_dir, "${RIAK_PATH:-$RELEASE_ROOT_DIR}"}.
{platform_bin_dir, "./bin"}.
{platform_data_dir, "./data"}.
{platform_etc_dir, "./etc"}.
{platform_lib_dir, "./lib"}.
{platform_log_dir, "./log"}.
{platform_gen_dir, "."}.
{platform_patch_dir, "./lib/patches"}.
%%
%% etc/app.config
%%
{web_ip, "127.0.0.1"}.
{web_port, 8098}.
{cluster_manager_ip, "127.0.0.1"}.
{cluster_manager_port, 9080}.
{handoff_port, 8099}.
{handoff_ip, "0.0.0.0"}.
{pb_ip, "127.0.0.1"}.
{pb_port, 8087}.
{storage_backend, "bitcask"}.
{sasl_error_log, "{{platform_log_dir}}/sasl-error.log"}.
{sasl_log_dir, "{{platform_log_dir}}/sasl"}.
{repl_data_root, "{{platform_data_dir}}/riak_repl"}.
{logger_level, info}.
%%
%% etc/vm.args
%%
{node, "riak@127.0.0.1"}.
{crash_dump, "{{platform_log_dir}}/erl_crash.dump"}.
%%
%% bin/riak
%%
%% relocatable releases don't call the launcher script, because
%% launcher script requires a riak user to exist.
%%{pid_dir, "$PLATFORM_BASE_DIR/var/run/riak"}.
%%
%% cuttlefish
%%
{cuttlefish, "on"}.
{cuttlefish_conf, "riak.conf"}.
%% {yz_solr_port, 8093}.
%% {yz_solr_jmx_port, 8985}.
Perhaps removing the advanced.config file altogether was not the best hack. Try and remove the specific line in the riak_core stanza that refers to the cluster manager:
i.e.
{riak_core,
[
%% The cluster manager will listen for connections from remote
%% clusters on this ip and port. Every node runs one cluster
%% manager, but only the cluster manager running on the
%% cluster_leader will service requests. This can change as nodes
%% enter and leave the cluster.
%% {cluster_mgr, {"127.0.0.1", 10016 } },
{schema_dirs, ["./share/schema"]}
]},
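To see whether an advanced.config still carries an active cluster_mgr entry (as opposed to the commented-out one shown above), the file can be scanned for uncommented matches. This is a minimal sketch with a hypothetical function name; it treats everything after the first % on a line as an Erlang comment, which is a simplification that would misread a % inside a string.

```python
import re

# Matches {cluster_mgr, {"<ip>", <port>}} with flexible whitespace.
CLUSTER_MGR = re.compile(
    r'\{\s*cluster_mgr\s*,\s*\{\s*"([^"]+)"\s*,\s*(\d+)\s*\}\s*\}'
)


def active_cluster_mgr(text: str) -> list:
    """Return (ip, port) pairs for cluster_mgr entries that are not commented out.

    Hypothetical helper: text is the content of an advanced.config file.
    """
    found = []
    for line in text.splitlines():
        code = line.split("%", 1)[0]  # drop the Erlang comment part, if any
        match = CLUSTER_MGR.search(code)
        if match:
            found.append((match.group(1), int(match.group(2))))
    return found
```

An empty result for the stanza above means the cluster manager address comes from elsewhere; a non-empty result shows the address that the file sets.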
@martinsumner
Ah, thanks to your advice the "schema" error disappeared, but "eaddrinuse" comes up again on the second node (I changed the cluster_mgr port 10016 to my custom port number).
It seems something else is causing this problem.
It seems you're busy and short of hands, so this can wait.
It's okay if it's not solvable for now; I can do the test with dev.
I just wanted to check whether I did something wrong by misunderstanding the guide.
In my opinion, some config that the Riak binaries reference is colliding, such as advanced.config, as you said.
Not sure what it is, but we should figure it out.
I'll dig into it as well when I'm available. Thank you for your reply :)
BTW, is there a tool for following the call path (or stack trace) of a running Riak?
I saw the log/console.log files, but it would be better if I could follow the call path.
@martinsumner
Good news!
It worked after I added the handoff.port field as well:
handoff.port = 15116
To wrap up, change the values below in etc/riak.conf:
nodename = riak1@127.0.0.1
...
# For Protocol Buffers:
listener.protobuf.internal = 127.0.0.1:8187
# For HTTP:
listener.http.internal = 127.0.0.1:8198
# For either interface:
handoff.port = 8199
and the cluster_mgr port in etc/advanced.config:
{riak_core,
[
%% The cluster manager will listen for connections from remote
%% clusters on this ip and port. Every node runs one cluster
%% manager, but only the cluster manager running on the
%% cluster_leader will service requests. This can change as nodes
%% enter and leave the cluster.
%% {cluster_mgr, {"127.0.0.1", 10016 } },
{schema_dirs, ["./share/schema"]}
]},
Thank you martin 👍
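The wrap-up above amounts to a checklist of settings that must differ between nodes on one host. A small sketch of checking several riak.conf files against that checklist follows. The key list is taken from this thread only and may not be complete; the function names are hypothetical; and the check does not look at advanced.config, which can override these values.

```python
from collections import defaultdict

# Settings from this thread that had to differ between nodes on one host.
PER_NODE_KEYS = (
    "nodename",
    "listener.protobuf.internal",
    "listener.http.internal",
    "handoff.port",
    "mdc.cluster_manager",
)
UNSET = "<unset>"  # marker for a key that is absent, so a default would apply


def parse_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_collisions(confs: dict) -> dict:
    """Map each (key, value) shared by several nodes to the labels using it.

    confs maps a label such as 'rel1' to the text of that node's riak.conf.
    A key missing from two files counts as a collision on UNSET.
    """
    users = defaultdict(list)
    for label, text in confs.items():
        settings = parse_conf(text)
        for key in PER_NODE_KEYS:
            users[(key, settings.get(key, UNSET))].append(label)
    return {pair: labels for pair, labels in users.items() if len(labels) > 1}
```

Treating a missing key as a collision is deliberate: in this thread handoff.port had no value in any copy, and the second node only started once it was added.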
@sinban04
Thank you for bringing this up. The docs for this section need to be rewritten. For reference, if you instead run make devrel then it should spawn you 8 or 10 fully functional nodes (depending on Riak version) under the dev directory. In doing this, all nodes should have their ports adjusted automatically to avoid port collisions. The handoff port, I am not sure about, as I haven't done this for a while, so it may need to be added manually, but all the others should work without the need to change any settings.
@Bob-The-Marauder
Hello Nicholas, thank you for the tip.
Fortunately I had already discovered the "dev version build" in README.md (thanks to your documentation),
so I've already built a cluster with dev/dev<N>.
I just wanted to make sure that it also works when following your guide (v3.2.0).
Thank you for your contribution to this project :)
@sinban04 Thank you very much for the confirmation. Did you need to do anything with the handoff port or was that handled automatically as well?
@Bob-The-Marauder (cc. @martinsumner )
Ah, I set the "mdc.cluster_manager port" in advanced.config (in /rel/riak/etc),
set "nodename", "listener.protobuf.internal", and "listener.http.internal" in riak.conf (in /rel/riak/etc),
and added "handoff.port" to riak.conf (no value existed before).
There was no need to change mdc.cluster_manager = 127.0.0.1:9080 in riak.conf.
It seems advanced.config overrides the config values in riak.conf (as Martin hinted).
Then I started each node one by one and joined them into one cluster (ring) following the guide (I used console to see any error logs):
$ sudo ./rel7/riak/bin/riak console
$ sudo ./rel8/riak/bin/riak console
$ sudo ./rel9/riak/bin/riak console
$ sudo su
# ./rel8/riak/bin/riak admin cluster join rel7@127.0.0.1
# ./rel9/riak/bin/riak admin cluster join rel7@127.0.0.1
# ./rel8/riak/bin/riak admin cluster plan
# ./rel8/riak/bin/riak admin cluster commit
and it's working well
# ./rel7/riak/bin/riak admin member-status
================================= Membership ==================================
Status Ring Pending Node
-------------------------------------------------------------------------------
valid 67.2% 34.4% riak7@127.0.0.1
valid 20.3% 32.8% riak8@127.0.0.1
valid 12.5% 32.8% riak9@127.0.0.1
-------------------------------------------------------------------------------
Valid:3 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
ok
Thank you for the feedback. We have made updates to our internal copy of the documentation already and this will be made live in the next public facing update to the docs we make.
Hello, I'm trying to build multiple nodes on a single host following this link: https://www.tiot.jp/riak-docs/riak/kv/3.2.0/using/running-a-cluster/#running-multiple-nodes-on-one-host
I built riak from source following the guide using OTP 25 (https://github.com/basho/riak/issues/1131, https://github.com/basho/riak/issues/1136), and it works well when I run the first node with the binary.
But when I try to run a second one, it returns an error.
It seems that some configuration collides, so the threads die.
I changed the three configuration values following the guide (https://www.tiot.jp/riak-docs/riak/kv/3.2.0/using/running-a-cluster/#running-multiple-nodes-on-one-host): at first I changed the first two, and then added handoff.port.
Is there anything more I should change? Could you define the list of settings that must differ from node to node? I saw an issue (https://github.com/basho/riak/issues/863), but it's still not solved. Thank you.