hauleth / consulate

Erlang port mapper module that uses Consul instead of EPMD
Apache License 2.0

problems with inets on startup #2

Open ruslandoga opened 3 years ago

ruslandoga commented 3 years ago

Hi!

I've followed the steps outlined in the elixirforum post, but unfortunately I couldn't get the app to start in a distributed setting with consulate.


If I use a command similar to the one on the forum,

ERL_FLAGS="-epmd_module consulate -start_epmd false" iex --sname foo -S mix phx.server

I get an undefined function error for consulate:start_link/0:

... snip ...
{"Kernel pid terminated",application_controller,"{application_start_failure,kernel,{{shutdown,{failed_to_start_child,net_sup,{shutdown,{failed_to_start_child,consulate,{'EXIT',{undef,[{consulate,start_link,[],[]}
... snip ...

even though I've already run mix deps.compile and _build/dev/lib/consulate/ebin is available.


When I also pass the -pa option with the path to consulate's ebin directory, the error changes:

ERL_FLAGS="-epmd_module consulate -start_epmd false -pa _build/dev/lib/consulate/ebin" iex --sname foo -S mix phx.server
Protocol 'inet_tcp': not supported

This appears to come from https://github.com/erlang/otp/blob/81c6d94625a04f6550e77a068489ca6378cabfee/lib/kernel/src/net_kernel.erl#L1811, which suggests a problem with loading the inet_tcp / inet_tcp_dist modules.


I wonder if you could give any hints / pointers for where I could look to resolve these problems?

Repro: https://github.com/ruslandoga/try-consulate Relevant commit: https://github.com/ruslandoga/try-consulate/commit/dc517893425456e0e7277c06b0aee80cdbe74cc5

Versions used:

Erlang/OTP 24 [erts-12.0.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [dtrace]

Elixir 1.12.2 (compiled with Erlang/OTP 24)

ruslandoga commented 3 years ago

When I start a node outside a distributed setting, I'm able to interact with consulate without problems:

> iex -S mix
Erlang/OTP 24 [erts-12.0.3] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [dtrace]

Compiling 1 file (.ex)
Interactive Elixir (1.12.2) - press Ctrl+C to exit (type h() ENTER for help)

iex(1)> :consulate.start_link
[debug] consulate:start_link()
{:ok, #PID<0.388.0>}

iex(2)> :consulate_client.list_local_services
[debug] consulate:get('/v1/agent/services') -> 'http://127.0.0.1:8500/v1/agent/services?'
{:ok,
 %{
   "_nomad-client-s6pxrvnmilluqd6pbwhivrfgoxn6y2bp" => %{
     "Address" => "127.0.0.1",
     "Datacenter" => "dc1",
# ..etc

iex(3)> :consulate.register_node("asdf", 1234)
[debug] consulate:register_node("asdf", 1234)
[debug] consulate:put('/v1/agent/service/register', %{
  Check: %{
    DeregisterCriticalServiceAfter: "60s",
    Interval: "10s",
    Name: "Check Erlang Distribuition Port",
    Status: "passing",
    TCP: "localhost:1234"
  },
  EnableTagOverride: false,
  ID: "asdf",
  Meta: %{},
  Name: "erlang-service",
  Port: 1234
}) -> 'http://127.0.0.1:8500/v1/agent/service/register?' @ "{\"Check\":{\"DeregisterCriticalServiceAfter\":\"60s\",\"Interval\":\"10s\",\"Name\":\"Check Erlang Distribuition Port\",\"Status\":\"passing\",\"TCP\":\"localhost:1234\"},\"EnableTagOverride\":false,\"ID\":\"asdf\",\"Meta\":{},\"Name\":\"erlang-service\",\"Port\":1234}"
{:ok, 1}

ruslandoga commented 3 years ago

When running the app with mix and the -init_debug option, I see that consulate is indeed not loaded:

> ERL_FLAGS="-epmd_module consulate -start_epmd false -init_debug" iex --name foo -S mix phx.server

{progress,preloaded}
{progress,kernel_load_completed}
{progress,modules_loaded}
{start,heart}
{start,logger}
{start,application_controller}
{progress,init_kernel_started}
{apply,{application,load,[{application,stdlib,[{description,"ERTS  CXC 138 10"},{vsn,"3.15.1"},{id,[]},{modules,[array,base64,beam_lib,binary,c,calendar,dets,dets_server,dets_sup,dets_utils,dets_v9,dict,digraph,digraph_utils,edlin,edlin_expand,epp,eval_bits,erl_abstract_code,erl_anno,erl_bits,erl_compile,erl_error,erl_eval,erl_expand_records,erl_internal,erl_lint,erl_parse,erl_posix_msg,erl_pp,erl_scan,erl_stdlib_errors,erl_tar,error_logger_file_h,error_logger_tty_h,escript,ets,file_sorter,filelib,filename,gb_trees,gb_sets,gen,gen_event,gen_fsm,gen_server,gen_statem,io,io_lib,io_lib_format,io_lib_format_ryu_table,io_lib_fread,io_lib_pretty,lists,log_mf_h,maps,math,ms_transform,orddict,ordsets,otp_internal,pool,proc_lib,proplists,qlc,qlc_pt,queue,rand,random,re,sets,shell,shell_default,shell_docs,slave,sofs,string,supervisor,supervisor_bridge,sys,timer,unicode,unicode_util,uri_string,win32reg,zip]},{registered,[timer_server,rsh_starter,take_over_monitor,pool_master,dets]},{applications,[kernel]},{optional_applications,[]},{included_applications,[]},{env,[]},{maxT,infinity},{maxP,infinity}]}]}}
{progress,applications_loaded}
{apply,{application,start_boot,[kernel,permanent]}}
{apply,{application,start_boot,[stdlib,permanent]}}
=SUPERVISOR REPORT==== 18-Jul-2021::18:58:17.662650 ===
    supervisor: {local,net_sup}
    errorContext: start_error
    reason: {'EXIT',
                {undef,
                    [{consulate,start_link,[],[]},
... snip ...

When running in a release, I see that some of the required modules (inets, httpc, and consulate) are loaded, but I can't find inet_tcp in the list of loaded modules (or maybe it is always loaded with kernel?).

ruslandoga commented 3 years ago

If I add :inets to the list of :extra_applications in mix.exs, I get a different error, also at the {apply,{application,start_boot,[kernel,permanent]}} stage:

Protocol 'inet_tcp': register/listen error: {noproc,{gen_server,call,[httpc_manager,{request,{request,undefined,<0.821.0>,0,http,{"127.0.0.1",8500},"/v1/agent/service/c",[],get,{http_request_h,undefined,"keep-alive",undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,"127.0.0.1:8500",undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,undefined,[],undefined,undefined,undefined,undefined,"0",undefined,undefined,undefined,undefined,undefined,undefined,[]},{[],[]},{http_options,"HTTP/1.1",infinity,true,{essl,[]},undefined,false,infinity,false},"http://127.0.0.1:8500/v1/agent/service/c?",[],none,[],1626626575175,undefined,undefined,undefined,false}},infinity]}}

This seems to mean that httpc_manager is not running when we make the first request to consul.
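A minimal way to see the same dependency outside the failing boot sequence is sketched below. It assumes a plain `iex` session started without the `-epmd_module` flag and in which inets has not been started yet. It relies on the default httpc profile registering its manager process under the name `httpc_manager`, which is the name that appears in the noproc error above.

```elixir
# Run in a plain `iex` session (no -epmd_module flag).
# Assumption: inets has not been started yet in this session,
# so the first lookup is expected to return nil.
manager_before = Process.whereis(:httpc_manager)
IO.inspect(manager_before, label: "httpc_manager before inets")

# Starting inets brings up the default httpc profile and its manager.
{:ok, _started} = Application.ensure_all_started(:inets)

manager = Process.whereis(:httpc_manager)
IO.inspect(manager, label: "httpc_manager after inets")
```

If the first lookup returns nil and the second returns a pid, any httpc request made before inets is started would fail with the same noproc reason.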

hauleth commented 3 years ago

I will take a look. Thanks for the detailed report.

baggers-br commented 1 year ago

It seems the module is started during kernel startup, so applications like inets have not been started at that point. Calling application:ensure_all_started(inets) in init or even in register_node hangs indefinitely, presumably for this reason.
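One direction that might be worth testing is httpc's documented stand-alone mode, which starts an HTTP client process without starting the inets application. This is an assumption and has not been verified in this thread. The sketch below only shows the calls from a normal `iex` session; whether they work when invoked during kernel startup is untested. The profile name `:consulate_standalone` is hypothetical, and the URL is the Consul agent endpoint from the logs above.

```elixir
# Start an httpc client linked to the calling process, without
# calling Application.ensure_all_started(:inets).
# :consulate_standalone is a hypothetical profile name.
{:ok, client} =
  :inets.start(:httpc, [profile: :consulate_standalone], :stand_alone)

# Consul agent endpoint taken from the debug output earlier in the thread.
url = ~c"http://127.0.0.1:8500/v1/agent/services"

# A stand-alone client is addressed by pid rather than by profile name.
result = :httpc.request(:get, {url, []}, [timeout: 5_000], [], client)

case result do
  {:ok, {{_http_version, status, _reason}, _headers, _body}} ->
    IO.puts("Consul answered with HTTP #{status}")

  {:error, reason} ->
    IO.inspect(reason, label: "request failed")
end
```

A stand-alone client is linked to the process that starts it, so in consulate it would presumably have to be owned by the process that net_sup starts.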