bitwalker / distillery

127.0.0.1 is illegal hostname for Erlang nodes #338

chulkilee closed 6 years ago

chulkilee commented 6 years ago

Currently, 127.0.0.1 is used as the default host part of the Erlang long node name (ref).

However, it seems that Erlang does not accept an IP address; instead, it expects a "hostname" that can be resolved via DNS. The Erlang docs say:

host is the full host name if long names are used

As a result, you cannot connect to a node using 127.0.0.1 as the host name. See the following output:

iex --remsh foo@127.0.0.1 --sname iex
2017-10-06 11:42:22 ** System NOT running to use fully qualified hostnames **~n** Hostname ~ts is illegal **~n
    "127.0.0.1"
Erlang/OTP 20 [erts-9.1.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

=ERROR REPORT==== 6-Oct-2017::04:42:22 ===
** System NOT running to use fully qualified hostnames **
** Hostname 127.0.0.1 is illegal **
Could not contact remote node foo@127.0.0.1, reason: :nodedown. Aborting...

I understand that a custom vm.args allows users to set the value as needed. However, it would be nice if the node host worked "out of the box", for example when running the release in a local dev environment.

Option 1: localhost

Then we can always use localhost as the remote node's host:

iex --remsh foo@localhost --sname iex

This option assumes that the name localhost resolves to 127.0.0.1.

However, I believe expecting localhost to be 127.0.0.1 is pretty reasonable :)

Option 2: use sname

# instead of this
# -name <%= release_name %>@127.0.0.1

# use shortname
-sname <%= release_name %>

In this case, to connect to the local node without customizing vm.args, you have to figure out what the "short" host name would be. For example, you can do this:

iex --remsh foo@`hostname` --sname iex

bitwalker commented 6 years ago

127.0.0.1 is absolutely allowed as a hostname (I have used it extensively myself, and running under Kubernetes all of the node names use IPs to reach individual pods when clustering). The problem in your situation is you were starting your node with --sname, which is why you see the error ** System NOT running to use fully qualified hostnames **. In your case, you simply needed to add --name iex@127.0.0.1 instead of --sname iex.
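To make the mismatch concrete, here is a minimal sketch of the rule behind that error: a node started with short names rejects any host part that contains a dot, so the flag has to match the target. The `naming_flag` helper is hypothetical (it is not part of iex or distillery), and it assumes that "contains a dot" is a good enough test for "needs long names".

```shell
# Hypothetical helper (not part of iex or distillery): choose the naming flag
# that matches the host part of the node you want to reach.
naming_flag() {
  host="${1#*@}"                     # strip everything up to and including "@"
  case "$host" in
    *.*) printf '%s\n' '--name'  ;;  # host contains a dot: IP address or FQDN
    *)   printf '%s\n' '--sname' ;;  # single-label host: short names
  esac
}

naming_flag foo@127.0.0.1   # prints --name
naming_flag foo@aielman     # prints --sname
```

For the command in the original report, this would lead to `iex --remsh foo@127.0.0.1 --name iex@127.0.0.1`, which is the fix described above.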

Long names need the hostname to be resolvable and reachable; in other words, it needs to be either an IP address or a valid DNS name. In the case of an IP address, it is already resolved, so it only needs to be reachable from the current host. In the case of a DNS name, it needs to be resolvable by any one of the DNS resolution mechanisms on the host (/etc/hosts, /etc/resolv.conf, etc.) to an IP address which is reachable from the current host.

chulkilee commented 6 years ago

Thanks for the quick response!

I finally got the reference: Erlang Reference Manual: 3 Concurrent Programming.

(Note: erl -sname assumes that all nodes are in the same IP domain and we can use only the first component of the IP address, if we want to use nodes in different domains we use -name instead, but then all IP address must be given in full.)

I thought sname would use the short host name of the local host, but it uses the short host name of the remote host if remsh is given. So it failed to derive a short host name from 127.0.0.1.

bitwalker commented 6 years ago

In general, I would only ever use --sname for nodes running on the same host. For everything else, --name is better: it is explicit about what you are connecting to and where you can be found, and it works pretty intuitively (i.e. --name foo@192.168.1.10 can be read as "foo at/on 192.168.1.10"). With --sname foo, it is not always clear what hostname the node will receive. On my Mac it gets foo@aielman, which is not the same as what other things see as my local hostname (usually aielman.local), whereas in a VM I may get foo@localhost in some cases or the machine name in others; it depends on what the hostname is set to, I guess. I never use --sname myself; I just don't see the benefit versus explicit long names.
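As a rough illustration of the foo@aielman versus aielman.local difference, the sketch below keeps only the first label of a host name. The `short_host` helper is hypothetical, and it only approximates what a short-named node might end up with; the name Erlang actually picks depends on how the host is configured, as noted above.

```shell
# Hypothetical helper: keep only the first label of a host name.
short_host() {
  printf '%s\n' "${1%%.*}"    # drop everything from the first dot onward
}

short_host aielman.local      # prints aielman
short_host "$(uname -n)"      # first label of this machine's network node name
```

If the result differs from the host part the running node reports, trust the node.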

thehunmonkgroup commented 6 years ago

Here's another stumbling point I figured out when debugging this issue. Turns out I had an old .erlang.cookie file sitting around in my home dir that had 644 perms. That resulted in this error in the nodetool escript:

/Users/hunmonk/Desktop/distillery-demo/_build/prod/rel/clock/erts-9.2/bin/escript /Users/hunmonk/Desktop/distillery-demo/_build/prod/rel/clock/bin/nodetool -name clock@localhost -setcookie supersecret ping

=ERROR REPORT==== 21-Feb-2018::17:49:12 ===
Cookie file /Users/hunmonk/.erlang.cookie must be accessible by owner onlyescript: exception error: no match of right hand side value 
                 {error,
                     {{shutdown,
                          {failed_to_start_child,auth,
                              {"Cookie file /Users/hunmonk/.erlang.cookie must be accessible by owner only",
                               [{auth,init_cookie,0,
                                    [{file,"auth.erl"},{line,286}]},
                                {auth,init,1,[{file,"auth.erl"},{line,140}]},
                                {gen_server,init_it,2,
                                    [{file,"gen_server.erl"},{line,365}]},
                                {gen_server,init_it,6,
                                    [{file,"gen_server.erl"},{line,333}]},
                                {proc_lib,init_p_do_apply,3,
                                    [{file,"proc_lib.erl"},{line,247}]}]}}},
                      {child,undefined,net_sup_dynamic,
                          {erl_distribution,start_link,
                              [[clock_maint_55241@localhost,longnames],false]},
                          permanent,1000,supervisor,
                          [erl_distribution]}}}

Deleting that file resolved the issue.
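If you would rather keep the existing cookie value than delete the file, tightening its permissions should also satisfy the "accessible by owner only" check. A minimal sketch, assuming the cookie lives at the default path shown in the error; the `secure_cookie` helper name is hypothetical:

```shell
# Hypothetical helper: make a cookie file readable by its owner only.
# Falls back to ~/.erlang.cookie when no path is given.
secure_cookie() {
  cookie="${1:-$HOME/.erlang.cookie}"
  if [ ! -f "$cookie" ]; then
    echo "no cookie file at $cookie" >&2
    return 1
  fi
  chmod 400 "$cookie"         # owner read-only, no access for group or others
}
```

Running `secure_cookie` with no arguments targets `$HOME/.erlang.cookie`; `ls -l ~/.erlang.cookie` should then show `-r--------`.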