bitwalker / distillery

Simplify deployments in Elixir with OTP releases!
MIT License

Some CLI commands break for application release with TLS encrypted distributed erlang node configuration #640

Open eliotk opened 5 years ago

eliotk commented 5 years ago

Steps to reproduce

-kernel proto_dist inet_tls
-kernel ssl_dist_opt server_verify verify_peer
-kernel ssl_dist_opt client_verify verify_peer
-kernel ssl_dist_opt client_certfile "/path/certfile.pem"
-kernel ssl_dist_opt server_certfile "/path/certfile.pem"
-kernel ssl_dist_opt client_cacertfile "/path/certfile.pem"
-kernel ssl_dist_opt server_fail_if_no_peer_cert true

Interestingly, ./bin/app_name attach works and initializes a usable iex session. Once that session is established, it's possible to verify that the clustering communication is working between two launched application nodes (confirming tls configuration is working correctly).

If you remove just the TLS configuration lines from vm.args, the app starts successfully and then those commands work again.

Perhaps it has something to do with how the CLI commands are run outside of the full application context?

I've also tried setting the same TLS options through the ELIXIR_ERL_OPTIONS env var, which distillery's elixir call uses for erlang runtime options when running the commands (https://github.com/bitwalker/distillery/blob/49d88194ad5f100239fa146bb2c649988f6b399f/priv/libexec/erts.sh#L169). I set that env variable like this, mirroring the vm.args TLS config:

ELIXIR_ERL_OPTIONS="-proto_dist inet_tls \
-ssl_dist_opt client_certfile /path/certfile.pem \
-ssl_dist_opt server_certfile /path/certfile.pem \
-ssl_dist_opt client_cacertfile /path/certfile.pem \
-ssl_dist_opt client_verify verify_peer \
-ssl_dist_opt server_verify verify_peer \
-ssl_dist_opt server_fail_if_no_peer_cert true"

export ELIXIR_ERL_OPTIONS

When calling ping with DEBUG_BOOT=true, you can see that those flags are properly passed into the erlang call:

++1551363621 ERL='-noshell -s elixir start_cli  -logger handle_sasl_reports false'
++1551363621 erl -proto_dist inet_tls -ssl_dist_opt client_certfile /home/app_name/shared/certs/app_name_client_certfile.pem -ssl_dist_opt server_certfile /home/app_name/shared/certs/app_name_server_certfile.pem -ssl_dist_opt client_cacertfile /home/app_name/shared/certs/app_name_cacert.pem -ssl_dist_opt client_verify verify_peer -ssl_dist_opt server_verify verify_peer -noshell -s elixir start_cli -logger handle_sasl_reports false -extra -e Mix.Releases.Runtime.Control.main --logger-sasl-reports false -- ping --name=app_name@host '--cookie=[cookie]'
+++1551363621 whereis_erts_bin
+++1551363621 '[' -z 10.2.1 ']'
+++1551363621 '[' -z '' ']'
+++1551363621 __erts_dir=/home/app_name/releases/20190222201327/erts-10.2.1
+++1551363621 '[' -d /home/app_name/releases/20190222201327/erts-10.2.1 ']'
+++1551363621 echo /home/app_name/releases/20190222201327/erts-10.2.1/bin
++1551363621 __bin=/home/app_name/releases/20190222201327/erts-10.2.1/bin
++1551363621 '[' -z /home/app_name/releases/20190222201327/erts-10.2.1/bin ']'
++1551363621 __erl=/home/app_name/releases/20190222201327/erts-10.2.1/bin/erl
++1551363621 __boot_provided=0
++1551363621 grep '\-boot '
++1551363621 echo -proto_dist inet_tls -ssl_dist_opt client_certfile /home/app_name/shared/certs/app_name_client_certfile.pem -ssl_dist_opt server_certfile /home/app_name/shared/certs/app_name_server_certfile.pem -ssl_dist_opt client_cacertfile /home/app_name/shared/certs/app_name_cacert.pem -ssl_dist_opt client_verify verify_peer -ssl_dist_opt server_verify verify_peer -noshell -s elixir start_cli -logger handle_sasl_reports false -extra -e Mix.Releases.Runtime.Control.main --logger-sasl-reports false -- ping --name=app_name@host '--cookie=[cookie]'
++1551363621 __erts_included=0
++1551363621 [[ /home/app_name/releases/20190222201327/erts-10.2.1/bin/erl =~ ^/home/app_name/releases/20190222201327 ]]
++1551363621 __erts_included=1
++1551363621 '[' 1 -eq 1 ']'
++1551363621 '[' 0 -eq 1 ']'
++1551363621 '[' 1 -eq 1 ']'
++1551363621 /home/app_name/releases/20190222201327/erts-10.2.1/bin/erl -boot_var ERTS_LIB_DIR /home/app_name/releases/20190222201327/lib -boot /home/app_name/releases/20190222201327/bin/start_clean -config /home/app_name/releases/20190222201327/var/sys.config -proto_dist inet_tls -ssl_dist_opt client_certfile /home/app_name/shared/certs/app_name_client_certfile.pem -ssl_dist_opt server_certfile /home/app_name/shared/certs/app_name_server_certfile.pem -ssl_dist_opt client_cacertfile /home/app_name/shared/certs/app_name_cacert.pem -ssl_dist_opt client_verify verify_peer -ssl_dist_opt server_verify verify_peer -noshell -s elixir start_cli -logger handle_sasl_reports false -extra -e Mix.Releases.Runtime.Control.main --logger-sasl-reports false -- ping --name=app_name@host '--cookie=[cookie]'
▸  Received 'pang' from app_name@host!
▸  Possible reasons for this include:
▸    - The cookie is mismatched between us and the target node
▸    - We cannot establish a remote connection to the node

Description of issue

2.0.12

OS: CentOS
Erlang: 10.2.3 (OTP 21.2.2)
Elixir: 1.8.0

use Mix.Config

# Configures the endpoint
config :app_name, app_nameWeb.Endpoint,
  url: [host: "localhost"],
  secret_key_base: "[secret]",
  render_errors: [view: app_nameWeb.ErrorView, accepts: ~w(html json)],
  pubsub: [name: app_name.PubSub, adapter: Phoenix.PubSub.PG2]

# Configures Elixir's Logger
config :logger, :console,
  format: "$time $metadata[$level] $message\n",
  metadata: [:request_id]

# Tell Logger to load the LoggerFileBackend backends
config :logger,
  backends: [{LoggerFileBackend, :file_log}, :console, {LoggerFileBackend, :logstash_log}]

config :logger, utc_log: true

config :logger,
  backends: [{LoggerFileBackend, :file_log}, :console, {LoggerFileBackend, :logstash_log}]

config :logger, :file_log,
  path: "log/app_name.log",
  format: "$time $metadata[$level] $message\n",
  metadata: [:request_id]

config :logger, :console,
  format: "$time $metadata[$level] $message\n",
  metadata: [:request_id]

config :logger, :logstash_log,
  level: :info,
  path: "log/logstash.log",
  format: {IoraLogging.Formatter, :format},
  metadata: [
    :request_id,
    :host,
    :method,
    :path,
    :status,
    :filtered_params,
    :state,
    :duration,
    :ip,
    :format,
    :controller,
    :action,
    :tags,
    :user_agent,
    :user,
    :access_token
  ]

# Use Jason for JSON parsing in Phoenix
config :phoenix, :json_library, Jason

config :phoenix, :filter_parameters, ["password", "access_token"]

# Import environment specific config. This must remain at the bottom
# of this file so it overrides the configuration defined above.
import_config "#{Mix.env()}.exs"

Portion of vm.args that breaks those commands:

-kernel proto_dist inet_tls
-kernel ssl_dist_opt server_verify verify_peer
-kernel ssl_dist_opt client_verify verify_peer
-kernel ssl_dist_opt client_certfile "/path/certfile.pem"
-kernel ssl_dist_opt server_certfile "/path/certfile.pem"
-kernel ssl_dist_opt client_cacertfile "/path/certfile.pem"
-kernel ssl_dist_opt server_fail_if_no_peer_cert true

Thanks for any and all thoughts and help w/ this!

bitwalker commented 5 years ago

Yeah, this is because many of the commands don't use the vm.args file directly: it applies only to the running node, not to them (which is why start/foreground/console work). The attach command only works because it bypasses networking entirely and connects via a domain socket. We would need to specifically handle the TLS config vars from vm.args and pass them as extra options to erl/erlexec.
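
To make the idea of "handling the TLS config vars from vm.args" a bit more concrete, here is a minimal shell sketch. It is not Distillery's implementation: the function name `tls_dist_flags` is hypothetical, and it assumes the TLS settings sit on their own lines in vm.args and mention `proto_dist` or `ssl_dist_opt`, as in the report above. It only collects those lines into one string that a wrapper script could append to the erl arguments.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of Distillery): print the distribution/TLS
# flags found in a vm.args file as one space-separated string.
# Assumption: each flag is on its own line and mentions proto_dist or
# ssl_dist_opt. Comment lines (starting with #) are skipped because the
# pattern requires the first non-blank character to be "-".
tls_dist_flags() {
    awk '
        /^[ \t]*-.*(proto_dist|ssl_dist_opt)/ {
            gsub(/"/, "")        # drop double quotes; paths containing spaces are not supported
            sub(/^[ \t]+/, "")   # trim leading indentation
            printf "%s%s", sep, $0
            sep = " "
        }
    ' "$1"
}
```

A wrapper could then do something like `ELIXIR_ERL_OPTIONS="$(tls_dist_flags /path/to/vm.args)"` before invoking a command, where `/path/to/vm.args` is a placeholder. Whether forwarding these flags is sufficient on its own is not confirmed in this thread: the reporter's manual ELIXIR_ERL_OPTIONS attempt still returned pang, so the control node may need further handling.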