bitwalker / distillery

Simplify deployments in Elixir with OTP releases!

ping command is crashing because of race conditions #645

Closed AndrewDryga closed 5 years ago

AndrewDryga commented 5 years ago

We are using the ping command as a Kubernetes liveness probe, and it has proven extremely unstable:

  1. Because it uses a static node name, overlapping pings collide when the probe interval is too short:

    {"log":"Protocol 'inet_tcp': the name talkinto-domain_maint_@10-16-0-4.default.pod.cluster.local seems to be in use by another Erlang node\n","logging.googleapis.com/sourceLocation":{"file":null,"line":0,"function":null},"severity":"INFO","time":"2019-03-10T23:13:46.804Z"}
    Could not start distribution: {{:shutdown, {:failed_to_start_child, :net_kernel, {:EXIT, :nodistribution}}}, {:child, :undefined, :net_sup_dynamic, {:erl_distribution, :start_link, [[:"talkinto-domain_maint_@10-16-0-4.default.pod.cluster.local", :longnames], false]}, :permanent, 1000, :supervisor, [:erl_distribution]}}
  2. Overlapping pings can race on the sys.config file, because ping appears to execute all config providers, which overwrites sys.config:

    {"could not start kernel pid",application_controller,"error in config file \"/opt/talkinto_domain/var/sys.config\" (none): no ending <dot> found"}
    Unable to configure release! could not start kernel pid (application_controller) (error in config file "/opt/talkinto_domain/var/sys.config" (none): no ending <dot> found)
    
    Crash dump is being written to: erl_crash.dump...done
  3. When used during VM boot (as a readiness probe), it can crash the VM because of similar races with the VM that runs the production code.
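
To illustrate the name collision in item 1, here is a minimal sketch of deriving a per-invocation node name instead of a static one. The function `unique_probe_name` and the `_probe_` naming scheme are hypothetical, not part of Distillery; the sketch only assumes that overlapping probe processes have different PIDs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Distillery): build a node name that
# differs between overlapping probe runs, so two probes never try to
# register the same name at the same time.
# Usage: unique_probe_name <prefix> <host>
unique_probe_name() {
    prefix="$1"
    host="$2"
    # $$ is the PID of the invoking shell; probe processes that are
    # alive at the same time always have different PIDs.
    printf '%s_probe_%s@%s\n' "$prefix" "$$" "$host"
}
```

A PID may be reused later, but only after the earlier probe process has exited, so the scheme is enough to keep overlapping runs apart.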

bitwalker commented 5 years ago

As of 2.0.14, ping and others no longer run the config providers. You can also write a custom command that uses release_ctl ping with your own config parameters for name/cookie, and then use that to do the probe.
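
As a rough sketch of a probe with its own name and cookie, the following does not use `release_ctl` at all. It swaps in plain `erl` with `net_adm:ping/1`, started as a short-lived hidden node. The function `ping_node`, its argument order, and the `ERL` override are assumptions made for illustration; how a Distillery custom command would pass the same parameters to `release_ctl ping` is not shown here.

```shell
#!/bin/sh
# Sketch only: a standalone probe built on plain `erl`, not on
# Distillery's release_ctl. ERL can be overridden, e.g. for dry runs.
ERL="${ERL:-erl}"

# Usage: ping_node <probe_node_name> <cookie> <target_node>
# Exits 0 if the target answers pong, 1 otherwise.
ping_node() {
    probe_name="$1"
    cookie="$2"
    target="$3"
    # -hidden keeps the probe out of the target's visible node list;
    # -noshell runs without an interactive shell; halt/1 sets the exit code.
    "$ERL" -name "$probe_name" -setcookie "$cookie" -hidden -noshell \
        -eval "case net_adm:ping('$target') of pong -> halt(0); _ -> halt(1) end."
}
```

A liveness probe could then call something like `ping_node "probe_$$@$(hostname -f)" "$COOKIE" "$TARGET_NODE"`, where `COOKIE` and `TARGET_NODE` are assumed to come from the environment. Note that a cookie passed on the command line is visible in the process list.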

AndrewDryga commented 4 years ago

@bitwalker Just to make sure: the current ping still uses the maint suffix, so a race over the node name is still possible? Do you have an example somewhere of how to build a custom ping command?