d2iq-archive / dcos-flink-service

Support Mesos DNS #29

Closed EronWright closed 1 year ago

EronWright commented 7 years ago

Problem

Flink 1.3 has improved support for Mesos DNS.

  1. The JM's hostname is configurable to use a Mesos DNS name.
  2. The TM's hostname is configurable to use the Mesos task's DNS name.
  3. Support for a bootstrap command was added, e.g. to block until DNS name resolution succeeds, as the ./bootstrap tool does (ref).
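
To illustrate what "blocking on DNS name resolution" in item 3 means, here is a minimal Python sketch. It is not the ./bootstrap tool and does not reflect its actual implementation; the function name `wait_for_dns`, its parameters, and the default timeout are all hypothetical, chosen only for illustration.

```python
import socket
import time


def wait_for_dns(hostname, timeout_s=60.0, interval_s=1.0,
                 resolve=socket.gethostbyname, sleep=time.sleep,
                 clock=time.monotonic):
    """Block until `hostname` resolves and return its address.

    Hypothetical helper: retries the lookup every `interval_s` seconds and
    raises TimeoutError once `timeout_s` seconds have elapsed.
    `resolve`, `sleep`, and `clock` are injectable so the loop can be
    exercised without real DNS.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            return resolve(hostname)
        except OSError:
            # socket.gaierror (failed lookup) is a subclass of OSError.
            if clock() >= deadline:
                raise TimeoutError(
                    f"{hostname} did not resolve within {timeout_s}s")
            sleep(interval_s)
```

A start script could call something like this with the task's own Mesos DNS name before launching the Flink process, so that the process only starts once its advertised hostname is resolvable.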

The relevant commits to Flink are:
https://github.com/apache/flink/commit/1e53b75e7df039dd45e7497a353163319ffa6182
https://github.com/apache/flink/commit/d7364fffbf552aed79e537a7aec3af593cb4e159

Good reasons to use Mesos DNS:

  1. To support environments where the agent hostname isn't resolvable (by default, Flink uses InetAddress.getLocalHost()).

Suggested Fix

  1. Install the bootstrap program into the container image.
  2. Use the bootstrap program at JM startup by adjusting the Marathon cmd.
  3. Use the bootstrap program at TM startup by adjusting the Flink option: mesos.resourcemanager.tasks.bootstrap-cmd
  4. Define a VIP for the JM RPC endpoint.
  5. Configure the JM hostname using a dynamic property: -Djobmanager.rpc.address=rpc.$DCOS_SERVICE_NAME.marathon.l4lb.thisdcos.directory
  6. Configure the TM hostname using a dynamic property: -Dmesos.resourcemanager.tasks.hostname=_TASK_.$DCOS_SERVICE_NAME.mesos
     Note that Flink itself replaces the _TASK_ component at runtime.
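
As a rough sketch of steps 5 and 6, the two dynamic properties could be assembled from the service name like this. The helper name `flink_dynamic_properties` and the fallback service name "flink" are assumptions made for the example; only the property keys and hostname patterns come from the list above.

```python
import os


def flink_dynamic_properties(service_name):
    """Return the -D options from steps 5 and 6 for a given service name.

    Hypothetical helper. The _TASK_ placeholder is left as-is on purpose,
    since Flink replaces it at runtime.
    """
    jm_host = f"rpc.{service_name}.marathon.l4lb.thisdcos.directory"
    tm_host = f"_TASK_.{service_name}.mesos"
    return [
        f"-Djobmanager.rpc.address={jm_host}",
        f"-Dmesos.resourcemanager.tasks.hostname={tm_host}",
    ]


if __name__ == "__main__":
    # "flink" is only an assumed fallback for local experimentation.
    name = os.environ.get("DCOS_SERVICE_NAME", "flink")
    print(" ".join(flink_dynamic_properties(name)))
```

The sketch does no normalization of the service name; whether a given value of DCOS_SERVICE_NAME can be used verbatim in a DNS name is left open here.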

mqasimsarfraz commented 7 years ago

Is there any update on this? It would be great if the JM were reachable via a VIP.