to support environments where the agent hostname isn't resolvable (since Flink by default uses InetAddress.getLocalHost()).
Suggested Fix
Install the bootstrap program into the container image.
Use the bootstrap program at JobManager (JM) startup by adjusting the Marathon cmd.
Use the bootstrap program at TaskManager (TM) startup by setting the Flink option:
mesos.resourcemanager.tasks.bootstrap-cmd
Define a VIP for the JM RPC endpoint.
Configure the JM hostname using a dynamic property:
-Djobmanager.rpc.address=rpc.$DCOS_SERVICE_NAME.marathon.l4lb.thisdcos.directory
Configure the TM hostname using a dynamic property:
-Dmesos.resourcemanager.tasks.hostname=_TASK_.$DCOS_SERVICE_NAME.mesos
Note that Flink itself replaces the _TASK_ component at runtime.
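The two hostname settings above can be illustrated with a small Python sketch. It is not Flink's implementation. It only mimics the substitutions described here, assuming the shell expands $DCOS_SERVICE_NAME and that Flink swaps _TASK_ for some task identifier. The service name "flink" and the task ID "taskmanager-00001" are hypothetical values chosen for the example.

```python
from typing import List

# Illustrative only: mimics the substitutions described above, not Flink's code.


def jobmanager_rpc_address(service_name: str) -> str:
    """Expand rpc.$DCOS_SERVICE_NAME.marathon.l4lb.thisdcos.directory."""
    return f"rpc.{service_name}.marathon.l4lb.thisdcos.directory"


def taskmanager_hostname(pattern: str, task_id: str) -> str:
    """Replace the _TASK_ placeholder with a task identifier (assumed form)."""
    return pattern.replace("_TASK_", task_id)


def dynamic_properties(service_name: str) -> List[str]:
    """Build the two -D options as they look after $DCOS_SERVICE_NAME expands."""
    return [
        f"-Djobmanager.rpc.address={jobmanager_rpc_address(service_name)}",
        f"-Dmesos.resourcemanager.tasks.hostname=_TASK_.{service_name}.mesos",
    ]


if __name__ == "__main__":
    service = "flink"              # hypothetical DCOS_SERVICE_NAME
    task_id = "taskmanager-00001"  # hypothetical task identifier
    props = dynamic_properties(service)
    print(props[0])
    # The TM pattern still carries _TASK_ until it is replaced at runtime.
    pattern = props[1].split("=", 1)[1]
    print(taskmanager_hostname(pattern, task_id))
```

The JM address is fixed once the service name is known, while the TM hostname stays a pattern until each task is launched, which is why _TASK_ is left untouched in the dynamic property.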
Problem
Flink 1.3 has improved support for Mesos DNS.
./bootstrap tool (ref).
The relevant commits to Flink are:
https://github.com/apache/flink/commit/1e53b75e7df039dd45e7497a353163319ffa6182
https://github.com/apache/flink/commit/d7364fffbf552aed79e537a7aec3af593cb4e159