Open hjoliver opened 4 years ago
[update] 2021-08
Platforms work done, there is still the requirement for two interfaces:
So there is scope for simplification. I started looking into this; however, it gets messy and I chickened out in order to get higher-priority work done.
Here's how I think the user-facing interfaces could look:
And here's how the `cylc.flow.hostuserutil` module could be re-written:
https://github.com/oliver-sanders/cylc-flow/blob/dns/cylc/flow/network/hostname.py
Propose bumping to 8.x and addressing when the time/demand allows.
Bumped to 8.x
This has recently been flagged again in https://github.com/cylc/cylc-flow/issues/6005, https://github.com/cylc/cylc-flow/issues/6004
By default, Cylc uses server FQDNs to identify servers, and we rely on these FQDNs being 100% consistent across the network.
I.e. if a host self-identifies as abc.def, it should also be identified as abc.def from any other host on the network. Whilst this might be a reasonable assumption, and true at most of the major Cylc sites, sadly it is not always the case. HPC networking can be a tad eccentric and those using the HPC platform might have no control over its setup.
For examples of how inconsistent DNS setups can be, even on simple platforms, see this issue: https://github.com/cylc/cylc-flow/issues/3595
This can be exacerbated by the Python socket interfaces potentially changing behaviour between builds. This is part of why `hostname -f` may differ from `socket.getfqdn` on different platforms.
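To check whether a given host is affected, the two sources can be compared directly. The following is a minimal diagnostic sketch, not part of Cylc; the function names `fqdn_from_command` and `compare_fqdn_sources` are made up for illustration, and it assumes a `hostname` executable that accepts `-f` is on the `PATH`.

```python
import socket
import subprocess
from typing import Dict, Optional


def fqdn_from_command() -> Optional[str]:
    """Return the output of `hostname -f`, or None if the command fails."""
    try:
        proc = subprocess.run(
            ["hostname", "-f"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        # Command missing, non-zero exit status, or timed out.
        return None
    return proc.stdout.strip() or None


def compare_fqdn_sources() -> Dict[str, Optional[str]]:
    """Collect the host name as reported by each source (hypothetical helper)."""
    return {
        "hostname -f": fqdn_from_command(),
        "socket.getfqdn": socket.getfqdn(),
        "socket.gethostname": socket.gethostname(),
    }


if __name__ == "__main__":
    for source, value in compare_fqdn_sources().items():
        print(f"{source}: {value}")
```

Running this on both the scheduler host and a job host, then comparing the printed values, should show whether the names agree across the network.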
We have made multiple attempts to come up with an approach that works for everyone but, sadly, we have failed.
I suggest that we should re-write the hostuserutil module that provides Cylc's DNS functionality so that the base methods that are used to identify servers are user configurable. We should also take the opportunity to review the use of FQDN host names across Cylc to see if there is anything we can do to loosen the requirement for fully consistent DNS.
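One way to picture "user-configurable base methods" is a small registry that maps a configured name to the function used to identify the host. This is only a sketch under assumptions: `HOST_METHODS`, `get_host_identifier` and the method names `"fqdn"` and `"hostname"` are hypothetical and do not reflect the actual `hostuserutil` API or any existing Cylc configuration.

```python
import socket
from typing import Callable, Dict

# Hypothetical registry of identification methods a site could choose from.
HOST_METHODS: Dict[str, Callable[[], str]] = {
    "fqdn": socket.getfqdn,          # fully qualified name, as Python resolves it
    "hostname": socket.gethostname,  # short name, as the host reports itself
}


def get_host_identifier(method: str = "fqdn") -> str:
    """Identify the local host using the configured method (illustrative only)."""
    try:
        func = HOST_METHODS[method]
    except KeyError:
        valid = ", ".join(sorted(HOST_METHODS))
        raise ValueError(
            f"Unknown host method {method!r}; expected one of: {valid}"
        ) from None
    return func()
```

A site with inconsistent DNS could then select whichever method gives the same answer from every host, rather than relying on a single hard-coded approach.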
In theory, there's no reason why we couldn't provide a solution that uses `hostname -f` to determine the FQDN (but cache the result to avoid repeat calls, of course).
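A rough sketch of that idea, assuming a `hostname` executable that supports `-f`: the command runs once per process and the result is memoised with `functools.lru_cache`. The name `cached_fqdn` and the fallback to `socket.getfqdn()` are illustrative assumptions, not what the branch linked in this thread does.

```python
import functools
import socket
import subprocess


@functools.lru_cache(maxsize=None)
def cached_fqdn() -> str:
    """Return the FQDN from `hostname -f`, running the command at most once."""
    try:
        proc = subprocess.run(
            ["hostname", "-f"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
        name = proc.stdout.strip()
    except (OSError, subprocess.SubprocessError):
        # Command unavailable, failed, or timed out.
        name = ""
    # Assumed fallback: use Python's own resolution if the command gave nothing.
    return name or socket.getfqdn()
```

Because the cache lives for the lifetime of the process, a long-running scheduler would not notice a host name change; `cached_fqdn.cache_clear()` can be called if a refresh is ever needed.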
Frustratingly, I actually did have a branch that did this once, but bailed on it as being too high a risk for Cylc 8.0.0 as the new behaviour will not be exactly the same as the old due to the re-jigging of interfaces.
~I'll see what I can dig out.~ - https://github.com/oliver-sanders/cylc-flow/commits/dns/
Description below pasted verbatim from Element chat room (@oliver-sanders please edit if desired).
See also comments on #3766 and #3595 which this issue supersedes.
See also:
#4981
#3766
#3595
#5411
#4296
TLDR;
Full post: