Open hjoliver opened 2 years ago
QUESTIONS
Should Cylc automatically use ssh -o stricthostkeychecking=no, or provide it as a site-configured option, for all ssh connections?
Or should we just document the user-side options: make sure the host is in your .ssh/known_hosts file, or add stricthostkeychecking=no to your .ssh/config (security people won't like this either, but it's up to users!)?
(Adding to 8.x for now, but it would be good to sort this out quickly, as it can affect new user uptake.)
Cylc does not set stricthostkeychecking explicitly, so it relies on the user's SSH config. The SSH command is configurable, so sites can configure this in the Cylc config.
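For reference, a site that did want to relax host key checking could do it through the platform ssh command setting, roughly as sketched below. This is an illustration under assumptions, not a recommendation: the platform name my_hpc and the host name are hypothetical, and the first two ssh options are what I believe the Cylc 8 default ssh command contains, so check the global.cylc reference for your version before copying.

```ini
# global.cylc (site or user level). Names below are hypothetical.
[platforms]
    [[localhost]]
        # Assumed to be the setting used when a scheduler is started on a run host.
        ssh command = ssh -oBatchMode=yes -oConnectTimeout=10 -oStrictHostKeyChecking=no
    [[my_hpc]]
        hosts = hpc-login.example.org
        # Same relaxation for job submission and polling on this platform.
        ssh command = ssh -oBatchMode=yes -oConnectTimeout=10 -oStrictHostKeyChecking=no
```

Leaving the option out keeps the current behaviour, where the user's own SSH config decides.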
IMO the current behaviour is correct (stricthostkeychecking by default, can be configured otherwise).
If agreed we should turn this into a documentation issue and make sure this is properly covered in the platform configuration section.
Agreed, moving to cylc-doc...
The same issue can occur when cylc play puts a scheduler on a run host. We need to document that the ssh command used for that comes from the localhost platform settings.
This is a problem many new users run into, in my experience. It's not a "Cylc problem" as such, but Cylc depends critically on ssh, so we should at least document the issue and how to handle it.
When you ssh to another host, the host has to appear in a central or user known_hosts file, or else the connection will fail (BatchMode=yes) with "Host key verification failed", or (BatchMode=no) you have to respond to an interactive prompt to add the host to your user .ssh/known_hosts file.
I think clusters typically (always?) have a central known_hosts file that avoids this issue for cluster-internal ssh connections. But users will still run into the problem, with Cylc, if they have to:
(At NIWA, this applies between user desktop machines and HPC platforms, and between several somewhat-distinct HPC clusters.)
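As a rough sketch of the user-side fix, the shell functions below check whether a host already has a known_hosts entry and, if not, add one. The function names and the host hpc.example.org are made up for illustration. ssh-keygen -F and ssh-keyscan -H are standard OpenSSH tools, but note that ssh-keyscan trusts whatever key the network returns, so the fingerprint should be verified out of band.

```shell
#!/bin/sh
# Return 0 if host $1 has an entry in known_hosts file $2, non-zero otherwise.
# ssh-keygen -F also matches hashed host names, which a plain grep would miss.
host_is_known() {
    ssh-keygen -F "$1" -f "$2" >/dev/null 2>&1
}

# Append the keys offered by host $1 to known_hosts file $2 (hashed with -H).
# This does NOT verify the keys; check the fingerprint separately.
add_host_key() {
    ssh-keyscan -H "$1" >> "$2"
}

# Example use (hypothetical host), left commented out because it needs network access:
#   kh="$HOME/.ssh/known_hosts"
#   host_is_known hpc.example.org "$kh" || add_host_key hpc.example.org "$kh"
```

The simpler manual alternative is to run ssh to the host once interactively and answer the prompt, after which BatchMode=yes connections to that host should no longer fail on host key verification.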
Cylc doesn't obfuscate the problem for scheduler start-up (although many users still have no idea what it means):
For job platforms, debugging would be a big ask for a new user.
Scheduler log:
Job activity file:
And finally in the job err file (this location is a bit surprising, since the job was never submitted):