vmware-tanzu / crash-diagnostics

Crash-Diagnostics (Crashd) is a tool to help investigate, analyze, and troubleshoot unresponsive or crashed Kubernetes clusters.

SSH connections don't time out when setting `conn_timeout` in `ssh_config` #196

Open tanayk2610 opened 3 years ago

tanayk2610 commented 3 years ago

I'm using crashd v0.3.2 and SSH connections don't seem to time out:

time="2021-01-29T13:19:38-08:00" level=debug msg="run: executing command on 192.168.128.3 using ssh: [sudo /home/vmware-system-user/tkc_get_cluster_logs.sh 0]"
time="2021-01-29T13:19:38-08:00" level=debug msg="ssh.run: /usr/bin/ssh -q -o StrictHostKeyChecking=no -i tkc-key -p 22 vmware-system-user@192.168.128.3 \"sudo /home/vmware-system-user/tkc_get_cluster_logs.sh 0\""

Here's the ssh_config for it:

ssh=ssh_config(
    username='vmware-system-user',
    private_key_path=args.sshkey,
    max_retries=1,
    conn_timeout=30,
)

In the source code, `conn_timeout` does get set to the passed-in argument, though: https://github.com/vmware-tanzu/crash-diagnostics/blob/3e9c3f5f9b6009858724cb3e0aa02e4a0ab1ddf3/starlark/ssh_config.go#L79

If `conn_timeout` is not passed in, it should still default to 30, but that doesn't seem to be happening either: https://github.com/vmware-tanzu/crash-diagnostics/blob/3e9c3f5f9b6009858724cb3e0aa02e4a0ab1ddf3/starlark/ssh_config.go#L57
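
For illustration, here is a minimal sketch of what applying the timeout to the command could look like. The logged `ssh.run` line above carries no timeout option, and OpenSSH's `ConnectTimeout` option (in seconds) is one way to pass it through. This is written in Python for brevity rather than crashd's Go, and `build_ssh_args` is a hypothetical helper, not a crashd function; it only builds the argument list and does not run anything.

```python
def build_ssh_args(user, host, command, key_path, port=22, conn_timeout=30):
    """Build an ssh argv list that includes a connection timeout.

    Hypothetical helper for illustration only; it mirrors the command
    shape seen in the debug log and adds `-o ConnectTimeout=<seconds>`.
    """
    if conn_timeout <= 0:
        raise ValueError("conn_timeout must be a positive number of seconds")
    return [
        "/usr/bin/ssh",
        "-q",
        "-o", "StrictHostKeyChecking=no",
        # ConnectTimeout is a standard OpenSSH client option (seconds).
        "-o", f"ConnectTimeout={int(conn_timeout)}",
        "-i", key_path,
        "-p", str(port),
        f"{user}@{host}",
        command,
    ]


# Same values as in the log above; conn_timeout falls back to 30 if omitted.
args = build_ssh_args(
    user="vmware-system-user",
    host="192.168.128.3",
    command="sudo /home/vmware-system-user/tkc_get_cluster_logs.sh 0",
    key_path="tkc-key",
)
print(args)
```

Note that `ConnectTimeout` only bounds how long the client waits to establish the connection; it would not stop a remote command that hangs after the connection is up.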

vladimirvivien commented 3 years ago

@tanayk2610 thanks for the issue. You are correct, the timeout is not applied and there should be a default. Marking this as a bug since the parameter is advertised, but it's not currently applied.

MadhavJivrajani commented 3 years ago

Hey, I'd like to take a crack at this 😄

vladimirvivien commented 3 years ago

@MadhavJivrajani Go for it! 🎉