DavidGOrtega closed 2 years ago
As per https://github.com/golang/go/issues/15113#issuecomment-283218133, the 2 second timeout we had in place should be enough, but it apparently isn't. 😅
See also https://github.com/golang/go/issues/15113 → https://github.com/gravitational/teleport/issues/1153 → https://github.com/gravitational/teleport/pull/1152 for a possible solution.
We already have a solution in place
https://github.com/golang/go/issues/21941#issuecomment-346141968
From the client's perspective, one could wrap ssh.Dial in a goroutine with a buffered channel and have an app-specific timeout. We actually like this approach much better,
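For reference, the approach quoted above could look roughly like the sketch below. It is not the code from this PR or from Teleport: `DialWithTimeout`, `ErrDialTimeout` and `dialResult` are hypothetical names, and the dial function is passed in as a closure so the sketch compiles with the standard library alone (it uses generics, so Go 1.18 or later is assumed).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrDialTimeout is a hypothetical sentinel error used only in this sketch.
var ErrDialTimeout = errors.New("dial timed out")

// dialResult carries whatever the dial function returned.
type dialResult[T any] struct {
	conn T
	err  error
}

// DialWithTimeout runs dial in its own goroutine and waits at most timeout
// for it to finish. If the timeout fires first, the goroutine is left
// running in the background; this helper does not cancel it.
func DialWithTimeout[T any](dial func() (T, error), timeout time.Duration) (T, error) {
	// A buffer of 1 lets the goroutine deliver its result and exit even
	// when the caller has already given up and nobody is receiving.
	results := make(chan dialResult[T], 1)

	go func() {
		conn, err := dial()
		results <- dialResult[T]{conn: conn, err: err}
	}()

	select {
	case r := <-results:
		return r.conn, r.err
	case <-time.After(timeout):
		var zero T
		return zero, ErrDialTimeout
	}
}

func main() {
	// Stand-in for a dial that takes too long, such as an SSH handshake
	// that never completes.
	slow := func() (string, error) {
		time.Sleep(500 * time.Millisecond)
		return "connected", nil
	}

	_, err := DialWithTimeout(slow, 50*time.Millisecond)
	fmt.Println("slow dial:", err)
}
```

With `golang.org/x/crypto/ssh`, the closure would presumably wrap `ssh.Dial("tcp", addr, config)` and return its `*ssh.Client`. One caveat of this design: a dial that completes after the timeout still yields a live connection that nobody closes, so a real implementation would probably want to drain the channel and close late results.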
Sorry! I missed that commit! 🙈 🚀
@0x2b3bfa0 let's merge and release?
This has been a tricky PR. After fixing the restart so that the GPU is accessible after the kernel upgrade, we were hitting a very funny error (seen here and here): the SSH connection hung on dial and never returned, so the resource timed out. To solve it, I had to put the logs function within a timed-out function. We added Teleport's fix mentioned on https://github.com/iterative/terraform-provider-iterative/pull/607#issuecomment-1159793257
instance_gpu to set up NVIDIA. Uses ubuntu-drivers for GPU auto-detection.
Related: #606