Open wbrefvem opened 2 months ago
We have generally avoided making the default experience suffer in order to support remote runtimes, because kind nominally targets local clusters and there are local options that don't have this sort of issue.
I would argue that mitigating podman-remote's connection limit issues is a feature request, and having a low connection limit seems like a usability issue in this podman install.
I don't want to make log collection slower, and I'm not enthused about attempting to probe for the ssh connection limit.
Even if we did, what if some other process is running concurrently?
1) seems like the best approach, because even if kind tries really hard to deal with this, being limited to a few connections still risks things breaking if any other tool concurrently accesses it ... and this limit is not typical when working with container runtimes.
Something to explore: Is there a good reason to tightly limit the maximum number of sessions, or is this just an arbitrary default? Are we going to introduce problems by telling users to increase it a lot?
It's the default for sshd that podman doesn't touch AFAIK. As long as users aren't exposing their podman VMs to the Internet, I can't imagine any security implications. And with modern hardware I don't foresee any performance hit. The fact that it hasn't come up yet probably means that not many users are running multi-node clusters and then exporting logs, so the chance of a large number of users even trying it out seems low.
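For reference, a rough sketch of what raising these limits might look like inside the podman machine. The file path and the choice to set both directives to 30 are assumptions; the value 30 is simply what the report says worked for a 4-node cluster, not a recommendation. `MaxSessions` and `MaxStartups` are standard sshd_config directives, and sshd would need to be reloaded or restarted for the change to take effect.

```
# /etc/ssh/sshd_config inside the podman machine (path assumed)

# Maximum number of open sessions permitted per network connection.
MaxSessions 30

# Maximum number of concurrent unauthenticated connections.
MaxStartups 30
```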
Tangential to #3729
`kind export logs` results in ssh handshake errors when running under the podman remote client (i.e. podman on Windows and macOS). The easiest way to reproduce is to create a 4-node cluster with the podman provider enabled on either Windows or macOS and then run `kind export logs`.
On Windows (incl. WSL) and macOS, the podman client (podman-remote) works by sending an API request to a podman VM (or container on WSL) through an ssh tunnel. I noticed that the set of podman commands that would return errors varied between runs of `kind export logs`, and upon digging into it further, my working hypothesis is that because commands to collect logs are all run concurrently, it's possible to run up against the max ssh connections to the podman machine. (Specifically `MaxSessions` and `MaxStartups` in the sshd config, each with a default of 10.) I set the ssh config to allow 30 max connections for a cluster of 4 nodes and that seemed to fix it.
As I see it there are two possible solutions: