moby / hyperkit

A toolkit for embedding hypervisor capabilities in your application
BSD 2-Clause "Simplified" License

Consider using kqueue in hypervisor socket implementation #211

Open djs55 opened 6 years ago

djs55 commented 6 years ago

The core of hyperkit uses kqueue via mevent.c but pci_virtio_sock.c uses plain old select.

If hyperkit is used together with vpnkit in Docker for Mac and a large number (> 1024) of connections are port forwarded, hyperkit becomes unable to process any more AF_VSOCK connections because the accepted file descriptor is greater than or equal to FD_SETSIZE and so cannot be stored in an fd_set. See for example https://github.com/moby/hyperkit/blob/3ace9850121a2ef270e0309a3ff6c2f991357842/src/lib/pci_virtio_sock.c#L1364

This manifests as errors under load, for example:

$ docker ps
Error response from daemon: Bad response from Docker engine

from https://github.com/docker/for-mac/issues/2841

This scalability limit could be removed by switching from select to kqueue (or poll) in pci_virtio_sock.c.
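To illustrate the poll option, here is a minimal sketch of waiting on a single descriptor without an fd_set. The helper name `wait_readable_poll` and its return convention are hypothetical and not taken from pci_virtio_sock.c; the point is only that poll takes the descriptor by value, so its numeric value is not bounded by FD_SETSIZE.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not hyperkit code: wait until fd is readable.
 * Returns 1 if a read would not block, 0 on timeout, -1 on error.
 * poll() receives the descriptor by value in struct pollfd, so this
 * keeps working when fd >= FD_SETSIZE, where FD_SET() would not.
 */
static int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;              /* poll itself failed, errno is set */
    if (n == 0)
        return 0;               /* timed out, nothing to read */
    if (pfd.revents & POLLNVAL)
        return -1;              /* fd is not an open descriptor */

    /* POLLHUP and POLLERR also mean that read() will return at once. */
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : 0;
}
```

A real change would pass the full set of vsock descriptors in one `struct pollfd` array rather than calling a helper like this per descriptor.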

imavroukakis commented 6 years ago

Hi team, any updates on this would be appreciated.

ijc commented 6 years ago

At one point we were trying to avoid macOS-isms in an attempt to keep open the option of using hyperkit on other platforms in the future. I think we've basically given up on that, so a PR to switch to kqueue should be fine.

I had thought that kqueue was callback based, making it a big rewrite to integrate with the current code structure, but it seems that I was mistaken and it should be possible to replace select with kqueue without a lot of non-localised changes.
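As a rough illustration of that point, kevent can be called as a blocking wait in the same place a select call would sit, with no callbacks involved. The sketch below is an assumption-laden example and not hyperkit code: `wait_readable` is a hypothetical helper, it assumes `timeout_ms >= 0`, and it creates a throwaway kqueue per call for brevity where real code would keep one open. On platforms without kqueue it falls back to poll so that the example still compiles.

```c
#include <stddef.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>

#if defined(__APPLE__) || defined(__FreeBSD__)
#include <sys/event.h>
#define HAVE_KQUEUE 1
#else
#include <poll.h>
#endif

/*
 * Hypothetical helper: block until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
static int wait_readable(int fd, int timeout_ms)
{
#ifdef HAVE_KQUEUE
    struct kevent change, event;
    struct timespec ts;
    int kq, n;

    kq = kqueue();
    if (kq < 0)
        return -1;

    /* Register interest in "fd is readable"; no callback is attached. */
    EV_SET(&change, fd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0, NULL);

    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (long)(timeout_ms % 1000) * 1000000L;

    /* One call both applies the change and waits, much like select(). */
    n = kevent(kq, &change, 1, &event, 1, &ts);
    close(kq);

    if (n < 0)
        return -1;
    if (n > 0 && (event.flags & EV_ERROR))
        return -1;              /* registration failed, e.g. bad fd */
    return n > 0 ? 1 : 0;
#else
    /* Fallback for platforms without kqueue. */
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    n = poll(&pfd, 1, timeout_ms);
    if (n < 0 || (n > 0 && (pfd.revents & POLLNVAL)))
        return -1;
    return n > 0 ? 1 : 0;
#endif
}
```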

Have we considered whether multiplexing in vpnkit might be easier though?

imavroukakis commented 6 years ago

Has there been any more discussion around this? Would love to see it fixed 😄

ijc commented 6 years ago

AFAIK nobody has worked on this. PRs are welcome.

imavroukakis commented 6 years ago

@ijc thanks for the update. I would happily have given this a go but C is not something I'm anywhere near competent in.