google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

Infiniband support #10906

Open TheQuantumFractal opened 1 month ago

TheQuantumFractal commented 1 month ago

Description

I'm looking to do RDMA within gVisor containers and was curious whether you support Infiniband, or whether it is on the roadmap. Thanks!

Is this feature related to a specific bug?

No response

Do you have a specific solution in mind?

No response

kevinGC commented 1 month ago

There's no specific support for Infiniband. Can you help me understand what support would be needed? gVisor containers typically communicate through a virtual device (often veth). On a machine with an Infiniband NIC, packets would switch from veth to NIC without issue as far as I understand.

I don't know much about RDMA, but there's no special support for it in gVisor. I'm not sure whether it's needed, or whether having the underlying host support it is enough.

ekzhang commented 1 month ago

Hi @kevinGC, we think it would involve supporting the Infiniband verbs in libibverbs, which are operations that let you send and receive data while bypassing the kernel networking stack.

There is a device called /dev/infiniband/uverbs0 but none of us are familiar with the internals yet unfortunately.
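
As a rough way to check whether an environment exposes that device at all, here is a minimal sketch. It assumes the verbs device nodes live under /dev/infiniband and are named uverbs0, uverbs1, and so on, as in the path above; the helper name `find_uverbs_devices` is made up for illustration. It only tests that the device node is visible and says nothing about whether verbs operations would actually work inside a sandbox.

```python
import glob
import os


def find_uverbs_devices(dev_dir="/dev/infiniband"):
    """Return the sorted paths of uverbs device nodes visible under dev_dir.

    Hypothetical helper: the directory and the "uverbs*" naming pattern are
    assumptions based on the /dev/infiniband/uverbs0 path mentioned above.
    """
    return sorted(glob.glob(os.path.join(dev_dir, "uverbs*")))


if __name__ == "__main__":
    devices = find_uverbs_devices()
    if devices:
        for path in devices:
            print("found verbs device:", path)
    else:
        # An empty list also covers the case where /dev/infiniband is absent.
        print("no uverbs devices visible")
```

Running this on the host and again inside the container would show whether the device node is passed through in the first place.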

We've seen FreeFlow (https://github.com/Microsoft/Freeflow) from Microsoft and would be looking for something similar to maximize throughput.

kevinGC commented 1 month ago

Having looked (maybe too) quickly at verbs, it should be possible to support if my understanding is correct. Thoughts:

While the path to implementation seems reasonably clear, this is a significant chunk of work. The implementer would need to understand Infiniband verbs. I think we'd accept a PR for it, but for now it's not on the roadmap.

ekzhang commented 1 month ago

Sounds good, thank you for sharing your thoughts on the tractability of this!