Open TheQuantumFractal opened 1 month ago
There's no specific support for Infiniband. Can you help me understand what support would be needed? gVisor containers typically communicate through a virtual device (often veth). On a machine with an Infiniband NIC, packets would switch from veth to NIC without issue as far as I understand.
I don't know much about RDMA, but there's no special support for it in gVisor. I'm not sure whether it's needed, or whether having the underlying host support it is enough.
Hi @kevinGC, we think it would involve supporting the Infiniband verbs in libibverbs, which are operations that let you send and receive data while bypassing the kernel networking stack.
There is a device called /dev/infiniband/uverbs0, but none of us are familiar with the internals yet, unfortunately.
We've seen FreeFlow (https://github.com/Microsoft/Freeflow) from Microsoft and would be looking for something similar to maximize throughput.
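For concreteness, here is a minimal sketch of the first few verbs calls an application makes (via cgo, assuming libibverbs is installed on the host). It's illustrative only and stops before memory registration and queue pair setup, which is where the actual kernel-bypass data path begins:

```go
package main

/*
#cgo LDFLAGS: -libverbs
#include <infiniband/verbs.h>
*/
import "C"

import "fmt"

func main() {
	// Enumerate RDMA devices. Inside a gVisor sandbox today this finds
	// nothing, since no uverbs device is exposed to the container.
	var n C.int
	devs := C.ibv_get_device_list(&n)
	if devs == nil || n == 0 {
		fmt.Println("no RDMA devices visible")
		return
	}
	defer C.ibv_free_device_list(devs)

	// Open the first device and allocate a protection domain: the first
	// steps of any verbs program, before registering memory regions and
	// creating queue pairs for kernel-bypass sends and receives.
	ctx := C.ibv_open_device(*devs)
	if ctx == nil {
		fmt.Println("ibv_open_device failed")
		return
	}
	defer C.ibv_close_device(ctx)

	pd := C.ibv_alloc_pd(ctx)
	if pd == nil {
		fmt.Println("ibv_alloc_pd failed")
		return
	}
	defer C.ibv_dealloc_pd(pd)

	fmt.Println("opened device:", C.GoString(C.ibv_get_device_name(*devs)))
}
```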
Having looked (maybe too) quickly at verbs, it should be possible to support if my understanding is correct. Thoughts:

- RDMA applications drive the hardware through ioctls on a special character device. We can support this: we'd make our own virtual per-container/pod /dev/infiniband/uverbs0 that understands and safety-checks those ioctls (a rough sketch of that checking follows below). We'd also have syscall filters specific to Infiniband, as we do for GPUs.
- While the path to implementation seems reasonably clear, this is a significant chunk of work. The implementer would need to understand Infiniband verbs. I think we'd accept a PR for it, but for now it's not on the roadmap.
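As a very rough, hypothetical sketch of the safety-checking idea (standalone Go, not actual sentry code; the command values and the forwarding helper are placeholders, not the real uverbs ABI or gVisor's internal interfaces):

```go
package main

import (
	"fmt"

	"golang.org/x/sys/unix"
)

// Illustrative subset of uverbs commands the virtual device might allow.
// Names mirror include/uapi/rdma/ib_user_verbs.h, but the values here are
// placeholders rather than the real ABI.
const (
	cmdGetContext  uint32 = 0
	cmdQueryDevice uint32 = 1
	cmdAllocPD     uint32 = 2
	cmdRegMR       uint32 = 3
)

var allowedCmds = map[uint32]bool{
	cmdGetContext:  true,
	cmdQueryDevice: true,
	cmdAllocPD:     true,
	cmdRegMR:       true,
}

// proxyIoctl rejects anything outside the allowlist and otherwise forwards
// the request to the host's real uverbs file descriptor. A real
// implementation would also copy in and validate the command's argument
// struct before forwarding it to the host.
func proxyIoctl(hostFD int, cmd uint32, argPtr uintptr) (uintptr, error) {
	if !allowedCmds[cmd] {
		return 0, fmt.Errorf("uverbs command %#x not supported in the sandbox", cmd)
	}
	r, _, errno := unix.Syscall(unix.SYS_IOCTL, uintptr(hostFD), uintptr(cmd), argPtr)
	if errno != 0 {
		return 0, errno
	}
	return r, nil
}

func main() {
	// An unknown command is rejected without ever touching the host device.
	if _, err := proxyIoctl(-1, 0xdead, 0); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

Presumably the same allowlist would also inform the Infiniband-specific seccomp filters on the host-side ioctls the sandbox is permitted to make.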
Sounds good, thank you for sharing your thoughts on the tractability of this!
Description
I'm looking to do RDMA within gVisor containers and was curious whether Infiniband is supported, or whether it's on the roadmap. Thanks!
Is this feature related to a specific bug?
No response
Do you have a specific solution in mind?
No response