PlatformLab / HomaModule

A Linux kernel module that implements the Homa transport protocol.

handle MSG_DONTWAIT #61

Closed breakertt closed 3 months ago

breakertt commented 3 months ago

Fix MSG_DONTWAIT not being handled after HomaModule was ported to Linux 6.0 (e8a3664).

breakertt commented 3 months ago

I would also like to mention that this fix by itself is not enough to make Homa work with io_uring. From my observation, io_uring behaves as follows: (1) it issues the call with MSG_DONTWAIT; (2) if EAGAIN is returned, a kernel thread issues a blocking call. With the current Homa code, the msg_control copy-back and the "compensation for ___sys_recvmsg" break this: the second call is issued without going through __sys_recvmsg, so the compensation changes msg->control when io_uring issues the second, blocking call.

One potential fix would be to remove current "compensation for ___sys_recvmsg" code and ask userspace apps to always set msg_controllen correctly before calling recvmsg.
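A minimal userspace sketch of what "always set msg_controllen correctly before calling recvmsg" could look like. The helper name `recv_with_fresh_control` and its parameters are hypothetical and not part of Homa or its API; the sketch only assumes that the kernel may overwrite `msg_controllen` (and `msg_flags`) on return, so a reused `struct msghdr` has to be re-armed before each call.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of Homa): re-arm the control fields of a
 * reused msghdr before every recvmsg call, because the kernel may overwrite
 * msg_controllen and msg_flags when the call returns.
 */
ssize_t recv_with_fresh_control(int fd, struct msghdr *hdr,
                                void *control, size_t control_len,
                                int flags)
{
    hdr->msg_control = control;         /* caller-owned control buffer */
    hdr->msg_controllen = control_len;  /* full buffer size, every time */
    hdr->msg_flags = 0;                 /* clear flags left by the last call */
    return recvmsg(fd, hdr, flags);
}
```

A receive loop would call this helper in place of a bare `recvmsg`, passing the same control buffer and its full size on every iteration instead of relying on the value left in `hdr` by the previous call.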

johnousterhout commented 3 months ago

This PR is now in my private repo and will get pushed to GitHub in a bit. I'm also comfortable with your suggested change to the "compensation"; are you in a position to verify that it really does allow io_uring to work with Homa?

breakertt commented 3 months ago

> This PR is now in my private repo and will get pushed to GitHub in a bit. I'm also comfortable with your suggested change to the "compensation";

Hi, regarding io_uring: all signal_pending calls also need to be changed to task_sigpending, as in https://lore.kernel.org/netdev/50310b5e-7642-4ca1-a9e1-6d817d472131@kernel.dk/T/.

> are you in a position to verify that it really does allow io_uring to work with Homa?

io_uring first tries to call recvmsg with MSG_DONTWAIT; if EAGAIN is returned, the call is then handled by a kernel thread as a blocking call.
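The two-step behavior described here can be modeled in userspace roughly as follows. This is only a sketch of the pattern as observed in this thread, not io_uring's actual implementation; the function names `try_recv_nowait` and `recv_nowait_then_block` are made up for illustration, and the blocking retry runs in the calling thread rather than in a kernel worker.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Step 1: non-blocking attempt; fails with EAGAIN/EWOULDBLOCK if nothing is queued. */
ssize_t try_recv_nowait(int fd, void *buf, size_t len)
{
    return recv(fd, buf, len, MSG_DONTWAIT);
}

/*
 * Steps 1 and 2 together: try without blocking and, only if the socket
 * reports that it would block, retry with a blocking call. *fell_back is
 * set to 1 when the blocking retry was needed, 0 otherwise.
 */
ssize_t recv_nowait_then_block(int fd, void *buf, size_t len, int *fell_back)
{
    ssize_t n = try_recv_nowait(fd, buf, len);

    *fell_back = 0;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        *fell_back = 1;
        n = recv(fd, buf, len, 0);  /* blocks until a message arrives */
    }
    return n;
}
```

The pattern depends on step 1 returning promptly: if a transport ignores MSG_DONTWAIT, the first call itself blocks and the caller never reaches the fallback.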

io_uring "works" even with the latest HomaModule, but it gets stuck in io_uring_submit because MSG_DONTWAIT is not honored. This is what led to the MSG_DONTWAIT handling fix.

Then there is the "compensation for ___sys_recvmsg" problem, which is fixed by removing the compensation. As a result, userspace apps have to set hdr.msg_controllen before every call to recvmsg.

Finally, io_uring uses TIF_NOTIFY_SIGNAL in the kernel, which produces a false positive from signal_pending, so recvmsg returns EINTR without any actual signal from userspace. This can be fixed by changing signal_pending to task_sigpending, as advised in https://lore.kernel.org/netdev/50310b5e-7642-4ca1-a9e1-6d817d472131@kernel.dk/T/. I am not 100% confident about this one, but it works.
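To illustrate why the two checks can disagree, here is a small userspace model of the flag logic. It is not kernel code: the `model_` names and the flag bit values are invented for this sketch. It only assumes that signal_pending also reports true when TIF_NOTIFY_SIGNAL is set, while task_sigpending looks at the real pending-signal flag alone.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's thread-info flags. */
enum {
    MODEL_TIF_SIGPENDING    = 1 << 0,  /* a real signal is pending */
    MODEL_TIF_NOTIFY_SIGNAL = 1 << 1   /* task work notification, not a signal */
};

struct model_task {
    unsigned int flags;
};

/* Models task_sigpending(): true only for a real pending signal. */
bool model_task_sigpending(const struct model_task *t)
{
    return (t->flags & MODEL_TIF_SIGPENDING) != 0;
}

/* Models signal_pending(): also true when only the notify flag is set. */
bool model_signal_pending(const struct model_task *t)
{
    if (t->flags & MODEL_TIF_NOTIFY_SIGNAL)
        return true;
    return model_task_sigpending(t);
}
```

In this model, a task with only the notify flag set makes `model_signal_pending` return true while `model_task_sigpending` returns false, which corresponds to the spurious EINTR case described above.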

I have code that uses HomaModule with io_uring, but after these three fixes the performance seems a bit weird. I may try to find some time to give you a demo of io_uring with HomaModule.

My rationale for trying io_uring with HomaModule is that I expect it can make use of receive piping, since kernel workers can issue blocking calls.

breakertt commented 3 months ago

Looking forward to your talk on Thursday as well. Unfortunately, I will only be able to join remotely.

johnousterhout commented 3 months ago

OK, I'm going to take out the "compensation". I don't think it was the right way to handle this problem in the first place (it's not consistent with other usage of this field); better to tell apps that they can't depend on msg_controllen being unchanged after a call.