axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License

When hot-plugging the drive, some of the last commands are completed successfully by io_uring, but the drive never receives them. #1273

Closed suxinggm closed 3 days ago

suxinggm commented 4 days ago

Kernel version: 5.19.17
liburing version: 2.6

I am using liburing to run I/O across sudden power loss (hot plug). After power is restored, I compare the data on the drive against the last completed commands.

I found that some of the last commands (issued around the time the power loss happens) return cqe->res=0 (success). But when I capture the commands with a PCIe trace, it seems they were never sent to the drive. I suspect this is an io_uring bug?

Has anyone hit this kind of issue before? Thanks.

axboe commented 4 days ago

What commands are these? cqe->res equaling 0 would normally not be success for a read/write operation, as they should return the bytes read or written.

But if io_uring posts the completion, it got the callback from the lower layers, and is passing on that result. Hence it is very unlikely that this is an io_uring issue.

suxinggm commented 4 days ago

@axboe I am using nvme passthrough mode, so the cqe->res = 0 means success.

Yes, I also believe that the cqe comes back from the lower layers. But after adding printk calls to the nvme driver and capturing a PCIe trace, I found that a few completed (write) commands seen in io_uring_cqe do not appear in either the driver output or the PCIe trace. That is very strange to me.

axboe commented 4 days ago

> I am using nvme passthrough mode, so the cqe->res = 0 means success.

Ah that makes sense. At least that explains that part, but it doesn't change my opinion that this is very unlikely to be a core io_uring issue. It might be an issue in the nvme driver, assuming that is your target. But I can't really say much more on the topic without seeing your code. It could also be an issue there.
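
To make the distinction above concrete, here is a minimal sketch of how a logged cqe->res value might be classified when post-processing test results. It assumes the usual conventions: for read/write requests a non-negative res is the number of bytes transferred and a negative res is -errno; for NVMe passthrough, 0 is success, a negative res is -errno, and a positive res is assumed to be the NVMe status code reported for the command. The function name and the "rw" / "passthrough" labels are hypothetical and not part of liburing.

```python
import errno


def describe_cqe_res(kind, res, expected_len=0):
    """Classify a logged cqe->res value.

    kind is "rw" or "passthrough" (hypothetical labels for this sketch).
    expected_len is the requested transfer size and only matters for "rw".
    """
    if res < 0:
        # In both modes a negative result is assumed to be -errno.
        name = errno.errorcode.get(-res, "unknown errno")
        return f"error: {name}"
    if kind == "rw":
        # Read/write: res is the byte count, so 0 is not a normal success.
        if res < expected_len:
            return f"short transfer: {res} of {expected_len} bytes"
        return f"ok: {res} bytes"
    if kind == "passthrough":
        # Passthrough: 0 means success; a positive value is assumed to be
        # the NVMe status code for the command.
        if res == 0:
            return "ok"
        return f"device status: 0x{res:x}"
    raise ValueError(f"unknown kind: {kind!r}")
```

With these assumptions, a res of 0 reads as a zero-byte short transfer for a 4096-byte write but as success for a passthrough command, which is the point of confusion cleared up above.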

suxinggm commented 4 days ago

@axboe Could you give me your email address? I will send you some of my code for help. Thanks a lot.

axboe commented 4 days ago

It's in the git log, in fact all over it :-)

axboe@kernel.dk

Be sure to include how you are running it.

suxinggm commented 4 days ago

@axboe, I have sent you the email. Please check. Thanks.

axboe commented 3 days ago

I got it, but it's a huge blob, it's missing a header, and there's no explanation included.

Since it's so big, I don't have time to go over this in detail. Are you able to modify the kernel you are running on? It would make sense to simply add some more tracepoints, e.g. in nvme you can have tracepoints that cover the issue and completion of a request, and include the ->user_data field in those. With that, you should be able to tie the stack together and see where it goes wrong.
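
Once such tracepoints exist, tying the stack together is mostly a matter of matching user_data values across layers. The following is a rough sketch of that matching step, not an existing tool. It assumes each layer's log has been dumped to text with a `user_data=<value>` field on every line; that field name, the line layout, and the function names are all hypothetical and would need to match whatever the added tracepoints actually print.

```python
import re

# Matches "user_data=<decimal or 0x-prefixed hex>" in a log line.
# The field name and line layout are assumptions, not an existing kernel format.
_USER_DATA_RE = re.compile(r"user_data=(0x[0-9a-fA-F]+|\d+)")


def extract_user_data(lines):
    """Return the set of user_data values found in an iterable of log lines."""
    found = set()
    for line in lines:
        match = _USER_DATA_RE.search(line)
        if match:
            text = match.group(1)
            base = 16 if text.lower().startswith("0x") else 10
            found.add(int(text, base))
    return found


def correlate(cqe_lines, issue_lines, complete_lines):
    """Report CQEs that have no matching driver issue or completion event."""
    cqes = extract_user_data(cqe_lines)
    issued = extract_user_data(issue_lines)
    completed = extract_user_data(complete_lines)
    return {
        # Completion seen by the application, but the driver never issued it.
        "never_issued": sorted(cqes - issued),
        # Issued by the driver, but no driver-side completion was traced.
        "issued_not_completed": sorted((cqes & issued) - completed),
    }
```

Any user_data that ends up under "never_issued" corresponds to the symptom reported here: a completion reached the application although the driver-level trace shows no matching request.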

As mentioned earlier, it's extremely unlikely that this is an io_uring issue. io_uring just posts completions based on callbacks when IO has completed. Without those callbacks, it's not posting completions. I'd be the most suspicious of the nvme passthrough code.

Since 5.19 was very early days for the nvme passthrough code, I'd also strongly suggest you upgrade to e.g. 6.11 for your testing rather than rely on a kernel that is 2.5 years old. In case it is a passthrough issue, it'll likely make things much better for you.

axboe commented 3 days ago

I'll close this one for now, feel free to re-open if needed when more information is available.

suxinggm commented 3 days ago

@axboe Thank you very much. I will try to follow your suggestion and find the root cause.