axboe / liburing

Library providing helpers for the Linux kernel io_uring support
MIT License
2.85k stars 402 forks source link

Feature request: getdents64 #111

Closed 3v1n0 closed 2 years ago

3v1n0 commented 4 years ago

It would be nice to be able to list directory contents in an async way as well

axboe commented 4 years ago

While that would be nice to have for applications, it's outside of the scope of io_uring and liburing to provide this kind of functionality. io_uring only deals with the kernel side of things, and liburing is a support library for that. getdentsX isn't a kernel API. The right place for something like this would be in a library utilizing liburing, not in liburing itself.

3v1n0 commented 4 years ago

Ah isn't that a syscall? I saw some discussions around about it being in liburing so I thought made sense to have an issue.

paxostv commented 4 years ago

I'll second this request, albeit its lower priority and opens the box for a bunch more directory operations. Perhaps the confusion is that readdir() is not a system call, but getdents64() is. Note normal reads on a directory file descriptor aren't permitted.

openalmeida commented 4 years ago

+1 #139

axboe commented 4 years ago

If someone writes a test case, I can wire this up. Should literally be a 5 min job.

YoSTEALTH commented 4 years ago

If someone writes a test case, I can wire this up. Should literally be a 5 min job.

If some C expert can write this, would help out.

Could really use this feature and https://github.com/axboe/liburing/issues/82, having no way to manage directory asynchronous is starting to show in application using liburing backend. Its like hitting a brick wall while running.

dkadashev commented 4 years ago

@axboe I wrote up a test case for this: https://gist.github.com/dkadashev/d8e4dcef8456d9e2f622b5ba66df0e53. The io_uring part of it is under #if 0, since I had to test with sync call instead. Let me know if it needs some fixes.

Also please let me know if you prefer a pull request. I did not go with that because the support is still not available in the kernel / liburing, so the test would be in the code base, but disabled (not in makefile?) - not sure if you like that idea or not.

I'll also write test cases for #82 - and probably will try to do the rest of the work there (kernel, liburing). Will take me a while probably, since I can't spend much time per day on this.

YoSTEALTH commented 3 years ago

@dkadashev thank you very much for creating these patches

@axboe has these patches been applied to io_uring and liburing?

tavianator commented 3 years ago

I've been following the discussions here a bit: https://www.spinics.net/lists/linux-fsdevel/msg189520.html

It would be helpful to support using the fd's internal offset rather than needing to specify it explicitly every time. A big benefit of that is it would let me do two successive getdents() in one operation with linked SQEs. Otherwise you have to do another syscall to check if you've hit the end of the directory, even though normally one buffer is enough.

axboe commented 3 years ago

I agree, both methods should be workable. Just like how we support current file position for reads/writes as well, if you don't give a specific offset.

YoSTEALTH commented 3 years ago

What more is needed for the patch @dkadashev created to be implemented into Liburing? I would really like to start using this feature.

dkadashev commented 3 years ago

This is in progress as far as I know, here is the latest version of the patch (not merged yet) to add getdents support to io-uring: https://lore.kernel.org/io-uring/YGMIwcxAIJPAWGLu@wantstofly.org/

YoSTEALTH commented 3 years ago

Thanks @dkadashev for the progress update link. Hopefully it will be merged and in Liburing soon :)

YoSTEALTH commented 3 years ago

@dkadashev looking at https://krautbox.wantstofly.org/~buytenh/uringfind-v3.c I am curious, have you tried using io_uring_prep_openat2 with RESOLVE_CACHED enabled? @axboe implemented historic LOOKUP_CACHED into RESOLVE_CACHED to make file-name look up a lot faster! https://lwn.net/Articles/843163/ this might also improve io_uring_prep_getdents

dkadashev commented 3 years ago

@dkadashev looking at https://krautbox.wantstofly.org/~buytenh/uringfind-v3.c I am curious, have you tried using io_uring_prep_openat2 with RESOLVE_CACHED enabled? @axboe implemented historic LOOKUP_CACHED into RESOLVE_CACHED to make file-name look up a lot faster! https://lwn.net/Articles/843163/ this might also improve io_uring_prep_getdents

I think there is some confusion here, I have nothing to do with the getdents patch, it was submitted by another person (Lennert Buytenhek buytenh@wantstofly.org)

YoSTEALTH commented 3 years ago

@dkadashev o I c, my bad.

ammarfaizi2 commented 2 years ago

We have IORING_OP_GETDENTS now.

Link: https://github.com/axboe/liburing/compare/56c7bb43bdac67acd808cd88a507c78ac6c56259...eeeaa3ed4ce574fd4ddfa13738e7ca000cc14449

Should we close this or wait until Linux 5.17 becomes stable?

markg85 commented 2 years ago

Just for "fun", is there a benchmark of using getdents on a large folder? Say a folder with 500.000 files in it or even more? I'm curious to know how much faster it is to get a folder listing of such a large folder via uring as opposed to regular syscalls and perhaps readdir.

Ignore the fact that folders with 500k entries usually don't exist. I used that number in the past to benchmark listing performance of Dolphin (KDE's file browser) and to find bottlenecks that only begin to show up when you use insanely large folders.

I would've tried this myself if i had a kernel with getdents in uring. But i'm not adventurous enough to rebuild the kernel with this patched in.

YoSTEALTH commented 2 years ago

What exactly is happening with this io_uring_prep_getdents? Looks like it was added than removed! Is this going to be in Linux 5.18?

axboe commented 2 years ago

It was never in a release. There were various concerns on the API, so it was dropped for now.

notfed commented 2 years ago

@axboe: Do you happen to have links to the discussions you're referring to? (Or are they the ones linked to in this thread?) Are there fundamental concerns, or are we just hung up on nitpicky details?

It'd be really nice to have this. The ability to potentially navigate an entire filesystem using only io_uring seems powerful and IMHO understated.

axboe commented 2 years ago

This thread should highlight it:

https://lore.kernel.org/io-uring/20211221164004.119663-1-shr@fb.com/

YoSTEALTH commented 2 years ago

https://lore.kernel.org/io-uring/20211221164004.119663-1-shr@fb.com/

It would be better to help improve the code then to say its a bug.

axboe commented 2 years ago

Closing this one again, don't see us adding this anytime soon. If someone wants to drive this issue, do feel free to take it over and do that. But the core team isn't going to do so.

YoSTEALTH commented 5 months ago

@axboe would it be possible to only add something like:

static inline void io_uring_prep_getdents(struct io_uring_sqe *sqe, int fd, void *buf, size_t count)
{
    io_uring_prep_rw(IORING_OP_GETDENTS, sqe, fd, buf, count);
}

This way getdents64 can be called through io_uring in async.

Forget about the whole offset issues and trying to include full directory implementation. Developers can write their own thing.

extern __ssize_t getdents64 (int __fd, void *__buffer, size_t __length)