c-blake / cligen

Nim library to infer/generate command-line-interfaces / option / argument parsing; Docs at
https://c-blake.github.io/cligen/
ISC License

examples/du.nim fails compilation on OS not having statx.h #156

Closed kaushalmodi closed 4 years ago

kaushalmodi commented 4 years ago

Hello,

I am on CentOS 7.

For my OS, haveStatx is false, and so I get this error:

/home/kmodi/sandbox/nim/cligen/cligen/statx.nim(229, 18) Error: undeclared identifier: 'statxMask'

Ref: https://github.com/c-blake/cligen/pull/140#issuecomment-647746990

c-blake commented 4 years ago

Yeah...I'm always running into portability issues with that statx layer, even though its main purpose is portability. Complicating things, statx was available in kernels for over a year before glibc added a header file for it. I should really try its rules on several virtual machine images. I am on Gentoo and also test it on Debian/FreeBSD, but never in the rpm world. I'll look at this more closely in the morning. Thanks for the report.

c-blake commented 4 years ago

Let me know if that doesn't work for you. I think it should. It just may use fstatat instead of statx, so you won't have birth times (if you even have any FSes with that 2017-ish feature).
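The "use what is there, lose birth times otherwise" behavior can be sketched in Python. This is only an illustration of the degrade-gracefully pattern, not cligen's statx.nim logic; the function name `stat_with_optional_birth` is made up for the example, and whether `st_birthtime` exists depends on the platform and Python version.

```python
import os

def stat_with_optional_birth(name, dir_fd=None):
    """Return (size, mtime, birth_time_or_None) for `name`.

    Hypothetical helper for illustration. When dir_fd is given, os.stat
    resolves `name` relative to that open directory, which on POSIX
    systems is typically implemented with an fstatat-style call.
    """
    st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
    # Birth time is an optional field: absent where the platform (or the
    # stat interface in use) does not report it, so fall back to None.
    birth = getattr(st, "st_birthtime", None)
    return st.st_size, st.st_mtime, birth
```

Callers then treat `None` as "no birth time available" rather than failing, which is the same trade-off as falling back from statx to fstatat.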

Also note that to use https://github.com/c-blake/batch with cligen/dents.nim, as per the comment in dents.nim, you must compile with -d:batch --cincludes:$BAT_DIR/include, with $BAT_DIR being wherever you installed the linux/batch.h header file. Things should all work fine without that, but you won't get the speed-up.

Even with that, if the kernel module is not installed, batch.h auto-detects this and operates in "emulation mode", where I just do the loop in user-space instead of in the kernel. You probably want to run strace on a small test directory to know how things are working. batch hijacks the afs_syscall slot. Oh, and to time the impact of batching, you can always force emulation mode by setting BATCH_EMUL=anything in the environment (it must be exported, so either VAR=1 cmd or export VAR).
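As a rough model of that mode selection, here is a Python sketch. It is not the logic in batch.h: `pick_mode` and `run_emulated` are hypothetical names, and treating "the variable is present at all" as the trigger is an assumption based on the "BATCH_EMUL=anything" wording.

```python
import os

def pick_mode(module_loaded, environ=None):
    """Decide between the kernel path and user-space emulation.

    Assumption: BATCH_EMUL forces emulation whenever it is present in
    the environment, regardless of its value.
    """
    env = os.environ if environ is None else environ
    if "BATCH_EMUL" in env:
        return "emulation"
    return "kernel" if module_loaded else "emulation"

def run_emulated(paths):
    """Emulation mode in miniature: one lstat call per path, looped in
    user space, with per-item errors recorded instead of raised."""
    results = []
    for p in paths:
        try:
            results.append(os.lstat(p))
        except OSError as err:
            results.append(err)
    return results
```

Timing the same workload with and without the variable set is then an A/B comparison of the two paths.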

I just did an experiment on an i7-6700k and found a 3.3+x speed up for a read-write copy loop with 4096 byte buffers. Obviously there the workaround is to use a (much) bigger buffer, but stat has no such workaround. And on an AMD 2950x there was hardly any speed-up at all since those boot with a "nopti" due to the lack of a Meltdown bug. So, YMMV. In any given era, some CPUs will have very slow or very fast user -> kernel transitions.
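To make the buffer-size point concrete, here is a minimal Python sketch of a read-write copy loop that counts how many read and write calls it issues. It is not the benchmark from the experiment above; `copy_loop` is a name invented for this example, and it measures call counts rather than time.

```python
import os

def copy_loop(src_fd, dst_fd, bufsize=4096):
    """Copy src_fd to dst_fd with unbuffered os.read/os.write.

    Returns the total number of read and write calls made, which is a
    rough proxy for the number of user -> kernel transitions.
    """
    calls = 0
    while True:
        chunk = os.read(src_fd, bufsize)
        calls += 1
        if not chunk:          # empty read means end of file
            return calls
        view = memoryview(chunk)
        while view:            # os.write may write less than requested
            written = os.write(dst_fd, view)
            calls += 1
            view = view[written:]
```

Copying the same file with a much larger `bufsize` cuts the call count roughly in proportion, which is the workaround mentioned for copies. A per-file stat has no equivalent knob: each path costs its own call unless the calls are batched.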

Personally, I mostly think batch is an interface that "should always have been" in Unix (and probably in IPC and RPC stuff, though asynchrony is the usual workaround there). Having a way to assemble mini-programs changes the granularity/style of what makes for useful interfaces to the kernel. It might be a ship that sailed long ago and crashed into the Isle of The Sirens, but Linux is also in a post-POSIX era of "screw it, we'll add what we want", like statx, actually. { Well, they may have always been in such an era if you follow kernel devel at all. :-) } I should try to do a similar module for FreeBSD and maybe OpenBSD. In some ways it is an almost trivial syscall.

kaushalmodi commented 4 years ago

Thanks! I confirm the fix.

> Also note that to use https://github.com/c-blake/batch with cligen/dents.nim, as per comment in dents.nim

https://github.com/c-blake/batch/issues/1

> $BAT_DIR being wherever you installed the linux/batch.h header file

I ran find to find if my system has that file, but apparently, I don't have it.

> if you follow kernel devel at all

Nope, my hacking is only at user level because my only exposure to Linux has been through work, where I don't have sudo access.

c-blake commented 4 years ago

Yeah. You won't have it until you install https://github.com/c-blake/batch which is just my own invention. It's ok. You just seemed interested in the fastest ftw in The West on the Nim forum and this facility is that, but also more general. If you like my du.nim for other reasons like its nicer help or the ability to abbreviate long option flags or get spellcheck on typos or whatever other cligen goodness then you can just compile it without -d:batch.

{ FWIW, something that "somewhat" bypasses the usual Unix permissions can be even faster. Namely, if in the kernel you look up the stat data from the inode that you get back from getdents instead of from a pathname. The speed boost may well just come from having a dev,ino key instead of a string key for the hash lookup. Technically, though, a user process isn't supposed to be able to stat/verify the existence/owner/times/size/etc. of some inode just by the inode number. It could be under a directory marked rwx --- --- and not owned by you, for example. But if you fudge that guarantee you can get an even faster ftw. ;-) I understand this is about as deep in the weeds of systems programming as it gets. }
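For the user-space side of that picture, a small Python sketch: reading a directory already yields each entry's inode number, but everything else (owner, times, size) still needs a path-based stat, which is where the permission checks happen. The function names here are invented for illustration, and nothing in this block performs the in-kernel inode lookup described above.

```python
import os

def dir_inodes(path):
    """Map entry name -> inode number, using only what the directory
    read reports (no per-entry stat call on POSIX systems)."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}

def dev_ino_key(path):
    """Build the compact (device, inode) key for a path. Note that this
    needs a path-based lstat, subject to the usual permission checks."""
    st = os.lstat(path)
    return (st.st_dev, st.st_ino)
```

A (dev, ino) tuple is also the usual key a du-style tool uses to avoid counting hard-linked files twice.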