genodelabs / genode

Genode OS Framework
https://genode.org/
Other
1.04k stars 246 forks source link

ioctl: support BSD-style extattr #4346

Closed ttcoder closed 1 year ago

ttcoder commented 2 years ago

I mean to implement special ioctl's in a couple VFS plug-ins I'm writing for Genode (ntfs, attr indexer, among others).

I just realized though, Genode never calls my ioctl() hook in my plug-in, there is a bit of "gate keeping" going on, it checks that the ioctl "opcode" relates to terminal geometry or opensound, and since it relates to neither, it bails out

via 1694 here https://github.com/genodelabs/genode/blob/4188427596f4f6a9b671f73fbf56d0be79352649/repos/libports/src/lib/libc/vfs_plugin.cc#L1694

and line 1783 here https://github.com/genodelabs/genode/blob/4188427596f4f6a9b671f73fbf56d0be79352649/repos/libports/src/lib/libc/vfs_plugin.cc#L1783

One way to solve this, would be to make Genode/libc more open/extendable : when it stumbles on an opcode it does not know how to handle, it delegates to the concerned VFS plug-in ; currently the plug-ins already default to IOCTL_ERR_INVALID in the base implementation of the ioctl() hook, so there would be little or not change, but those in need of a different hook would be able to implement new ioctl opcodes.

Possible pathways from here... Should I:

Muchas gracias as always :-)

nfeske commented 2 years ago

Regarding ioctl operations, we actually went into the opposite direction. Issue https://github.com/genodelabs/genode/issues/3519 condenses our approach.

For the background story, when creating the first version of the VFS for the Noux runtime, I added ioctl as operation because I wanted to mimic Unix in the most straight-forward way possible. But this decision was misguided and haunted us for a long time. We even discussed how to integrate ioctl support into the file-system session. I shudder when thinking of it.

Once we realized the mistake, we made an effort to switch away from ioctl at the VFS level. Of course, we have to preserve the illusion of ioctl operations being available at the libc API level. The solution is having the libc translate ioctl calls into file operations.

To get an understanding of the interplay, I recommend following the TIOCGWINSZ ioctl as handled by the terminal VFS plugin. The two most interesting commits are https://github.com/genodelabs/genode/commit/7ac32ea60c703a73477c121ab8269dfad201f428 and https://github.com/genodelabs/genode/commit/d516515c7a2840f265cdcc916038bc5020b3dc6e.

Nowadays, VFS plugins are using the clean approach using pseudo files. Closely related, we even went as far as mapping all socket operations to pseudo files (similar to plan9). Also, POSIX signal delivery is now handled by watching pseudo files, which I find quite elegant (https://github.com/genodelabs/genode/issues/3546). The ioctl method you found in the VFS is merely a residual piece of dirt that we missed sanding away.

Coming back to your planned addition of extended file semantics, I think the best way to go is

  1. First coming up with a concrete list of ioctl operations you want to support along with how they should be named at the libc API level. I guess that you already have some patterns in mind, possibly from Haiku?
  2. Then, the next step would be thinking about how those operations could appropriately be expressed as read/write interactions with pseudo files. This part would ultimately become part of the libc.
  3. Once that has become clear, those pseudo files could be added to the VFS plugin that supports the specific ioctls.

The result will be a sustainable solution that properly supports both the mounting of the VFS plugin local to an application and to a VFS server (connected to the application via a file-system session).

Of course you may patch Genode as much as you like to make it fit for your purpose. :-) If you need a short-term fix a specific feature that's probably preferable. When looking further down the road, however, it would be nice if you keep the pseudo-file approach in the back of your head.

ttcoder commented 2 years ago

Ah so that's the new way to do things, that I stumbled upon several times the past couple weeks, without quite understanding it :-) I think I got it ;

Starting from the beginning: the idea would be to emulate e.g. the FreeBSD extattr API, which allows to create/set/get/remove/enumerate xattrs with just four different functions.

Side-note : compared to the Be API, functions within that API lack one argument (they consider attribute name and attribute value, but not attribute type, which they always assume to be "string"), but BeOS/Haiku have quite a bit of experience working around that, by hack-encoding the attribute type within the "name"... and dealing with these weird "name namespaces" that the rest of the world seems to use... And anyway I'm most likely targetting NTFS et alia, which have no notion of attribute type either ; so yea let's go with FreeBSD/GNU style API.

(example one two )


From what I understand I would need to do this:

a) Application layer: my TTS apps would remain unchanged, still do this:

BFile file("/boot/Music/PinkFloyd/TheWall.mp3")
file.WriteAttrString("Audio:Artist", "Pink Floyd")  // creates (or clobbers, if pre-existing)
or ReadAttrString..  // returns the attr value (or empty string, if does not exist)
or RemoveAttr..  // deletes the attribute
or ListAttributes..  // returns the full list of attr names (e.g. a vector<string> if you will)

b) Haiku-on-Genode layer: in my library that makes apps compatible with Genode, I would implement BFile/BNode/Storage Kit to invoke ioctl(), as in:

BNode::WriteAttrString(attr_name, attr_value)
{
    status = ioctl(fd, XATTR_SET, attr_name, attr_value)  // or instead of 2 args past the opcode, pass just one pointer to one struct that holds both args ?
}
or XATTR_GET  (how to allocate the destination buffer to which the attr value will be copied, do we need XATTR_STAT or some such ?)
or XATTR_RM
or XATTR_LIST

(calls to file.ReadAttr(B_INT32_TYPE, &myint...) would be implemented under-the-hood with string vals)

c) libc/plugins/vfs would need to be modified to accept those 4 new XATTR_* opcodes (?)

What would be the "device" that the ioctls would apply on ? For my use, all CONTINUOUS_FILES of a given disk device/partition should be able to have any number of (string) xattrs ; though it would be very very nice if symlinks and (especially) directory nodes could have attributes too.

d) my VFS plug-ins would provide a Compound_file_system with an "Info" member, to support the above ?

If the single-file-system's have to be scalar (cannot be lists of strings), maybe we would need a notion of "currently-selected" xattr ; this would impact the API upstream a little, it would then look like:

ioctl(fd, XATTR_SELECT_ATTR, "Audio:Artist")  // or ioctl(fd, XATTR_SELECT_ATTR, 0) ?
ioctl(fd, XATTR_SET_CURRENT, "Pink Floyd")

And then the VFS plug-in would provide:

Value_file_system<unsigned> &_index_of_currently_selected_attr;
Value_file_system<string> &_currentattr_name;
Value_file_system<string> &_currentattr_value;

I didn't quite wrap my head around it all yet, so the above might not make sense. I see some references to XML for instance, in the OpenSound ioctl code, not sure if it would apply here as well.


EDIT: actually, this seems to be the FreeBSD extattr header, if I'm not mistaken... Which (if I may be so bold :-) seems to be weirdly complex for what little it does, strange... https://github.com/lattera/freebsd/blob/master/sys/sys/extattr.h

nfeske commented 2 years ago

I think you got a pretty clear picture. Thanks for taking the time to write up your reflection.

I'm unfamiliar with extended file-system attributes. So please take my advice with a grain of salt. When attaching meta information per file/directory/symlink, wouldn't one reasonable way be adding a companion pseudo directory for each file/directory/symlink? E.g., the artist of /boot/Music/PinkFloyd/TheWall.mp3 could be stored at /boot/Music/PinkFloyd/.TheWall.mp3/attr/title (note the dot preceding TheWall..)? Providing one ioctl directory per inode has two advantages over placing one ioctl directory at the root of the file system:

Regarding the implementation of the idea, in contrast to the rather static device-like VFS plugins (terminal, OSS, block), I don't think that the pseudo ioctl directories/files need to actually exist in form of C++ objects in our case. They could just be pulled out of thin air whenever the libc application wants to interact with them. E.g., for each file/symlink/directory, the VFS plugin would magically report two directory entries instead of one - one referring to the actual file and one directory prefixed with the dot that contains the setters/getters of attributes as pseudo files.

You know, we regularly use XML to garnish things in Genode just to trigger people. Just joking. In some cases, it is useful to let pseudo files return structured information that can be easily parsed. Keep in mind that there is at most only one client interpreting those pseudo file - our libc. In principle such structured information could be represented as multiple files. But in practice, parsing one structured file is simpler and multiple values can easily be delivered as one transactional change. E.g., terminal width and height are supplied as one change of the .terminal/info file whenever a terminal-window is resized.

ttcoder commented 2 years ago

Agreed on consistency-by-proximity for sure. This is a major reason for storing metadata in xattrs (instead of e.g. in a separate SQL database) : making sure that if the file (node) is renamed, then the metadata follows, if the node is deleted then the metadata is deleted with it, if it's moved then it's moved, and so on. So localizing the metadata as close as possible to its node sounds good to me.

Seems we're getting close. Here's my take on the different implementation alternatives:

For the XML part: the perspective I have, is about extending support (now now, but some day) not just to "string" metadata, but also to Album Art/Cover Art, in the form of PNG of JPEG files of a small size, probably less than 10 KB.

For the third part: I guess this would not have an impact on performance, this code would only ever be exercised when the libc receives an ioctl with the corresponding XATTR_* opcode... Something I have to keep in mind too : on any ioctl() call, the file descriptor parameter passed to it would have to be that of the "dot" hidden/virtual file/node-counterpart, not of the file descriptor of the file/node itself.

I'm not sure if it would be up to my layer to handle the distinction, i.e.

BNode::WriteAttrString("song.mp3", ..)
{
    BNode node2;
    node2.Open( ".song.mp3" )
    ioctl(node2.Fd(), XATTR_SET, ..)
}

Or if the subtlety would be handled down in the libc, i.e. libc would be the one aware of the hidden connection between the real physical node and the virtual "dot" node.

Oh, something that comes to mind, what of actual "dot files" (physical, not virtual) ? Such files can be created by the end-user, or are (mandatorily) created by linux-style software like git, fossil, etc.... I'm tempted to say they wouldn't create trouble, in fact the scheme we're discussing is "grammar clean", i.e. no matter the number of dots prefixing a filename, we just add an additional dot to access its metadata, so no part of the code will confuse data and metadata, and (no that it matters much) we could even imagine that linux-style "dot" files can have extended attributes just like any other files if they like.

nfeske commented 2 years ago

My remark about XML was merely a justification, not a suggestion for your use case. Intuitively, if there are potentially multiple attributes that are to be accessed individually, I'd probably represent each piece of data in a different file (named after the attribute name), not using any syntax at all.

But this does not need to be an either-or-decision. Since we are speaking of pseudo files, they are cheap. So both representations can even exist in harmony. E.g., the terminal VFS plugin (https://github.com/genodelabs/genode/blob/master/repos/os/src/lib/vfs/terminal_file_system.h#L309) provides a .terminal/info file with XML data as well as the individual files .terminal/rows, .terminal/columns, .terminal/interrupts that contain mere numbers.

Exposing dot files to the world, via opendir/readdir/rewinddir(): a big no-no :-)

That's a reaction I can understand. On Unix, dot files (or directories) are commonly considered as informally hidden. So this technicality does not stick into the eyes of users at the GUI level normally. In any case, the listing of ioctl pseudo files could even be made configurable at the VFS plugin node. The decision could be up to the user.

I imagine that the libc's ioctl() would have to attempt to open them "in the blind"

That would work fine too.

Or if the subtlety would be handled down in the libc, i.e. libc would be the one aware of the hidden connection between the real physical node and the virtual "dot" node.

That's it! The mechanism remains transparent to the application. E.g., the ioctl for obtaining the terminal size is issued on /dev/terminal and does not know/care about the presence of /dev/.terminal/. The libc turns this operation into read/write of the content of the ioctl pseudo directory (see https://github.com/genodelabs/genode/blob/master/repos/libports/src/lib/libc/internal/vfs_plugin.h#L52 how this path is obtained from a libc file descriptor).

Oh, something that comes to mind, what of actual "dot files" (physical, not virtual) ?

As an ioctl for setting an attribute would indeed actually be a plain write to, e.g., /boot/Music/PinkFloyd/.Animals.mp3/attr/artist, a file system w/o support for attributes could still handle it. The operation would then result in a plain file with the attribute value as content. In contrast, a file system with attribute support would turn this write operation into a special file-system operation. The latter could be seen as an optimization of the former.

I find this thought experiment quite interesting when looking at file archivers like zip. The attributes - being represented as plain files - could in principle be archived together with the actual files even if the archiver does not support attributes. Well, in practice, one would be faced with corner cases such as the order of extracting files, like what happens when an attribute file is extracted before its corresponding real file. Sorry, I digressed ;-)

In my opinion, there is some kind of beauty to representing attributes as plain and visible files, merely residing in an (informally hidden) dot directory. But as I said above, I'm a novice about using file-system attributes.

ttcoder commented 2 years ago

Really enjoying our daily conversations!

'Roger that' on making the expose-to-world setting configurable, makes sense.

So I had more to chew on today, but following your directions made it easy enough, and it works as expected, my log shows I'm hitting all the right marks, and I'll soon need to implement actual storage with the NTFS back-end code.

I'm also preparing a little write-up about "name clashes" ; though there's not all that much to say about it really: it's just that if someone has a ".git/" folder, and accidentally (or maliciously) creates a "git" file next to it, the plug-in will decide to treat any access to ".git" as an access to the actual physical node of that name, and not as an ioctl request on "git", thus the xattrs of "git" will not be read/writeable, they will be "shadowed" (until the node named "git" gets renamed, our moved out elsewhere). I guesstimate there's no need to get too deep thinking about that scenario, since it probably doesn't matter much in normal use.

Zip is an interesting choice of example... Because it already perfectly handles extended attributes, not just BSD or Linux style, but even BeOS-style (which can run in the dozens of KB, unlike their peers). When someone in e.g. Windows opens the zip, the xattrs are silently ignored, the end user does not even realize they exist, whereas if someone with BFS opens the zip, the xattrs gets applied in full. Very elegant. It's been the archival format of choice for Be derivatives since the late 1990s for that very reason ;-)

For the anecdotal value: I've also had a Mac enthusiast send me zip archives, of his "resources forks"... There it's a little less enjoyable, because the expanded archive spills out lots of subfolders named "___MACOS_rsrc" or some such, as I remember (would have to check).

The "emulate attributes as deep folder hierarchy" idea, hmmm... I'm not sure I will have the time to go that deep, as it would require lots of mkdir() and such code ; above all, my concern is that it would mean the association would be lost (between e.g. "myfile" and ".myfile/") if "myfile" is renamed or moved around to someplace else on the hard drive... It would be fragile, break the consistency principle. Although that, too, might be solved with enough coding and complexification of the plug-in (!)

You'll see in the vfs plug-in that I've had to refactor the path split/break-down/handling into a separate class, since the logic needs to be applied in several hooks, including directory(), open(), read(), etc.

This can be perceived from the Log:

[init -> test-libc_vfs] calling ioctl(fd, XATTR_SET, &tx_pair) ioctl stuff.............
[init -> test-libc_vfs] ---->directory path=/testdir/.test.tst/attr/userspace:Audio:Artist
[init -> test-libc_vfs] ---->directory path=/testdir/.test.tst/attr/userspace:Audio:Artist
[init -> test-libc_vfs] ---->open path=/testdir/.test.tst/attr/userspace:Audio:Artist mode=2
[init -> test-libc_vfs]   => VirtualizedXattr path=/testdir/.test.tst/attr/userspace:Audio:Artist
[init -> test-libc_vfs]   litteral=attr dot_leafname=.test.tst attr_name=userspace:Audio:Artist
[init -> test-libc_vfs] this is probably an ioctl, so deal with parent
[init -> test-libc_vfs] deal with: /testdir/test.tst
[init -> test-libc_vfs] Yes, the physical counterpart does exist, so perform ioctl!
[init -> test-libc_vfs]   ..=> VirtualizedXattr Success, We'll handle this as an xattr
[init -> test-libc_vfs] Warning: should open handle on attribute, but N/I yet
[init -> test-libc_vfs] ioctl(fd, XATTR_SET, &tx_pair) failed, ret=-1, errno=45
[init -> test-libc_vfs] Error: Uncaught exception of type 'Test_failed'

Today's attachements: LibC was very simple to change indeed: ticket-early-diff.txt

Edit: the "sizeof(pair->out_value));" is obviously bogus, will always return sizeof(voidptr)

The plug-in has to "get involved" (commited ?) a little more: (you know the joke about how a chicken gets "involved" but a pig gets "commited", to an egg-and-bacon breakfast) vfs_fuse.cc.zip

EDIT2: While reviewing the day, I realize I almost made an orientation mistake for tomorrow's work: I should not call the NTFS code directly, since vfs_fuse.cc is the "fuse abstract" part, the file that is abstract from the underlying (ext2fs, exfat, ntfs..) back-end ; so in the open() etc hooks I should call fuse::op()->setxattr() and friends (the FUSE API already has provisions for xattr) and look into implementing them, including the back-end specific (ntfs...) implementation.

Edit3: There, NTFS xattr work :-). Just had to set compile flag "-DHAVE_SETXATTR", create a dummy sys/xattr.h header file, switch "stream types" in init.cc, and now I can call fuse.op.setxattr() etc from the VFS plug-in and read/write extended attributes to the test ntfs file system, yay.

nfeske commented 2 years ago

Really enjoying our daily conversations!

Likewise. It is very enjoyable to watch your enthusiasm at work. :-)

I'm amazed about your rapid success. Seeing "NTFS xattr work" so shortly after we sketched the rough plan is super cool.

I'm not sure I will have the time to go that deep [...]

You are certainly right. I completely missed the many possible consistency issues of the various file operations like unlink or rename.

ttcoder commented 2 years ago

One last-ish note about ioctl opcodes:

My second and last (?) ioctl related project will need to add yet another few opcodes such as TTS_CREATE_INDEX, TTS_DELETE_INDEX, TTS_QUERY_BY_FORMULA etc. Those ones will be device-specific (i.e. I"ll call ioctl() on the device, that is to say on the "/" root directory of the device I think, since an attribute index is a partition-wide thingy) (this will be for a virtual device running on top of the Fuse-ntfs device) (I can elaborate if there is interest).

As per our discussion, new ioctl opcodes are rejected by libc unless they are either upstreamed, or attack a forked upstream. I'm probably headed to do the latter.

I'm mentionning it here to foster situational awareness, but those features are likely to be excessively business-specific logic, unfit for being upstreamed to Genode and being included in-core (maybe even for Genode-world, I think), so that'll be my first foray into making a clean, enduring (rather than temporary) Genode patch, to be shipped with the rest of the h-o-g project.

(unless the same thing can be accomplished more cleanly with a "libc plugin" instead? didn't think of it that way before)

Edit: while I'm brainstorming ^^... Another way to look at it is, do away with ioctl() completely : since ioctl just resolves to FS ops anyway (on dot-files especially), I could try to do FS ops on dot files right at the libc level already, then my code is portable against any future version of Genode...

@Norman hopefully this is the last headache I give you until next year, I promise to be a good boy now ^^

nfeske commented 2 years ago

I appreciate that you are putting yourself in our shoes. In the back of my head, I indeed perceive ioctl as a "legacy" interface that we need to provide to attain compatibility with the existing body of POSIX software. Supporting broadly used opcodes is valuable because it lowers the friction between Genode and those broad uses. E.g., the support for the ioctls for OSS nicely bridges the gap between many existing audio applications and our system. Conversely, the value of a completely new opcode that is not used by any existing software (not because it would be bad, but just because it's new) is not very tangible.

That said, please don't be shy to propose changes. If there are solid arguments that speak for new ioctl opcodes - in particular with the prospect of finding adoption outside of Genode - we should definitely not dismiss changes per se.

Another way to look at it is, do away with ioctl() completely : since ioctl just resolves to FS ops anyway (on dot-files especially), I could try to do FS ops on dot files right at the libc level already, then my code is portable against any future version of Genode...

I wholeheartedly agree. If I understand you correctly, both ends of the protocol (the application side and the driver side) are solely spoken by your custom software. In this case, there is no apparent benefit in tunneling those interactions through ioctl calls. I think it is best to make use of the flexibility of Genode's VFS-plugin concept instead, which makes is so easy (relatively spoken) to model custom device interfaces as pseudo file-system hierarchies. No need for negotiating ioctl opcodes. And as another plus, more transparency for the user (compared to ioctl, which has somehow subverted the beautiful Unix mantra of "everything is a file").

ttcoder commented 2 years ago

Tentatively closing this ticket, as I verified I can access xattr data with special "dot" paths from LibC, without modifying Genode's ioctl opcodes support indeed. It was a little more involved than I thought, as Genode's VFS unexpectedly changes its mechanics completely between open() being called from vfs and open() being called from libc, it calls different hooks in my plug-in (it calls stat() several times on each component of the path instead of calling directory() on the full ready-to-parse path), etc, but didn't take long to adapt anyway.

Independantly of the above discussion, I'm thinking that I'll leave xattr stuff out of ticket genode-world:193 (FUSE support in Genode). I'll let Josef finish the FUSE plug-in as-is, lean and simple. Then I'll create a "fork" of Genode's FUSE plug-in with all my xattr code on top of it. (or if that's too hasty, tell me so and we'll discuss that aspect in ticket 193).

I'll probably be more "pushy" for the ntfs-3g port though, since the modification I need there only consist of two lines of code to change :-) (i.e. optionally change NF_STREAMS_INTERFACE_NONE to NF_STREAMS_INTERFACE_XATTR, and add a #define HAVE_SETXATTR in makefiles). That would be "nice to have" as such changes won't harm other users of ntfs-3g, but will allow me to use NTFS with native xattr "out of the box" without patching it. We'll see.

nfeske commented 2 years ago

it calls stat() several times on each component of the path instead

That's done to deference symlinks along the path. The current implementation is indeed not ideal. It would be sensible to move this mechanism from the libc into the VFS, mainly for performance reasons. I mentioned this idea in https://lists.genode.org/pipermail/users/2021-May/007687.html but haven't got the time to implement it yet.

Please feel welcome re-open this issue once you feel that the xattr changes in the libc's ioctl handling are desired.

ttcoder commented 2 years ago

I might just take you up on that offer! Re-opening so as to remember to examine this further.

Indeed, moving forward on the haiku-in-genode side of the code, I ran into the "file descriptor is known, but file path is not known" problem again. Except this time I couldn't circumvent it with their "libroot-build" support and hacks, as FD/Path book-keeping seems to be absent in the xattr part of the code.

So I'll try it this way (the ioctl way), see if I can spare myself a re-architecture of the haiku-libroot-build-hacks code :-)

My tentative roadmap for the next few days/weeks...

1) get my code in shape, to make it work end-to-end (from up top in BFile/BNode, down to silicon, with ntfs-3g driver actually writing the xattr data to disk) ; I might have a (brittle) such chain done tonight, and get it consolidated next week, we'll see. 2) commit the code at my end (haiku-in-genode), and showcase it here, to show how I implement the BSD-style getattr/setattr functions with ioctl. 3) post a diff showing how I implemented the ioctl in vfs_plugin.cc, to (potentially) validate it as a simple, self-contained, short-and-sweet piece of code. The discussion might take a few iterations as I'm not sure which include header should contain the constants (#define XATTR_SET or rather enum ?), we'll see.

Hopefully everyone wins, as I believe xattr support might find other uses in the Genode community, outside of my narrow project... So I want to encourage you to encourage me to get xattr support in Genode (grin) !

P.S. for the anecdotical value : while grepping through Genode/contrib, I stumbled on a use of xattr I wasn't aware of : libarchive has fairly extensive xattr support, including for BSD-style API (see ARCHIVE_XATTR_FREEBSD). So I guess that makes potentially three archivers in reach, including GNU's tar which I seem to remember does the same : 1) info-Zip, 2) libarchive 3) tar.

PPS. I'm not worried about VFS performance for my use case. If I get the skills in the future, I might try and contribute profiling and such for the sake of the community and for pursuing the goal of self-hosted Genode able to run gcc efficiently, but for now it all feels good.

EDIT: I wonder if it would be interesting to explore an alternative way, whereby Genode xattr support is implemented with BSD-style getxattr_link() and getxatr_fd() functions located at the right place (in libports ?) such that they have access to FD/path book-keeping, allowing them to map an fd to a path directly and circumvent the need to create new ioctl opcodes. Hmmm I'm probably over-thinking this and complicating matters, I'll shut up and return to my code ^^

Edit2: there, the whole chain (from BNode to ntfs-3g) works ; the most work was with return code handling and errno handling, let's hope I got it right ; will turn my attention to the last two opcodes (for listattr and rmattr), and then prepare a patch proposal.

ttcoder commented 2 years ago

Almost there, one last opcode to implement/test, and the diffs to prepare.

In the meantime, here's a digression you might enjoy. I was thinking about our discussion on how to implement xattr support in the most elegant way, keeping with the "sub-folder of attributes" paradigm, and it occured to me, we could go much farther than that, and theorize a more far-reaching revolution... In the VFS itself.

The idea is to.... drum-rolls... Get rid of the distinction between directories and plain files. A file system would no longer be made of dirs and files, but of "nodes", combining the features of both dirs and files : any node can have any number of children and/or hold any quantity of data. Hence, Genode would support traditional file systems by adding the constraints : "a directory is a node with just children but no data" and "a plain file is a node with just data but no children". But it would also be able to come up with its own "everything is a node" revolutionary file system, and maybe it would also impact support for xattr-capable file systems, since they would be just a special case of its more general, more elegant native solution.

This theory would need to be fleshed out and probed for potential pitfalls... (Edit:) some of the bigger question marks would be about a Desktop OS based on such a file arborescence : how to present to the desktop user "nodes" that can hold both data and sub-nodes ? But if it does have merit, it might be something that Genode wants to 'borrow' someday ;) It would keep with the Genode flexibility spirit, and be something that other operating system would have more trouble implementing, beig less multi-faceted designs. But that 'unique-ness' would also discourage using that set of native-specific feature, since such a file system would not inter-operate well with the rest of the world.... Unless we found a way like Be Inc had done in the 1990s and find an archiver (info-zip ?) that is flexible enough to archive at our end and unpack at the other end... To be sure, that perspective kind of of dampens my enthusiasm.

On the other hand, NTFS might inter-operate with such a paradigm, to some extent. Fom what I understand of it so far, it seems NTFS somewhat gets close to that paradigm with its "named streams" concept, but the above proposal is a more complete generalization, to be applied at the VFS level, not just in the fs back-end. I keep being surprised by that file system BTW, it seems to have been designed by a team at MS that was more.... skilled than the rest of the MS outfit ^^. I seem to recall they were a team made of ex VAX or VMS engineers. Anyway that's it for the off-the-cuff ttcoder's-crazy-idea-of-the-day.

Actually -- the more I think about it, the more I think Genode's VFS layer might already allow for calling both opendir() and open() on the same node, I would have to test. Maybe "all" we need would be just the stroage fs backend, to have this idea come to life. If time permits I'll check out whether NTFS allows writing data (i.e. "named streams" in their parlance) to directories .

Edit: well it seems so... Or at least for an "xattr" named stream (which should have no bearing on the ntfs question), I can open a directory named "boot" and write an "audio:artist" named stream to it, and read it back.

ttcoder commented 2 years ago

So here's the diff for vfs_plugin.cc:

diff-1_ticket4346_ioctl.txt

Pre requisites:

My comments are probably too verbose, and maybe I should not call unlink() directly, but this gives a good idea of how to patch Genode, for those who want to add the feature.

File test/main.cc provided FYI only, its tests would fail for all VFS plug-ins except the one I tested with (ntfs-3g with special patches) so this test-harness change is not suitable in the general case.

Here's a real world example of code that calls the new ioctls to provide xattr support to applications: https://chiselapp.com/user/ttcoder/repository/genode-haiku/info/a0455de065f4e6c9

EDIT: After running more tests, it turns out that I was missing some 'errno' management in the XATTR_GET case ; here's the updated patch:

diff-2_ticket4346_ioctl.txt

Edit 2:

Found another discrepancy (in XATTR_LIST), this 3rd version makes errno handling closer to standard and BSD:

diff-3_ticket4346_ioctl.txt

nfeske commented 2 years ago

Thanks @ttcoder for keeping this issue updated. I haven't looked very close to your libc patch but it looks not very invasive to me, which is good.

Regarding your idea to merged the notion of files and directories, it looks interesting. So far, the design and features were primarily steered by the needs of POSIX. The distinction of files from directories just fell into place with out much questioning and stayed this way. It is really nice that you feel the urge to innovate. I'd say, give it a try! :-)

Maybe there is a chance for further simplification? It is hard to tell.

To test if the idea is viable in practice, one would probably need to try emulating all the semantics expected by POSIX software, answering questions like: How does the stat'ing of a directory work? Or how do symlinks fit into the scheme? Or if multiple file systems are mounted side by side with each offering the same directory, how will the merged list of the content of both places be generated? Would that be consistent with the role of regular files in a union mount that would shadow each other rather than appear as being concatenated?

ttcoder commented 1 year ago

Now that I'm at the "build a distro" stage I'm doing the rounds, trying to tie up loose ends, and ended up in this ticket.

It occured to me that the patch can be reduced to its simpliest form : just add an accessor that converts an fd to a filepath -- after all that's the primary job that ioctl() was really doing for me. So here's my patch, in case someone else needs to implement dot-paths magic in the future:

$ git diff repos/libports/lib/symbols/libc repos/libports/src/lib/libc/vfs_plugin.cc
diff --git a/repos/libports/lib/symbols/libc b/repos/libports/lib/symbols/libc
index e3de45d2d..95a088bdf 100644
--- a/repos/libports/lib/symbols/libc
+++ b/repos/libports/lib/symbols/libc
@@ -485,6 +485,7 @@ opterr D 4
 optind D 4
 optopt B 4
 optreset B 4
+path_for_genode_fd T
 pathconf W
 pause W
 pclose T
diff --git a/repos/libports/src/lib/libc/vfs_plugin.cc b/repos/libports/src/lib/libc/vfs_plugin.cc
index 89db7eab7..ece089d9e 100644
--- a/repos/libports/src/lib/libc/vfs_plugin.cc
+++ b/repos/libports/src/lib/libc/vfs_plugin.cc
@@ -224,6 +224,27 @@ namespace Libc {
 }

+#if 1  // HoG_GENODE
+//
+// Small-footprint 'private' accessor, which hai-on-genode uses to look-up fd's
+// (file descriptors) paths, without using ioctl(). This way, no need to patch ioctl() itself.
+//
+extern "C" int path_for_genode_fd( int fd, char * buf_into )//Genode::String<1024> into );
+{
+       using namespace Libc;
+
+       File_descriptor *fdesc = file_descriptor_allocator()->find_by_libc_fd(fd);
+       if (fdesc == 0)
+               return ENOTSUP;
+
+       //into = fdesc->fd_path;  //xx does not work, string remains empty for some reason, let's use an old-fashioned C buf for now
+       copy_cstring(buf_into, fdesc->fd_path, 1024);
+
+       return 0;
+}
+#endif  // ~HoG_GENODE
+
+
 /*
  * This function must be called in entrypoint context only.
  */

Initially I stumbled into something: the new accessor was not found at linking stage. By looking around I realized I need to add it to the "export" list it seems. Didn't add a thread-safety mutex as it does not seem to apply in that part of the code.

My test-suite is green across the board, everything looks good. I believe we can close this ticket :-) It's resolved to my satisfaction, with a two-liner I can easily add to any Genode release, or anyone else who might need FD-derived paths to create "dot paths".

nfeske commented 1 year ago

If you are fine with maintaining this little patch outside of Genode's source tree, I'd prefer that since the feature is not used by any of Genode's components. Should this become too inconvenient for you, please tell. So we can integrate it along with a regression test (otherwise it might be prone to be accidentally removed).

//into = fdesc->fd_path; //xx does not work, string remains empty for some reason

I guess the reason is that you passed into as value, not as reference. So your assignment merely modified the copy of the String within the local function scope.

Didn't add a thread-safety mutex as it does not seem to apply in that part of the code.

That's at least consistent with the other functions in file_operations.cc, which are prone to racing with close calls (for the same file descriptor) issued by another pthread. Since these corner cases are not fully addressed, we keep #725 open.

ttcoder commented 1 year ago

Dang, I must have been very tired when I wrote that pass-by-value :-p. Restored the code, added the missing "&" ampersand in my repo, and tested. Thanks much for the hint. Yeah I'm glad it ends up with such a small patch that takes just a few seconds to apply. My initial intuition of using an ioctl() does not make as much sense -- though I did learn useful things going down that road. Also I like to think that I try to follow the example of the Genode team, which does not hesitate to throw away suboptimal code, is ruthless in always looking for the more elegant solution :-) I'll post to the "usb" and "hda" tickets in the coming weeks as I make progress. EDIT -- just solved my NTFS problems too, yay