Closed mon closed 1 month ago
NTFS Extended Attributes are certainly a niche topic and is typically only really used by Windows components but I do not equate them to Alternate Data Streams. EA are a lot more limited than ADS and is thus reflected by the API whereas ADS can store large amounts of data and are essentially another file in that path.
Even if I wanted to change it, this would be a breaking change.
Understandable re: it being a breaking change, and thanks for the clarification. I discovered this morning that Cygwin treats xattrs as EAs as well, so there's not quite the consistency I thought there was.
Would you accept a PR adding a nicer API for reading ADSs? I already discovered a few odd footguns, such as attribute names with ':' in their title being transparently mapped to \uff20.
I can think of three ways to implement it
I'm on the fence about this, to me I think the normal Windows path format with \\server\share\file:ads_name
is consistent with how Windows itself treats ADS entries. You should be able to create/edit ADS entries with the smbclient.open_file
and smbclient.remove
should be able to delete them. The only API that's really missing is one to enumerate ADS names which we can certainly add if needed.
I already discovered a few odd footguns, such as attribute names with ':' in their title being transparently mapped to \uff20
Are you able to expand on this problem?
I think the normal Windows path format ... is consistent with how Windows itself treats ADS entries
Yeah that's fair, I'd be fine with that continuing to be the way to open ADS stuff.
Are you able to expand on this problem?
I spent an hour yesterday trying to find some source code to point at (especially because I want to know if it includes other characters), but came up empty. I'll look again tomorrow. I don't want to propose anything without having some code to point to.
Ok, I finally found the document explaining the character conversions, from an old Microsoft KB article that disappeared. https://mskb.pkisolutions.com/kb/117258
You'll note this doesn't mention the colon. That's explained by looking at the macOS NTFS source.
I'm unsure what the problem is sorry, I can potentially see the macOS client converting :
to a unicode characters mentioned but smbprotocol/SMB itself doesn't do anything special here. Doing :
in the filename will separate the filename and ADS entry as expected. Are you saying smbprotocol
will also do the conversion?
SMB doesn't do anything special here, you're right - what happens at the moment is it'll error when creating illegal filenames (assuming it's an NTFS server).
I finally found the right config option (mapchars) used in the Linux cifs mount driver (which implements SMB support), so it's not just macOS that implements this mapping. They started it though, software such as TrueNAS calls it "Apple Style Character Encoding".
Honestly though it's a bit of a mess (there's also a horrible 8.3 mapping option), and I understand if this isn't something you want to support. I would be totally happy with just implementing a "list streams for file" function and leave all this madness sitting in this issue.
I think adding the list makes sense to me. The client will work directly with the SMB protocol and doesn’t deal with any of the path char mappings other clients do. I don’t think trying to support the same magic behaviour here makes any sense.
Ultimately we provide a path to the server as is without any changes, it’s up to the SMB server to decide how to treat the chars in the path. For Windows the :
is valid in the filename if the underlying volume supports ADS (which NTFS does).
Alright, I'll get on that! An example list of streams:
:test_ads_stream:$DATA
:com.apple.provenance:$DATA
::$DATA
Should I return the names as-is, or split off the $DATA at the end?
As is, IIRC you can specify custom tags in the ADS (default is $DATA
) so just giving back the values that SMB returns makes more sense to me.
I've fallen into a bit of a rabbit hole today regarding xattrs.
There are currently two ways (connecting to a Windows SMB server, in this case my computer set to share a folder) to write extra attributes to files.
Alternate Data Streams, which uses the FileStreamInformation type
Also seen in the smbprotocol tests.
Extended Attributes, which uses the FileFullEaInformation type
The former is what
ntfs-3g
appears to map the concept of linux xattrs to (seen here), and is viewable using commandline tools on Windows:The latter, far as I can tell, is only retrievable by Windows C APIs such as NtSetEaFile (seen here).
Anecdotally, I can also confirm that macOS maps xattr calls over mounted SMB shares to the FileStreamInformation type.
In general, it's really very hard to find information on "NTFS Extended Attributes" because most people call Alternate Data Streams "Extended attributes" when they write about them. The best summary I found was part of an addendum to a SO question.
Based on what other clients and servers do, would it be reasonable to move smbclient's xattr implementation to instead use FileStreamInformation instead of FileFullEaInformation? Or did I miss something obvious!