python / cpython

Figure out extended attributes on BSDs #57187

Open benjaminp opened 12 years ago

benjaminp commented 12 years ago
BPO 12978
Nosy @etrepum, @ronaldoussoren, @benjaminp, @ned-deily, @hynek, @koobs, @worr
PRs
  • python/cpython#1690

Files
  • initial_freebsd_xattrs.patch
  • extattrs.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['3.8']
    title = 'Figure out extended attributes on BSDs'
    updated_at =
    user = 'https://github.com/benjaminp'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'ned.deily'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation =
    creator = 'benjamin.peterson'
    dependencies = []
    files = ['39095', '40946']
    hgrepos = []
    issue_num = 12978
    keywords = ['patch']
    message_count = 12.0
    messages = ['144028', '144035', '155505', '155756', '165551', '165578', '193732', '241392', '247000', '247308', '254089', '327502']
    nosy_count = 10.0
    nosy_names = ['nicholas.riley', 'bob.ippolito', 'ronaldoussoren', 'benjamin.peterson', 'ned.deily', 'Arfrever', 'hynek', 'koobs', 'worr', 'billyfoster']
    pr_nums = ['1690']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue12978'
    versions = ['Python 3.8']
    ```


    benjaminp commented 12 years ago

    Extended attribute support currently exists in the os module for Linux. The BSDs (including OS X) have a similar (but of course incompatible) interface. They should be exposed through the same functions. For example,

    os.getxattr("myfile", "user.whatever")

    should call on the C level

    getxattr("myfile", "user.whatever", value, sizeof(value), 0, 0);
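A minimal sketch of the Python side of that call, as it can be written today against the Linux-only API. `read_xattr` is a hypothetical helper name used only for illustration; it is not part of the os module.

```python
import os


def read_xattr(path, name, default=None):
    """Return the value of extended attribute `name` on `path`, or `default`.

    Hypothetical helper for illustration; not part of the os module.
    """
    # os.getxattr only exists where CPython wraps the xattr syscalls
    # (Linux at the time of this issue), so guard before calling it.
    if not hasattr(os, "getxattr"):
        return default
    try:
        return os.getxattr(path, name)
    except OSError:
        # Missing file, missing attribute, unsupported file system, ...
        return default
```

With BSD and OS X support behind the same function names, code like this would keep working unchanged, apart from any error-code differences between platforms.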

    ned-deily commented 12 years ago

    Have you looked at Bob Ippolito's xattr module which has been out for some time and wraps Linux, OS X, BSD, and Solaris extended attributes?

    http://pypi.python.org/pypi/xattr

    d6f21f85-b290-44df-aedf-08939769be4d commented 12 years ago

    I've spent a few hours looking at xattr and the Linux/OS X (10.4+) implementations. Bob Ippolito's xattr module implements the OS X xattr interface on Linux, Solaris (9+) and FreeBSD. Linux and OS X are pretty close; FreeBSD and Solaris are substantially different from either and the Solaris implementation is somewhat incomplete/broken.

    The OS X differences from Linux are:

    • Instead of l* functions, the XATTR_NOFOLLOW option

    • XATTR_NOSECURITY and XATTR_NODEFAULT are in the headers but essentially unavailable as the kernel code always returns EINVAL for them.

    • XATTR_SHOWCOMPRESSION to expose the HFS compression stuff, which I can't imagine many people needing

    • XATTR_MAXNAMELEN (but no equivalent to XATTR_SIZE_MAX). Linux has a corresponding XATTR_NAME_MAX, which we should probably expose too.

    • XATTR_FINDERINFO_NAME and XATTR_RESOURCEFORK_NAME for some standard attribute names. I would imagine these are worth exposing.

    I don't see any problems supporting the currently exposed Linux API on OS X (I could probably find a usable value for XATTR_SIZE_MAX), but it's unclear if that is the right way to go forward.

    Suggestions?

    benjaminp commented 12 years ago

    2012/3/12 Nicholas Riley <report@bugs.python.org>:

    > Nicholas Riley <com-python-bugs@sabi.net> added the comment:
    >
    > I've spent a few hours looking at xattr and the Linux/OS X (10.4+) implementations. Bob Ippolito's xattr module implements the OS X xattr interface on Linux, Solaris (9+) and FreeBSD. Linux and OS X are pretty close; FreeBSD and Solaris are substantially different from either and the Solaris implementation is somewhat incomplete/broken.
    >
    > The OS X differences from Linux are:
    >
    > • Instead of l* functions, the XATTR_NOFOLLOW option
    >
    > • XATTR_NOSECURITY and XATTR_NODEFAULT are in the headers but essentially unavailable as the kernel code always returns EINVAL for them.
    >
    > • XATTR_SHOWCOMPRESSION to expose the HFS compression stuff, which I can't imagine many people needing
    >
    > • XATTR_MAXNAMELEN (but no equivalent to XATTR_SIZE_MAX). Linux has a corresponding XATTR_NAME_MAX, which we should probably expose too.
    >
    > • XATTR_FINDERINFO_NAME and XATTR_RESOURCEFORK_NAME for some standard attribute names. I would imagine these are worth exposing.
    >
    > I don't see any problems supporting the currently exposed Linux API on OS X (I could probably find a usable value for XATTR_SIZE_MAX), but it's unclear if that is the right way to go forward.
    >
    > Suggestions?

    Thanks for looking into this. I think the best approach at the moment is to try to wrap these differences under the Linux API. It seems the biggest one will just be adding XATTR_NOFOLLOW for the l* calls.
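A rough sketch of how the Linux-style `follow_symlinks` choice could map onto the OS X `options` argument. The constant values are taken from the macOS `<sys/xattr.h>` header and should be verified locally; `darwin_options` is a hypothetical helper, not an existing CPython function.

```python
# Values from macOS <sys/xattr.h>; treat them as assumptions to verify locally.
XATTR_NOFOLLOW = 0x0001
XATTR_CREATE = 0x0002
XATTR_REPLACE = 0x0004


def darwin_options(follow_symlinks=True, flags=0):
    """Build the `options` argument for the OS X xattr calls.

    Hypothetical helper: Linux picks lgetxattr()/lsetxattr() to avoid
    following symlinks, while OS X expresses the same choice as a flag bit.
    """
    options = flags
    if not follow_symlinks:
        options |= XATTR_NOFOLLOW
    return options
```

For example, `darwin_options(follow_symlinks=False, flags=XATTR_CREATE)` combines both bits into a single value for the C call.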

    koobs commented 11 years ago

    FreeBSD (at least on 7.x, 8.x and 9.x) has the following syscalls available in its API:

    extattr_{get,set,list,delete}_{fd,file,link}

    And also has: EXTATTR_MAXNAMELEN

    http://www.freebsd.org/cgi/man.cgi?query=extattr&sektion=2&manpath=FreeBSD+9.0-RELEASE

    koobs commented 11 years ago

    And to clarify the no-follow-symlinks case on FreeBSD:

    extattr_{get,set,list,delete}_link "system calls behave in the same way as their _file counterparts, except that they do not follow sym-links." as per the man page.
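To make the naming scheme concrete, here is a small sketch of how a wrapper might choose among the twelve FreeBSD calls. `freebsd_syscall` is a hypothetical helper that only returns the syscall name; it does not call into the kernel.

```python
def freebsd_syscall(operation, target, follow_symlinks=True):
    """Return the name of the FreeBSD extattr syscall for an os.*xattr call.

    Hypothetical helper. `operation` is "get", "set", "list" or "delete";
    `target` is an int file descriptor or a path.
    """
    if operation not in ("get", "set", "list", "delete"):
        raise ValueError(f"unknown operation: {operation!r}")
    if isinstance(target, int):
        suffix = "fd"      # open file descriptor
    elif follow_symlinks:
        suffix = "file"    # path, symlinks followed
    else:
        suffix = "link"    # path, symlinks not followed
    return f"extattr_{operation}_{suffix}"
```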

    ronaldoussoren commented 10 years ago

    The OSX API also has a "position" argument for both getting and setting extended attributes. The position should be 0 for normal attributes, and can have other values when accessing the resource fork of a file.

    ca42c254-b3b7-4c1a-a963-14d73dbf257a commented 9 years ago

    Here's an initial attempt at implementing extended attribute support. Let me know if there's any interest.

    There's currently one deficiency, which is that the namespace isn't prepended to the attribute name when calling listxattr.

    Let me know if my approach is good, so I can continue fixing listxattr. All other unit tests pass.
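One possible shape for the missing piece, as a sketch only. It assumes the list layout described in the FreeBSD extattr(2) man page (each entry is one length byte followed by the name, with no NUL terminator); `parse_freebsd_list` is a hypothetical helper and is not taken from the attached patch.

```python
def parse_freebsd_list(buf, namespace):
    """Convert an extattr_list_*() buffer into Linux-style attribute names.

    Assumes the layout described in extattr(2): each entry is one length
    byte followed by that many name bytes, with no NUL terminator.
    `namespace` is the prefix to prepend, e.g. b"user".
    """
    names = []
    i = 0
    while i < len(buf):
        length = buf[i]
        start = i + 1
        end = start + length
        if end > len(buf):
            raise ValueError("truncated attribute list")
        # Prepend the namespace so callers see "user.foo", as on Linux.
        names.append(namespace + b"." + buf[start:end])
        i = end
    return names
```

Because FreeBSD lists one namespace at a time, a full listing would call this once per namespace and concatenate the results.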

    7dd88582-93ff-402b-af92-803f3f940131 commented 8 years ago

    Is there any chance of getting this finalized? I have been using William Orr's patch as a workaround for months now, but it would be nice to not have to manually apply it each version bump...

    ned-deily commented 8 years ago

    There certainly is interest in supporting extended attributes on additional platforms. Thanks for the patch, William, and the positive comments, Billy. Since this probably falls into the category of new feature, it should be targeted for 3.6, now that 3.5 is in feature-freeze and nearing release. The gating factor is getting a core developer to review and champion it.

    ca42c254-b3b7-4c1a-a963-14d73dbf257a commented 8 years ago

    After a considerable amount of rework, I've gotten something worth submitting. All unit tests pass, and it handles some of the more unfortunate differences between FreeBSD's extended attribute syscalls and Linux's.

    One of the bigger changes is that I reworked the calls to getxattr and listxattr. These used to be called with a small buffer, and if the size of the extended attribute(s) exceeded the buffer length, we'd throw out that buffer and start again with a buffer of the maximum possible attribute size allocated.

    I threw this out and opted for always making two calls: one to get the size of the buffer, and one to actually get the contents (or list of attributes). This works the same for both FreeBSD and Linux. FreeBSD's extattr_get_* and extattr_list_* calls unfortunately only return the number of bytes read, *not* the number of bytes in the attribute value or the list. That means that there's no real way to determine if we've read less data than there is in the attribute.
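The two-call pattern can be sketched in Python as follows. `raw_get` is a hypothetical stand-in for the syscall wrapper and `make_fake_backend` exists only so the sketch runs without touching a file system; neither comes from the patch.

```python
def get_with_size_probe(raw_get):
    """Fetch an attribute value with a size probe followed by a read.

    `raw_get(size)` is a hypothetical stand-in for the syscall wrapper:
    called with 0 it returns the value's current size as an int; called
    with a positive size it returns up to `size` bytes of the value.
    """
    size = raw_get(0)       # first call: ask only for the size
    if size == 0:
        return b""
    return raw_get(size)    # second call: read into a buffer of that size


def make_fake_backend(value):
    """Build an in-memory `raw_get` for demonstration purposes."""
    def raw_get(size):
        if size == 0:
            return len(value)
        return value[:size]
    return raw_get
```

If the value grows between the two calls, the second call returns a truncated value, and with only a bytes-read count coming back there is no way to notice; that is the limitation described above.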

    This passes the unit tests (on FreeBSD 10.1). I'd be interested to see results from other users and comments.

    ned-deily commented 5 years ago

    On 2018-10-02, worr asked on the python-dev mailing list:

    Can I get a review for python/cpython#46031[PR 1690]? It fixes bpo-12978 and has been sitting for a handful of years now. This adds support for os.*xattr on DragonflyBSD, FreeBSD and NetBSD.

    sorcio commented 1 year ago

    I wanted to pick this up, and port the PR to current Python and add support for macOS.

    There are some platform differences, and I need help deciding how to proceed. Some of these were discussed above, but let me summarize.

    1. Syscall interface on FreeBSD is substantially different. Previous work in the PR already takes care of this by wrapping the FreeBSD extattr_ API to make it closer to the Linux API. If this approach is accepted, no decision to be taken here.

    2. macOS interface is very similar to the one implemented by Linux. I decided to ignore the position argument (i.e. always pass 0), but it can be added later. So not much would need to change on the Python interface—but see the other issues.

    3. Linux uses ENODATA to signal an unset attribute. BSDs use ENOATTR. One issue here is that cross-platform code can't use errno in (ENODATA, ENOATTR) because on some platforms they are not both defined. I can see a few options:

      1. Leave it to users to check for the right error codes. This could break some code that uses if hasattr(os, 'getxattr') kind of guards and suddenly starts working on more platforms that are incompatible. One such example is in the stdlib (shutil).

      2. Make macOS/FreeBSD raise OSError with errno set to ENODATA to match the current behavior. On FreeBSD, this would mean exporting a fake ENODATA since it's not defined. Might introduce compatibility bugs in the future, if those systems decide to use ENODATA for something different. I'm not a fan of changing the meaning of constants, but this option would have the least compatibility impact with existing code, and is in line with the idea of reproducing the Linux API.

      3. Add a new exception class which inherits from OSError. Existing code would still work on Linux, and cross-platform code would need to migrate to catch the new exception. Probably my favorite option, if acceptable.
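To illustrate the trade-off, here is a sketch of what cross-platform code might look like under options 1 and 3. `MISSING_ATTR_ERRNOS`, `XattrNotFoundError` and `is_missing_attr` are hypothetical names; no such exception exists in the standard library.

```python
import errno

# Option 1 from the caller's side: collect whichever "no such attribute"
# codes this platform's errno module defines, since they may not both exist.
MISSING_ATTR_ERRNOS = frozenset(
    code
    for code in (getattr(errno, "ENODATA", None), getattr(errno, "ENOATTR", None))
    if code is not None
)


class XattrNotFoundError(OSError):
    """Hypothetical exception for option 3; not an existing stdlib class."""


def is_missing_attr(exc):
    """Return True if `exc` signals an unset extended attribute."""
    if isinstance(exc, XattrNotFoundError):
        return True
    return isinstance(exc, OSError) and exc.errno in MISSING_ATTR_ERRNOS
```

With option 3, callers would simply write `except XattrNotFoundError:` and the errno lookup would become unnecessary.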

    4. macOS requires attribute names to be UTF-8, and returns an error (EINVAL) if it doesn't get valid UTF-8. Currently Python accepts either bytes, str, or a path-like. path_converter converts any string or path-like using the file system encoding, ~which (as far as I understand) might not be UTF-8. Probably we should not use the same converter for macOS. Should we disallow bytes on macOS? What about path-likes?~ Edit: disregard this, file system encoding is always UTF-8 on macOS (correct me if I'm wrong) so no issue here.

    5. Linux defines a XATTR_SIZE_MAX (max byte size of an attribute value) which is not defined in either FreeBSD or macOS. But both of them might enforce a file-system-specific limit. In the FreeBSD case, it's unclear to me if the limit is ever exposed to user code. macOS has a pathconf/fpathconf API to query the limit for a specific path, and has some special cases[^special]. Either way, if we want to preserve compatibility, we need to make up a value for the constant.

    [^special]: «As a special case, the resource fork can have much larger size, and some file system specific extended attributes can have smaller and preset size; for example, Finder Info is always 32 bytes», PATHCONF(2)

    6. Linux also has XATTR_NAME_MAX (number of chars in an attribute name). This is currently not exposed by Python, but it probably should be. macOS has an equivalent XATTR_MAXNAMELEN. FreeBSD has EXTATTR_MAXNAMELEN, but with the proposed solution (the one implemented in the PR) the semantics would be slightly different because it only applies to the attribute name without the namespace qualifier; might be good enough.
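A sketch of how the two limits could be handled from Python. `FALLBACK_SIZE_MAX` is a made-up placeholder (it happens to match Linux's 64 KiB XATTR_SIZE_MAX), and `xattr_size_max` and `name_fits` are hypothetical helpers; the caller supplies the platform's name limit.

```python
import os

# Hypothetical placeholder for platforms that do not define a limit.
FALLBACK_SIZE_MAX = 64 * 1024


def xattr_size_max():
    """Return os.XATTR_SIZE_MAX where it exists, else the placeholder."""
    return getattr(os, "XATTR_SIZE_MAX", FALLBACK_SIZE_MAX)


def name_fits(name, limit, strip_namespace=False):
    """Check an attribute name (bytes) against a platform length limit.

    With strip_namespace=True the "namespace." prefix is not counted,
    which models the FreeBSD semantics described above.
    """
    if strip_namespace:
        _prefix, sep, rest = name.partition(b".")
        if sep:
            name = rest
    return len(name) <= limit
```

The same name can therefore pass on FreeBSD and fail on Linux, which is the "slightly different" semantics mentioned above.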

    I will prepare an updated PR. Meanwhile, I appreciate any input on these points.