python / cpython

shutil.copystat should (allow to) copy ownership, and other attributes #74230

Open 0aad3882-5124-4640-a5e0-b18ca5bbebe6 opened 7 years ago

0aad3882-5124-4640-a5e0-b18ca5bbebe6 commented 7 years ago
BPO 30044
Nosy @giampaolo, @tiran, @eryksun, @zooba, @RJ722, @davidmrdavid

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['easy', 'type-feature', 'library', '3.10']
title = 'shutil.copystat should (allow to) copy ownership, and other attributes'
updated_at =
user = 'https://bugs.python.org/noctiflore'
```

bugs.python.org fields:
```python
activity =
actor = 'steve.dower'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'noctiflore'
dependencies = []
files = []
hgrepos = []
issue_num = 30044
keywords = ['easy']
message_count = 10.0
messages = ['291499', '319373', '373445', '373471', '373998', '373999', '374044', '384469', '384510', '388971']
nosy_count = 7.0
nosy_names = ['giampaolo.rodola', 'christian.heimes', 'eryksun', 'steve.dower', 'noctiflore', 'RJ722', 'davidmrdavid']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue30044'
versions = ['Python 3.10']
```

0aad3882-5124-4640-a5e0-b18ca5bbebe6 commented 7 years ago

shutil.copystat() copies permissions, timestamps and even flags and xattrs (if supported), but not ownership. Furthermore, the shutil.copy2() documentation up to 2.7 used to say that it behaves like cp -p, which preserves ownership but not xattrs or flags. (On my system it silently fails to copy ownership when not root.)

It may not be related, but the source-code comments in the except NotImplementedError block concerning chmod mistakenly mention chown-related functions.

I think copystat (and copy2) should at least provide an option to preserve ownership. I do not know whether it currently preserves SELinux context and ACLs, but if not, it could offer that as well. It would be really useful for replication or backup applications to have a function that copies everything it can.
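The behaviour described above can be checked with a small script. This is only a sketch: `stat_fields_preserved` is a hypothetical helper name, not part of shutil, and it simply compares a few `os.stat()` fields after a `shutil.copy2()` call.

```python
import os
import shutil


def stat_fields_preserved(src, dst):
    """Copy src to dst with copy2() and report which stat fields match."""
    shutil.copy2(src, dst)
    s, d = os.stat(src), os.stat(dst)
    return {
        # Permission bits and modification time are handled by copystat().
        "mode": s.st_mode == d.st_mode,
        "mtime": int(s.st_mtime) == int(d.st_mtime),
        # copy2() never calls chown, so dst keeps whatever owner and group
        # the OS assigned when the file was created. These compare equal
        # only if that happens to coincide with the owner of src.
        "uid": s.st_uid == d.st_uid,
        "gid": s.st_gid == d.st_gid,
    }
```

Run as an ordinary user against a file owned by someone else, the "uid" entry would be expected to come back False, which is the gap this issue is about.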

giampaolo commented 6 years ago

This could be achieved at least on Windows with CopyFileEx [1] and on OSX with copyfile(3) [2] + COPYFILE_ALL, which copies ACLs (but not users/groups). These are (were, in case of CopyFileEx) exposed in https://github.com/python/cpython/pull/7160/. Such new functionality would probably deserve a separate copy3() function, but it would serve OSX and Windows only. I'm not sure how things would work on other POSIX platforms.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/aa363852(v=vs.85).aspx

[2] http://www.manpagez.com/man/3/copyfile/

tiran commented 4 years ago

POSIX ACLs and SELinux context information are stored in extended file attributes. The information is copied from source to destination. POSIX ACLs are stored in xattr "system.posix_acl_access" and SELinux context in xattr "security.selinux".
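To see whether a given file carries these attributes, something like the following could be used. It is a rough sketch: `read_security_xattrs` is a made-up helper name, and it relies on `os.getxattr`, which Python only provides on Linux.

```python
import os

# Attribute names taken from the comment above.
SECURITY_XATTRS = ("system.posix_acl_access", "security.selinux")


def read_security_xattrs(path):
    """Return the ACL / SELinux xattrs that are set on path, as raw bytes."""
    if not hasattr(os, "getxattr"):
        return {}  # platform without xattr support in the os module
    found = {}
    for name in SECURITY_XATTRS:
        try:
            found[name] = os.getxattr(path, name)
        except OSError:
            # Attribute not set, filesystem without xattr support, or
            # insufficient permission: treat all of these as "absent".
            continue
    return found
```

Calling it on the source and on the destination after `shutil.copystat()` gives a quick way to compare what actually arrived.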

eryksun commented 4 years ago

In Windows, I wouldn't expect shutil.copy2 to preserve the owner and ACLs. They change whenever a file gets copied via CopyFileExW. Keeping them exactly as in the source file generally requires a privileged backup and restore operation, such as via BackupRead and BackupWrite. Unless the caller has SeRestorePrivilege, the owner can only be set to one of the SIDs in the caller's groups that are flagged as SE_GROUP_OWNER, which is usually just the user's SID or, for an admin, the Administrators SID. Also, for copying the system ACL, adding or removing audit and scoped-policy-identifier entries requires SeSecurityPrivilege.

CopyFileExW copies all data streams in a file, which is typically just the anonymous data stream, but an NTFS/ReFS file can have multiple named data streams. For metadata, it copies the change and modify timestamps (but not the create and access timestamps), file attributes (readonly, hidden, system, archive, temporary, not-content-indexed), extended attributes, and resource attributes [4].

Separating this functionality into shutil.copy and shutil.copystat would be fairly involved. These functions could be left as is, with the discrepancy just documented in shutil.copy2, or new functions could be implemented in the nt or _winapi module to list the data streams in a file and get/set file attributes and system resource attributes. Supporting extended attributes would require the native NT API, and for little benefit since they're mostly used for "$Kernel." prefixed attributes that can only be set by kernel-mode callers such as drivers.

---

[4]: Resource attributes are like extended attributes, but a named resource attribute is a tuple of one or more items with a given data type (integer, string, or bytes) that's stored as an entry in the file's system ACL. Keeping them in the SACL allows conditional access/audit entries to reference them in an access check or access audit. Unlike audit entries in the SACL, reading and writing resource attributes doesn't require SeSecurityPrivilege.

giampaolo commented 4 years ago

Since the need to copy file ownership is common, I think there could be space for a new copy3() function which copies ownership + extended attributes (where possible). In detail:

giampaolo commented 4 years ago

Sorry, after re-reading Eryk's comment, it seems I'm not correct about CopyFileEx.

eryksun commented 4 years ago

> Since the need to copy file ownership is common, I think there could be space for a new copy3() function which copies ownership + extended attributes (where possible).

FYI, Windows and POSIX have significantly different concepts about file (object) ownership. In Windows:

* Any type of SID can be set as the owner, such as a user, global
  group, local group, well-known group, domain, or logon session. All
  of these SID types, except for user SIDs, are commonly set in the
  groups of a token. Also, the token user is not limited to just users.
  It's commonly set to a well-known group such as SYSTEM, LOCAL
  SERVICE, or NETWORK SERVICE.

* The effective access token of a thread is granted owner rights to
  an object if the token user or any of the token's enabled groups is
  the owner of the object. For example, if an object is owned by the
  "BUILTIN\Users" local group, then all access tokens for standard-user
  logons will be granted owner rights as long as they have the
  "BUILTIN\Users" group enabled, which it is by default.

* If not set explicitly via "OWNER RIGHTS" (i.e. S-1-3-4), the 
  owner is implicitly granted the READ_CONTROL right to query the
  object security and the WRITE_DAC right to modify the object's
  resource attributes and discretionary access-control list. As 
  long as these rights are granted implicitly, they cannot be
  denied by deny access-control entries. However, implicit owner
  rights may be denied if an object has an implicit (by object 
  type) or explicit (by label) no-read-up or no-write-up mandatory
  policy, and the token's integrity level is less than that of the
  object.

* An explicit "OWNER RIGHTS" entry can be set in the discretionary
  access control list in order to override the implicit owner rights.
  This is not the same as setting owner rights in POSIX, since other
  ACL entries may grant or deny rights. Given the canonical priority 
  of deny access-control entries and also mandatory access control
  based on the integrity level of the object vs the token, granting
  explicit access to "OWNER RIGHTS" does not necessarily ensure the
  owner will even be granted at least the desired access in all
  contexts. Also, unlike the implicit case, if an "OWNER RIGHTS"
  entry grants READ_CONTROL and/or WRITE_DAC access, either right 
  may be denied by deny access-control entries.

7c3fd134-0d08-4ff3-a6eb-d73d3271fb59 commented 3 years ago

Hi folks, I'd be interested in contributing to this issue, since it seems "easy" enough, but I'm unsure that consensus was reached about the right solution.

From what I gathered from Giampaolo's comments, we have solutions that should work for Unix-y systems and macOS. So that's clear.

For Windows though, I'm unsure. It appears we'd need kernel-mode-enabled clients to do this, and maybe that's enough to discourage this effort altogether.

So it appears we have 3 options then:

  1. Implement a new function that copies ownership on Linux and macOS but not on Windows, and document that difference.
  2. Implement a new function that copies ownership on Linux and macOS, then use the native NT API to achieve the same result on Windows. Document the limitations.
  3. Just document the existing limitations.

Do we have a preference? I do not, just excited to potentially contribute something :)

Thanks!
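For discussion, option 1 might look roughly like the sketch below. The name `copy3` is borrowed from the earlier comments, but the function is purely hypothetical: nothing like it exists in shutil. It is also an assumption here that a failed chown should be silently ignored.

```python
import os
import shutil


def copy3(src, dst):
    """Hypothetical copy3(): like copy2(), plus best-effort ownership."""
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    shutil.copyfile(src, dst)
    if hasattr(os, "chown"):  # os.chown is not available on Windows
        st = os.stat(src)
        try:
            os.chown(dst, st.st_uid, st.st_gid)
        except PermissionError:
            pass  # unprivileged caller: keep the default owner
    # Copy mode, timestamps, flags and xattrs last, because chown may
    # clear the set-user-ID and set-group-ID bits on some systems.
    shutil.copystat(src, dst)
    return dst
```

On Windows the chown step is skipped entirely, which is exactly the documented difference that option 1 calls for.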

eryksun commented 3 years ago

> For Windows though, I'm unsure.

If copystat() gains the ability to copy the file's owner and group in POSIX, it is not a priority to mirror this capability in Windows, which doesn't implement anything like the Unix owner-group-other permission model. The only security metadata that I expect to be copied in Windows is the file's resource attributes, which are named attributes set on an object -- and considered inherent to the object -- for use in conditional access-control entries. Normally, the rest of the security descriptor is not copied, including the owner, group, mandatory label (integrity level and read-up/write-up/execute-up access), discretionary access-control list, and system access-control list (audit entries). That level of deep copying is a backup and restore operation of the complete file context, not a normal file copy.

Regarding copy2(), what I've seen proposed a couple of times for Windows is to implement it via CopyFileExW(). The documentation of copy2() would have to be special cased to explain that the platform copy routine is used in Windows, which copies all named (alternate) data streams and settable file attributes, extended attributes, and resource attributes. It would also need to be emphasized that copy2() is not equivalent to copyfile() + copystat() in Windows.

zooba commented 3 years ago

Just wanted to add that the sooner we can offer a wrapper around CopyFileEx on Windows (with no callbacks), the sooner we can take advantage of some significant optimisations that are being done to this function (which I can't share details of right now, but concrete work is being done).

Personally, I'm fine with it being copy2() and we take the slight behaviour change. But it should definitely be easy for users to access and use.
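As a rough illustration of what a callback-free wrapper could look like, here is a ctypes sketch. `platform_copy` is a hypothetical name, the fallback to `shutil.copy2()` on other platforms is an assumption made so the example runs everywhere, and a real implementation would presumably live in C inside the nt or _winapi module instead.

```python
import os
import shutil
import sys


def platform_copy(src, dst):
    """Hypothetical wrapper: CopyFileExW on Windows, copy2() elsewhere."""
    if sys.platform != "win32":
        return shutil.copy2(src, dst)

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CopyFileExW.argtypes = (
        wintypes.LPCWSTR,               # lpExistingFileName
        wintypes.LPCWSTR,               # lpNewFileName
        ctypes.c_void_p,                # lpProgressRoutine (unused)
        ctypes.c_void_p,                # lpData (unused)
        ctypes.POINTER(wintypes.BOOL),  # pbCancel (unused)
        wintypes.DWORD,                 # dwCopyFlags
    )
    kernel32.CopyFileExW.restype = wintypes.BOOL

    # No progress callback, no cancel flag, default copy flags.
    ok = kernel32.CopyFileExW(os.fspath(src), os.fspath(dst), None, None, None, 0)
    if not ok:
        raise ctypes.WinError(ctypes.get_last_error())
    return dst
```

As noted earlier in the thread, the Windows branch copies named data streams and attributes in one call, so it is not equivalent to copyfile() + copystat().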