python / cpython

The Python programming language
https://www.python.org

`shutil.copytree` fails due to `os.setxattr` Permission denied on lustre filesystem #121524

Open Crivella opened 2 months ago

Crivella commented 2 months ago

Bug report

Bug description:

When running `shutil.copytree` on a Lustre filesystem with a source directory containing files with 440 permissions, `os.setxattr` fails with "Permission denied". This seems to be related to #68726. The files are copied with the original read-only permissions, and the subsequent copy of the extended attributes then fails.

CPython versions tested on:

3.11

Operating systems tested on:

Linux

Zheaoli commented 2 months ago

Would you mind giving us a minimal code example to reproduce the issue?

Crivella commented 2 months ago

I do not know whether this applies to Lustre in general or only to how it is configured at the HPC site I am running on.

As for the example:

$ cd DIR_ON_LUSTRE_FS
$ mkdir src
$ touch src/test
$ chmod 440 src/test
$ python
>>> import shutil
>>> shutil.copytree('src', 'dst')

will result in

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/cray/pe/python/3.11.5/lib/python3.11/shutil.py", line 561, in copytree
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/cray/pe/python/3.11.5/lib/python3.11/shutil.py", line 515, in _copytree
    raise Error(errors)
shutil.Error: [('src/test', 'dst/test', "[Errno 13] Permission denied: 'dst/test'")]

with the files still being created:

$ ls dst/
test

To be more specific, the failure in my case happens at https://github.com/python/cpython/blob/1b0e63c81b54a937b089fe335761cba4a96c8cdf/Lib/shutil.py#L384, which I was able to work around by adding a try/except there, since I did not care about extended attributes for my use case. I assume this would not be a general solution, though.
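For anyone who wants the same effect without patching `shutil.py`, here is a minimal sketch that moves the try/except into a custom `copy_function` passed to `shutil.copytree`. The name `copy_ignoring_metadata_errors` is hypothetical, and the fallback to `shutil.copymode` is an assumption about what is acceptable to lose (extended attributes and timestamps). It has not been tested on Lustre.

```python
import shutil


def copy_ignoring_metadata_errors(src, dst):
    """Copy file contents, then metadata, tolerating PermissionError.

    Hypothetical helper: if the full metadata copy (which includes
    extended attributes) is refused, fall back to the permission bits only.
    """
    shutil.copyfile(src, dst)
    try:
        shutil.copystat(src, dst)
    except PermissionError:
        # Assumption: losing xattrs/timestamps is acceptable for this use case.
        shutil.copymode(src, dst)
    return dst


# Usage (paths are placeholders):
# shutil.copytree('src', 'dst', copy_function=copy_ignoring_metadata_errors)
```

Note that `copytree` also calls `shutil.copystat` on each directory it creates, so a failure on a directory would still surface in the raised `shutil.Error`.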

ZeroIntensity commented 1 month ago

Unable to reproduce on 3.12 with an ext4 filesystem.

Crivella commented 1 month ago

@ZeroIntensity I agree, I also can't reproduce it with 3.10, 3.11, or 3.12 on ext4.

As noted in the related issue #68726, I think this is tied to a specific filesystem, Lustre in this case, but it might also be a quirk of other parallel filesystems used in HPC.

The fix for the original issue was to make sure the extended attributes are copied before the permissions on the file are changed. The problem here seems to be that the copied file already has the source's read-only permissions, which causes writing the extended attributes on the destination to fail. I was able to get around this by adding an `os.chmod(dst, 0o660)` before the line mentioned in my previous comment, but I then ran into other problems and did not dig any deeper.
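The chmod idea can also be sketched from outside the standard library, again as a custom `copy_function`. The helper name `copy_with_writable_dst` is hypothetical, and this is only an illustration of the approach described above: it has not been tested on Lustre and may well hit the same follow-up problems.

```python
import os
import shutil
import stat


def copy_with_writable_dst(src, dst):
    """Copy contents, force the destination owner-writable, then copy metadata.

    Hypothetical helper: shutil.copystat applies the source's permission
    bits at the end, so the temporary write bit does not persist.
    """
    shutil.copyfile(src, dst)
    mode = stat.S_IMODE(os.stat(dst).st_mode)
    os.chmod(dst, mode | stat.S_IWUSR)  # make sure the owner can write
    shutil.copystat(src, dst)           # timestamps, xattrs, then mode
    return dst


# Usage (paths are placeholders):
# shutil.copytree('src', 'dst', copy_function=copy_with_writable_dst)
```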

If I had to generalize, I would expect this problem on any filesystem where changing extended attributes requires the file to be writable and where copying a file preserves the original permissions without an explicit chmod afterwards.
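One way to check this hypothesis on a given filesystem is to look at which extended attributes the source files actually carry, since the xattr copy only writes what `os.listxattr` reports for the source. The sketch below is a diagnostic only; `probe_xattrs` is a hypothetical name, and which attribute names show up on Lustre is not something this thread establishes.

```python
import os


def probe_xattrs(path):
    """Return {name: value or exception} for the extended attributes on path.

    Hypothetical diagnostic helper. Returns an empty dict where the
    platform or filesystem does not support listing extended attributes.
    """
    if not hasattr(os, "listxattr"):  # os.listxattr is only available on Linux
        return {}
    try:
        names = os.listxattr(path)
    except OSError:
        return {}
    result = {}
    for name in names:
        try:
            result[name] = os.getxattr(path, name)
        except OSError as exc:
            # Keep the error so unreadable attributes are still visible.
            result[name] = exc
    return result


# Usage (path is a placeholder):
# print(probe_xattrs('src/test'))
```

Comparing the output for the same file on ext4 and on the affected filesystem would show whether the difference lies in the attributes present on the source.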