python / cpython

The Python programming language
https://www.python.org

tarfile: Extracting symlink using extractfile results in Keyerror: linkname not found #122736

Open giomf opened 3 months ago

giomf commented 3 months ago

Bug report

Bug description:

I have the following tar archive containing some files and a symlink to a file that is not part of the tar archive.

drwxr-xr-x guif/users        0 2024-08-06 15:51 test/
-rw-r--r-- guif/users        0 2024-08-06 15:50 test/file3
lrwxrwxrwx guif/users        0 2024-08-06 15:51 test/mylink -> /usr/bin/env
-rw-r--r-- guif/users        0 2024-08-06 15:50 test/file1
-rw-r--r-- guif/users        0 2024-08-06 15:50 test/file2

I want to extract the symlink. getmember returns a valid member of the tar archive, but passing that member to extractfile raises an error.

#!/usr/bin/env python

import tarfile

def main():
    with tarfile.open('./test.tar') as tar_file:
        member = tar_file.getmember('test/mylink')
        file_content = tar_file.extractfile(member)

if __name__ == '__main__':
    main()

Traceback (most recent call last):
  File "/home/guif/repos/private/tmp/./main.py", line 13, in <module>
    main()
  File "/home/guif/repos/private/tmp/./main.py", line 9, in main
    file_content = tar_file.extractfile(member)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l014xp1qxdl6gim3zc0jv3mpxhbp346s-python3-3.12.4/lib/python3.12/tarfile.py", line 2385, in extractfile
    return self.extractfile(self._find_link_target(tarinfo))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nix/store/l014xp1qxdl6gim3zc0jv3mpxhbp346s-python3-3.12.4/lib/python3.12/tarfile.py", line 2725, in _find_link_target
    raise KeyError("linkname %r not found" % linkname)
KeyError: "linkname 'test//usr/bin/env' not found"

As you can see, 'test//usr/bin/env' cannot be found. The path does not look like the symlink I am trying to extract, but more like <directory of the symlink>/<link target>.
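
The path in the error is consistent with the lookup name being formed by joining the symlink's parent directory with the link target. The sketch below reproduces that string. `guessed_lookup_name` is a hypothetical helper written for illustration, not part of tarfile, and the join rule is an assumption inferred from the error message.

```python
import os


def guessed_lookup_name(member_name, linkname):
    """Hypothetical: join the member's parent directory with its link target.

    Assumption: this mirrors how the failing lookup name is built; it is
    inferred from the KeyError text, not taken from the tarfile API.
    """
    parts = (os.path.dirname(member_name), linkname)
    # Drop empty components (e.g. a member at the archive root), then join.
    return "/".join(part for part in parts if part)


# Same member name and link target as in the archive listing above.
print(guessed_lookup_name("test/mylink", "/usr/bin/env"))
```

With an absolute link target, the leading slash of the target ends up directly after the directory name, which gives the double slash seen in the traceback.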

From the doc:

Extract a member from the archive as a file object. `member' may be
a filename or a TarInfo object. If `member' is a regular file or
a link, an io.BufferedReader object is returned. For all other
existing members, None is returned. If `member' does not appear
in the archive, KeyError is raised.
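
Until the behaviour is changed or the documentation clarified, one possible workaround is to read the link target from the TarInfo object and treat the KeyError as "target not stored in the archive". This is only a sketch: `build_sample_archive` and `describe_member` are hypothetical helpers, and the archive is built in memory so that the example does not depend on a test.tar file or on creating real symlinks.

```python
import io
import tarfile


def build_sample_archive():
    """Return an in-memory tar holding one symlink whose target is not archived."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("test/mylink")
        link.type = tarfile.SYMTYPE
        link.linkname = "/usr/bin/env"
        tar.addfile(link)
    buf.seek(0)
    return buf


def describe_member(tar, name):
    """Hypothetical helper: return (linkname, file object or None) for a member."""
    member = tar.getmember(name)
    if member.issym() or member.islnk():
        try:
            return member.linkname, tar.extractfile(member)
        except KeyError:
            # The link target is not a member of the archive, so there is
            # no content to read; only the link name is available.
            return member.linkname, None
    return None, tar.extractfile(member)


with tarfile.open(fileobj=build_sample_archive()) as tar:
    linkname, fileobj = describe_member(tar, "test/mylink")
    print(linkname, fileobj)
```

Catching KeyError here also hides the case where the name itself is missing from the archive during the link lookup, so a caller that needs to tell these apart would have to check the member type first, as the helper does.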

CPython versions tested on:

3.10, 3.12

Operating systems tested on:

Linux

ZeroIntensity commented 3 months ago

Confirmed on the main branch. Here's an automated reproducer:

import os
import shutil
import tarfile

def main():
    if os.path.exists("./test"):
        # Remove from previous calls
        shutil.rmtree("./test")

    os.mkdir("./test")
    os.symlink("/usr/bin/env", "./test/mylink")

    with tarfile.open("./test.tar", "w") as tar:
        tar.add("./test")

    with tarfile.open('./test.tar') as tar_file:
        member = tar_file.getmember('./test/mylink')
        file_content = tar_file.extractfile(member)

if __name__ == '__main__':
    main()

theCalcaholic commented 1 month ago

The same applies to hard links, FYI.
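
A minimal sketch for checking both link types without touching the filesystem. The archive is built in memory, `extractfile_error_for_link` is a hypothetical helper name, and "missing/target" is a made-up link target that is deliberately not stored in the archive.

```python
import io
import tarfile


def extractfile_error_for_link(link_type):
    """Return the KeyError message raised by extractfile, or None if none is raised."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        link = tarfile.TarInfo("test/mylink")
        link.type = link_type  # tarfile.SYMTYPE or tarfile.LNKTYPE
        link.linkname = "missing/target"  # not a member of the archive
        tar.addfile(link)
    buf.seek(0)

    with tarfile.open(fileobj=buf) as tar:
        try:
            tar.extractfile("test/mylink")
        except KeyError as exc:
            return str(exc)
    return None


for label, link_type in (("symlink", tarfile.SYMTYPE), ("hard link", tarfile.LNKTYPE)):
    print(label, "->", extractfile_error_for_link(link_type))
```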