spotify / luigi

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Apache License 2.0

Arbitrary file write during tarfile extraction in `luigi/contrib/lsf_runner.py` #3302

Closed: Ali-Razmjoo closed this issue 1 week ago

Ali-Razmjoo commented 3 weeks ago

Hi,

I am reporting a potential security issue with arbitrary file write during tarfile extraction in https://github.com/spotify/luigi/blob/master/luigi/contrib/lsf_runner.py#L55-L58

Extracting files from a malicious tar archive without validating that the destination file path is within the destination directory can cause files outside the destination directory to be overwritten, due to the possible presence of directory traversal elements (..) in archive paths.
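To make the traversal concrete, here is a small sketch of how an archive member name containing `..` resolves outside the extraction directory. The destination `/tmp/unpack` and the member name are hypothetical values chosen for illustration; they are not taken from `lsf_runner.py`.

```python
import posixpath

# Hypothetical extraction directory and attacker-chosen member name.
dest = "/tmp/unpack"
member_name = "../../etc/cron.d/job"

# Joining and normalising shows where a naive extraction would write.
resolved = posixpath.normpath(posixpath.join(dest, member_name))
escapes = not resolved.startswith(dest + "/")

print(resolved)  # /etc/cron.d/job
print(escapes)   # True
```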

Recommendation

import os
import sys
import tarfile

with tarfile.open(sys.argv[1]) as tar:
    for entry in tar:
        # GOOD: Check that entry is safe (no absolute path, no ".." path component)
        if os.path.isabs(entry.name) or ".." in entry.name.split("/"):
            raise ValueError("Illegal tar archive entry")
        tar.extract(entry, "/tmp/unpack/")
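A stricter variant of the same idea is sketched below. It resolves each entry's final path and checks that it stays under the destination, and it also rejects symlink and hard-link entries, which a name check alone does not cover. The helper name `safe_extract` is hypothetical and is not part of Luigi or the `tarfile` module; this is one possible approach, not necessarily the fix that was applied in the repository.

```python
import os
import tarfile


def safe_extract(tar, dest):
    """Extract every member of `tar` into `dest`, refusing unsafe entries.

    Hypothetical helper for illustration only.
    """
    dest_root = os.path.realpath(dest)
    for entry in tar:
        # Resolve where the entry would land and require it to stay under dest.
        target = os.path.realpath(os.path.join(dest_root, entry.name))
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError("Illegal tar archive entry: %r" % entry.name)
        # Links can redirect later writes outside dest, so this sketch rejects them.
        if entry.issym() or entry.islnk():
            raise ValueError("Link entries are not allowed: %r" % entry.name)
        tar.extract(entry, dest_root)
```

It could be called as `with tarfile.open(path) as tar: safe_extract(tar, "/tmp/unpack/")`. On Python versions that include the PEP 706 extraction filters, passing `filter="data"` to `extract` or `extractall` applies comparable checks inside the standard library.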

References