python / cpython

The Python programming language
https://www.python.org

filecmp.cmpfiles w/ absolute path names #90670

Open a66de534-6da8-4d25-b66c-95dc23ca3d9d opened 2 years ago

a66de534-6da8-4d25-b66c-95dc23ca3d9d commented 2 years ago
BPO 46512
Nosy @terryjreedy, @bersbersbers

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = None
closed_at = None
created_at =
labels = ['type-bug', 'library', '3.9']
title = 'filecmp.cmpfiles w/ absolute path names'
updated_at =
user = 'https://github.com/bersbersbers'
```

bugs.python.org fields:
```python
activity =
actor = 'bers'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation =
creator = 'bers'
dependencies = []
files = []
hgrepos = []
issue_num = 46512
keywords = []
message_count = 3.0
messages = ['411570', '412043', '412058']
nosy_count = 2.0
nosy_names = ['terry.reedy', 'bers']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue46512'
versions = ['Python 3.9']
```

a66de534-6da8-4d25-b66c-95dc23ca3d9d commented 2 years ago

It is very easy to use filecmp.cmpfiles incorrectly by passing absolute path names. This is because

  1. the documentation does not say that relative path names have to be passed, and
  2. filecmp.cmpfiles does not issue a warning when absolute path names are passed.
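The mechanism behind this can be sketched as follows. The sketch assumes that `cmpfiles` builds each path to compare by joining the directory with the given name (an assumption about the implementation that this report does not spell out). Path joining discards the directory when the name is absolute, so both sides end up pointing at the same file. `posixpath` is used only to make the result platform-independent, and `joined_pair` is a hypothetical helper for illustration.

```python
import posixpath

dir_a = "/tmp/a"
dir_b = "/tmp/b"


def joined_pair(name):
    """Return the two paths that would be compared for a given name."""
    return posixpath.join(dir_a, name), posixpath.join(dir_b, name)


# An absolute name replaces the directory on both sides,
# so the "comparison" is between a file and itself.
absolute_pair = joined_pair("/tmp/a/foo.txt")

# A relative name is resolved under each directory, as intended.
relative_pair = joined_pair("foo.txt")

print(absolute_pair)
print(relative_pair)
```

With an absolute name, both elements of the pair are identical, which would explain why the file is reported as equal to itself.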

Consider this example code, which does look sensible at first glance:

    files = dir_a.glob("*")
    (equal, _, _) = filecmp.cmpfiles(dir_a, dir_b, files, shallow=False)
    print("equal:", *equal)

However, in the full example below, you will see that this code fails to detect that two files are actually different.

"""Demo behavior of filecmp.cmpfiles with absolute path names."""

import filecmp
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmpdirname:
    # prepare two different files
    tmpdir = Path(tmpdirname)
    dir_a = tmpdir / "a"
    dir_b = tmpdir / "b"
    file_a = dir_a / "foo.txt"
    file_b = dir_b / "foo.txt"

    dir_a.mkdir()
    dir_b.mkdir()
    file_a.write_text("A")
    file_b.write_text("B")

    # actually diff the files
    files = dir_a.glob("*")
    # filecmp should issue a warning here!
    (equal, _, _) = filecmp.cmpfiles(dir_a, dir_b, files, shallow=False)
    # otherwise, this result is easy to misinterpret - files are reported as equal
    print("equal:", *equal)

terryjreedy commented 2 years ago

https://docs.python.org/3/library/filecmp.html#filecmp.cmpfiles

I consider not working for absolute path names to be a bug. Did your example work with relative paths?

a66de534-6da8-4d25-b66c-95dc23ca3d9d commented 2 years ago

> Did your example work with relative paths?

Yes, it does. Just append the following to my example code:

    # actually diff the files - correctly!
    files = [f.relative_to(dir_a) for f in dir_a.glob("*")]
    (_, different, _) = filecmp.cmpfiles(dir_a, dir_b, files, shallow=False)
    print("different:", *different)

Output then is

    equal: C:\Users\bers\AppData\Local\Temp\tmp1p6jh4rg\a\foo.txt
    different: foo.txt
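Until the documentation or the behavior changes, a caller-side guard is one possible workaround. The following is a minimal sketch, not part of `filecmp`: the wrapper name `cmpfiles_relative` and the `demo` function are hypothetical. It converts absolute names to paths relative to the first directory before calling `filecmp.cmpfiles`, and lets `Path.relative_to` raise `ValueError` for names outside that directory.

```python
import filecmp
import tempfile
from pathlib import Path


def cmpfiles_relative(dir_a, dir_b, names, shallow=True):
    """Hypothetical wrapper: make every name relative to dir_a, then compare."""
    dir_a = Path(dir_a)
    relative = []
    for name in names:
        name = Path(name)
        if name.is_absolute():
            # Raises ValueError if name does not lie inside dir_a.
            name = name.relative_to(dir_a)
        relative.append(str(name))
    return filecmp.cmpfiles(dir_a, dir_b, relative, shallow=shallow)


def demo():
    """Rebuild the two-file setup from the report and compare via the wrapper."""
    with tempfile.TemporaryDirectory() as tmpdirname:
        tmpdir = Path(tmpdirname)
        dir_a, dir_b = tmpdir / "a", tmpdir / "b"
        dir_a.mkdir()
        dir_b.mkdir()
        (dir_a / "foo.txt").write_text("A")
        (dir_b / "foo.txt").write_text("B")
        # glob yields absolute paths here; the wrapper relativizes them.
        return cmpfiles_relative(dir_a, dir_b, dir_a.glob("*"), shallow=False)


match, mismatch, errors = demo()
print("equal:", *match)
print("different:", *mismatch)
```

In this sketch the absolute paths produced by `glob` are reduced to `foo.txt` before the comparison, so the file is reported under `different` rather than `equal`.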