iterative / dvc

🦉 Data Versioning and ML Experiments
https://dvc.org
Apache License 2.0

checkout: incorrectly states files are modified when they are executable #10588

Open jldorscheidt opened 1 month ago

jldorscheidt commented 1 month ago

Bug Report

checkout: incorrectly states files are modified when they are executable

Description

When an executable file is tracked and the project uses a shared cache with symlinks, dvc checkout reports on every run that changes were applied to the file, even when no data has changed.

Reproduce

contents of dvc.yaml:

  stages:
    add_some_text:
      wdir: ../
      desc: write some text
      cmd:
        - > 
          touch data/some_new_file.txt
      deps:
        - data/data.txt
      outs:
        - data/some_new_file.txt

contents of config:

[cache]
    dir = ../../cache
    type = symlink
    shared = group
['remote "temp"']
    url = ../../remote
[core]
    remote = temp

steps to reproduce:

  1. mkdir -p dvc_mwe/dvc_mwe/data dvc_mwe/cache dvc_mwe/remote
  2. touch dvc_mwe/dvc_mwe/data/data.txt && chmod +x dvc_mwe/dvc_mwe/data/data.txt
  3. cd dvc_mwe/dvc_mwe
  4. git init && dvc init
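  5. Copy the contents of dvc.yaml shown above to dvc_mwe/dvc_mwe/data/dvc.yaml (data/dvc.yaml relative to the current directory)
  6. Copy the contents of config shown above to dvc_mwe/dvc_mwe/.dvc/config (.dvc/config relative to the current directory)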
  7. dvc add data/data.txt
  8. dvc repro data/dvc.yaml
  9. dvc push
  10. dvc checkout
  11. dvc checkout

Every subsequent dvc checkout run keeps reporting some_new_file.txt as modified.
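
For convenience, the steps above can be collected into one script. This is only a sketch of the same sequence, not something taken from the DVC docs: MWE_ROOT is a made-up variable name, the layout is created under a temporary directory by default, and the dvc.yaml is written before git init / dvc init (rather than after) so that the layout can be inspected even when git or dvc is not installed, in which case the remaining steps are skipped.

```shell
#!/bin/sh
# Sketch of the reproduction steps. MWE_ROOT is a hypothetical variable name.
set -eu

MWE_ROOT="${MWE_ROOT:-$(mktemp -d)/dvc_mwe}"

# Steps 1-2: directory layout plus an empty, executable data file.
mkdir -p "$MWE_ROOT/dvc_mwe/data" "$MWE_ROOT/cache" "$MWE_ROOT/remote"
touch "$MWE_ROOT/dvc_mwe/data/data.txt"
chmod +x "$MWE_ROOT/dvc_mwe/data/data.txt"

# Step 3: work from the inner project directory.
cd "$MWE_ROOT/dvc_mwe"

# Step 5: the pipeline file from the report.
cat > data/dvc.yaml <<'EOF'
stages:
  add_some_text:
    wdir: ../
    desc: write some text
    cmd:
      - >
        touch data/some_new_file.txt
    deps:
      - data/data.txt
    outs:
      - data/some_new_file.txt
EOF

if command -v git >/dev/null 2>&1 && command -v dvc >/dev/null 2>&1; then
  # Step 4: initialise git and DVC.
  git init && dvc init

  # Step 6: the config from the report (shared symlink cache, local remote).
  cat > .dvc/config <<'EOF'
[cache]
    dir = ../../cache
    type = symlink
    shared = group
['remote "temp"']
    url = ../../remote
[core]
    remote = temp
EOF

  # Steps 7-11: track, reproduce, push, then check out twice.
  dvc add data/data.txt
  dvc repro data/dvc.yaml
  dvc push
  dvc checkout
  dvc checkout
else
  echo "git or dvc not found; only the directory layout was created" >&2
fi
```

Setting MWE_ROOT before running the script places the layout in a chosen location instead of a temporary directory.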

Expected

Repeatedly calling dvc checkout without any data changes should not report any files as modified.
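
The expectation could be checked mechanically along these lines. This is a rough sketch, assuming that dvc checkout lists a modified file as a line starting with "M" followed by whitespace and the path; reports_modified is a hypothetical helper name, not part of DVC.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the text on stdin contains a line that looks
# like a "modified" entry, assumed here to be "M", whitespace, then a path.
reports_modified() {
  grep -Eq '^M[[:space:]]+[^[:space:]]'
}

# Usage (requires dvc; run inside the repository after the steps above):
#   dvc checkout >/dev/null     # first run may legitimately relink files
#   if dvc checkout | reports_modified; then
#     echo "second checkout still reports modified files"
#   fi
```

If the output format differs in a given DVC version, the pattern in reports_modified would need adjusting.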

Environment information

python==3.10.12 dvc==3.55.2

Output of dvc doctor:

$ dvc doctor
DVC version: 3.55.2 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-6.9.3-76060903-generic-x86_64-with-glibc2.35
Subprojects:
    dvc_data = 3.16.6
    dvc_objects = 5.1.0
    dvc_render = 1.0.2
    dvc_task = 0.40.2
    scmrepo = 3.3.8
Supports:
    http (aiohttp = 3.10.10, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.10.10, aiohttp-retry = 2.8.3)
Config:
    Global: /home/joostdorscheidt/.config/dvc
    System: /etc/xdg/xdg-pop/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/mapper/data-root
Caches: local
Remotes: local
Workspace directory: ext4 on /dev/mapper/data-root
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/8324f5dfe498694db375810644b6aa5a

Additional Information (if any):