iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0

exp run: pygit2 in script does not find active branch and last commit #10411

Closed JenniferHem closed 5 months ago

JenniferHem commented 5 months ago

Bug Report

Description

I have a custom script that compares a lot of differences between HEAD and the last commit. To do this, I use pygit2 to get the repo, the active branch, and then the last commit. This works as expected with dvc repro, but fails with dvc exp run.

Reproduce

1) create a new git repo and run dvc init

2) create a Python script as test.py:

from pygit2 import Repository
repo = Repository('.')
print(repo)
active_branch = repo.head.shorthand
print(active_branch)
last_commit = str(repo.branches.get(active_branch).target)
print(last_commit)

3) commit the script

4) create dvc.yaml:

stages:
  runTest:
    cmd: python test.py

5) run dvc repro --> there is no error

Running stage 'runTest':                                              
> python test.py
pygit2.Repository('/home/user/Code/playgrounds/dvcbug/.git/')
master
4100f0854a617a61c77e7626aa48f80f88fdbba1
Use `dvc push` to send your updates to remote storage. 

6) run dvc exp run --> the script fails with AttributeError: 'NoneType' object has no attribute 'target'

Reproducing experiment 'magic-tape'                                   
Building workspace index                                                                                                                                                                           |0.00 [00:00,    ?entry/s]
Comparing indexes                                                                                                                                                                                 |1.00 [00:00, 1.42kentry/s]
Applying changes                                                                                                                                                                                   |0.00 [00:00,     ?file/s]
Running stage 'runTest':                                              
> python test.py
pygit2.Repository('/home/user/Code/playgrounds/dvcbug/.git/')
HEAD
Traceback (most recent call last):
  File "/home/user/Code/playgrounds/dvcbug/test.py", line 7, in <module>
    last_commit = str(repo.branches.get(active_branch).target)
AttributeError: 'NoneType' object has no attribute 'target'
ERROR: failed to reproduce 'runTest': failed to run: python test.py, exited with 1

Expected

I would expect that with dvc exp run I am also able to get the full git information for the current repository (i.e. branch, commits, etc.).

Environment information

WSL2 with Ubuntu 22.04 LTS, git version 2.34.1, VSCode (but it happens in both a plain terminal and the VSCode terminal)

Output of dvc doctor:

$ dvc doctor
DVC version: 3.50.1 (pip)
-------------------------
Platform: Python 3.10.13 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
        dvc_data = 3.15.1
        dvc_objects = 5.0.0
        dvc_render = 1.0.1
        dvc_task = 0.3.0
        scmrepo = 3.3.2
Supports:
        http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        webdav (webdav4 = 0.9.8),
        webdavs (webdav4 = 0.9.8)
Config:
        Global: /home/hemmerj3/.config/dvc
        System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: webdavs
Workspace directory: ext4 on /dev/sdb
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/f2844fa86f977ea4a68a542d14da2052

skshetry commented 5 months ago

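That's expected: dvc exp run runs with a detached HEAD, so there is no active branch. In that state repo.head.shorthand is just HEAD, and no branch with that name exists, which is why repo.branches.get(active_branch) returns None.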

You can try getting last commit using:

import pygit2

repo = pygit2.Repository(".")
print(repo.head.raw_target)
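
For a script that has to work under both dvc repro and dvc exp run, one option is to check for a detached HEAD before looking up a branch. The following is only a sketch: `describe_head` and `main` are hypothetical helper names, and it assumes pygit2's `Repository.head_is_detached` property and `repo.head.target` behave as in recent pygit2 releases.

```python
def describe_head(repo):
    """Return (branch_name_or_None, commit_sha) for a pygit2-style repository.

    Hypothetical helper: it only reads `repo.head_is_detached` and `repo.head`,
    so it never looks up a branch by name.
    """
    # With a detached HEAD (as under `dvc exp run`), head.shorthand is "HEAD"
    # and there is no branch of that name, so report no branch at all.
    branch = None if repo.head_is_detached else repo.head.shorthand
    # repo.head is a direct reference in both cases; its target is the commit id.
    commit_sha = str(repo.head.target)
    return branch, commit_sha


def main():
    # Imported here so the helper above can be exercised without pygit2.
    import pygit2

    branch, commit_sha = describe_head(pygit2.Repository("."))
    print(branch if branch is not None else "(detached HEAD)")
    print(commit_sha)
```

Calling `main()` at the end of test.py should then print the commit in either mode, with the branch name shown only when one is checked out.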

JenniferHem commented 5 months ago

Thanks, and sorry about that. Is there any detailed information where I can read up on this to understand DVC better?