swansonk14 / typed-argument-parser

Typed argument parser for Python
MIT License
507 stars, 40 forks

Reproducibility info fails for git repo with no remote #99

Open ndryden opened 1 year ago

ndryden commented 1 year ago

If I have a git repository that does not have a remote, get_reproducibility_info will fail with a CalledProcessError exception. It appears that if a git repo is present, Tap assumes it has an origin remote, yet this need not be the case. I often use local git repos for initial development before pushing them somewhere.

Reproducer:

$ mkdir test
$ cd test
# Write test.py appropriately...
$ cat test.py
from tap import Tap

class ArgumentParser(Tap):
    foo: bool = False

if __name__ == '__main__':
    args = ArgumentParser().parse_args()
    print(args.get_reproducibility_info())

# Run with no git repo and it works fine:
$ python test.py
{'command_line': 'python test.py', 'time': 'Tue Feb 28 13:03:00 2023'}

# Set up a git repo to break it:
$ git init
$ python test.py
Traceback (most recent call last):
  File "/home/ndryden/envs/test/lib/python3.9/site-packages/tap/utils.py", line 82, in get_git_url
    url = check_output(['git', 'remote', 'get-url', 'origin'], cwd=self.repo_path)
  File "/home/ndryden/envs/test/lib/python3.9/site-packages/tap/utils.py", line 45, in check_output
    output = subprocess.check_output(command, stderr=devnull, **kwargs).decode('utf-8').strip()
  File "/home/ndryden/envs/test/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/ndryden/envs/test/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'remote', 'get-url', 'origin']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ndryden/test/test.py", line 8, in <module>
    print(args.get_reproducibility_info())
  File "/home/ndryden/envs/test/lib/python3.9/site-packages/tap/tap.py", line 392, in get_reproducibility_info
    reproducibility['git_url'] = git_info.get_git_url(commit_hash=True)
  File "/home/ndryden/envs/test/lib/python3.9/site-packages/tap/utils.py", line 85, in get_git_url
    url = check_output(['git', 'config', '--get', 'remote.origin.url'], cwd=self.repo_path)
  File "/home/ndryden/envs/test/lib/python3.9/site-packages/tap/utils.py", line 45, in check_output
    output = subprocess.check_output(command, stderr=devnull, **kwargs).decode('utf-8').strip()
  File "/home/ndryden/envs/test/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/home/ndryden/envs/test/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'config', '--get', 'remote.origin.url']' returned non-zero exit status 1.

This is because the remote origin does not exist, and git remote get-url origin returns status 2 in this case:

$ git remote get-url origin
error: No such remote 'origin'

(Likewise, git config --get remote.origin.url returns status 1.)

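You might be able to solve this by checking whether an origin remote exists (for example, with git remote -v) before running git remote get-url origin, and by leaving git_url out of the reproducibility info when it does not.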
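A rough sketch of what such a check could look like, in case it helps. This is not Tap's actual code: get_remote_url is a hypothetical helper name, and returning None when the remote is missing is my assumption about what the caller would want. It lists the configured remotes with git remote and only asks for the URL if the requested one is present.

```python
import subprocess
from typing import Optional


def get_remote_url(repo_path: str, remote: str = "origin") -> Optional[str]:
    """Return the URL of `remote`, or None if it cannot be determined.

    Hypothetical helper for illustration; not part of Tap.
    """
    try:
        # `git remote` prints one configured remote name per line.
        remotes = subprocess.check_output(
            ["git", "remote"], cwd=repo_path, stderr=subprocess.DEVNULL
        ).decode("utf-8").split()

        # A local-only repo has no remotes at all, so bail out early
        # instead of letting `git remote get-url` exit with status 2.
        if remote not in remotes:
            return None

        return subprocess.check_output(
            ["git", "remote", "get-url", remote],
            cwd=repo_path,
            stderr=subprocess.DEVNULL,
        ).decode("utf-8").strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Not a git repo, git failed for another reason, or git is not installed.
        return None
```

With something like this, get_reproducibility_info could skip the git_url entry (or set it to an empty value) whenever the helper returns None, rather than raising.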

Incidentally, adding a fake origin remote reveals another bug: git rev-parse HEAD fails when the repo has no commits, leading to a similar exception. I'm not sure whether this deserves a separate issue (it seems like even more of an edge case).
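The no-commits case could probably be handled the same way. A minimal sketch, again with a hypothetical helper name (get_commit_hash) and the assumption that None is an acceptable result for an empty repo; it uses git rev-parse --verify so that an unresolvable revision produces a non-zero exit status that can be caught:

```python
import subprocess
from typing import Optional


def get_commit_hash(repo_path: str, rev: str = "HEAD") -> Optional[str]:
    """Return the full hash that `rev` resolves to, or None if it does not resolve.

    Hypothetical helper for illustration; not part of Tap.
    """
    try:
        # --verify makes git exit non-zero when `rev` is not a valid object,
        # which is what happens for HEAD in a freshly initialized repo.
        return subprocess.check_output(
            ["git", "rev-parse", "--verify", rev],
            cwd=repo_path,
            stderr=subprocess.DEVNULL,
        ).decode("utf-8").strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        # No commits yet, not a git repo, or git is not installed.
        return None
```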