gitpython-developers / GitPython

GitPython is a python library used to interact with Git repositories.
http://gitpython.readthedocs.org
BSD 3-Clause "New" or "Revised" License

.submodules leads to AttributeError: 'Blob' object has no attribute '_name' #890

Open yarikoptic opened 5 years ago

yarikoptic commented 5 years ago
We have an obscure use case where there is a symlink to a submodule:

```shell
(git-annex)hopa:~/.tmp/datalad_temp_test_bf1886qhfzdeik[master]git
$> git submodule
 bf0dbc349765f2d6c4db485c13eaa78e69e903da sub (heads/master)
(dev) 1 12992 [1].....................................:Mon 08 Jul 2019 12:13:57 PM EDT:.
(git-annex)hopa:~/.tmp/datalad_temp_test_bf1886qhfzdeik[master]git
$> cat .gitmodules
[submodule "sub"]
        path = sub
        url = ./sub
        datalad-id = fbc8816b-a19a-11e9-8c4f-8019340ce7f2
[submodule "down"]
        path = down
        url = ./down
        datalad-id = fbc8816b-a19a-11e9-8c4f-8019340ce7f2
[submodule "subdir/up"]
        path = subdir/up
        url = ./subdir/up
        datalad-id = fbc8816b-a19a-11e9-8c4f-8019340ce7f2
[submodule "subdir/subsubdir/upup"]
        path = subdir/subsubdir/upup
        url = ./subdir/subsubdir/upup
        datalad-id = fbc8816b-a19a-11e9-8c4f-8019340ce7f2
(dev) 1 12993 [1].....................................:Mon 08 Jul 2019 12:14:03 PM EDT:.
(git-annex)hopa:~/.tmp/datalad_temp_test_bf1886qhfzdeik[master]git
$> cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[annex]
        uuid = 956b505c-b55a-4743-bafc-06c79a656027
        version = 5
        backends = MD5E
[annex "security"]
        allowed-url-schemes = http https file
        allowed-http-addresses = all
[submodule "sub"]
        url = /home/yoh/.tmp/datalad_temp_test_bf1886qhfzdeik/sub
        active = true
[submodule "down"]
        url = /home/yoh/.tmp/datalad_temp_test_bf1886qhfzdeik/down
        active = true
[submodule "subdir/up"]
        url = /home/yoh/.tmp/datalad_temp_test_bf1886qhfzdeik/subdir/up
        active = true
[submodule "subdir/subsubdir/upup"]
        url = /home/yoh/.tmp/datalad_temp_test_bf1886qhfzdeik/subdir/subsubdir/upup
        active = true
```

and that leads to

$> python -c 'import git as g; r=g.Repo(); print(r.submodules)'           
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/yoh/deb/gits/python-git/git/repo/base.py", line 339, in submodules
    return Submodule.list_items(self)
  File "/home/yoh/deb/gits/python-git/git/util.py", line 934, in list_items
    out_list.extend(cls.iter_items(repo, *args, **kwargs))
  File "/home/yoh/deb/gits/python-git/git/objects/submodule/base.py", line 1193, in iter_items
    sm._name = n
AttributeError: 'Blob' object has no attribute '_name'

GitPython is at current git master (2.1.10-43-gc73b239). A tarball with that test repository hierarchy is available from http://www.onerussian.com/tmp/datalad_temp_test_bf1886qhfzdeik.tgz
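
One plausible reading of the traceback is that the tree lookup for a path declared in .gitmodules returned a Blob instead of a Submodule, which is what would happen if that path is tracked as a symlink (mode 120000) rather than as a gitlink (mode 160000). The sketch below is not GitPython code. It only illustrates a pre-check that flags declared submodule paths whose tree entry is not a gitlink. The function names and the sample modes are hypothetical; in a real repository the modes could be read from `git ls-files -s`.

```python
import re

GITLINK_MODE = "160000"  # mode git records for a submodule (gitlink) entry
SYMLINK_MODE = "120000"  # mode git records for a symbolic link

SECTION_RE = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')


def parse_gitmodules(text):
    """Return {submodule name: path} from .gitmodules-style text."""
    paths = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            current = match.group("name")
        elif current and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key == "path":
                paths[current] = value
    return paths


def non_gitlink_submodules(gitmodules_text, modes_by_path):
    """Return {name: mode} for declared submodules whose path is not a gitlink.

    modes_by_path maps a repository path to its mode string; a missing
    path is reported with mode None.
    """
    problems = {}
    for name, path in parse_gitmodules(gitmodules_text).items():
        mode = modes_by_path.get(path)
        if mode != GITLINK_MODE:
            problems[name] = mode
    return problems


if __name__ == "__main__":
    sample = '[submodule "sub"]\n\tpath = sub\n[submodule "down"]\n\tpath = down\n'
    # Hypothetical modes: "down" is assumed to be a symlink for illustration only.
    print(non_gitlink_submodules(sample, {"sub": GITLINK_MODE, "down": SYMLINK_MODE}))
```

Any name this reports is a candidate for the path that trips `sm._name = n`, since a symlink entry is a blob and not a submodule object.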

Panaetius commented 3 years ago

What's particularly bad about this error is that the GitPython command appears to work in some cases. For example, on a repo.diff('HEAD') the error is raised here https://github.com/gitpython-developers/GitPython/blob/146202cdcbed8239651ccc62d36a8e5af3ceff8c/git/cmd.py#L113 but is then swallowed because it is executed in a thread. So the diff command just returns [] and the error is printed to stderr instead of the diff command failing.
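
The swallowing behaviour can be reproduced with the standard library alone. The following is a minimal sketch, not GitPython's implementation: `handle_line`, `collect_swallowing` and `collect_propagating` are hypothetical names. The first collector loses the exception inside the worker thread and returns an empty list, while the second records the exception and re-raises it in the caller after joining the thread.

```python
import threading


def handle_line(line, results):
    # Stand-in for a per-line handler that fails before producing a result.
    raise AttributeError("'Blob' object has no attribute '_name'")


def collect_swallowing(lines):
    """Run the handler in a thread; an exception there never reaches the caller."""
    results = []

    def pump():
        for line in lines:
            handle_line(line, results)

    worker = threading.Thread(target=pump)
    worker.start()
    worker.join()
    # The traceback went to stderr via the thread's exception hook;
    # the caller only sees that nothing was collected.
    return results


def collect_propagating(lines):
    """Same work, but the first exception is re-raised in the calling thread."""
    results = []
    errors = []

    def pump():
        try:
            for line in lines:
                handle_line(line, results)
        except Exception as ex:  # recorded so the caller can re-raise it
            errors.append(ex)

    worker = threading.Thread(target=pump)
    worker.start()
    worker.join()
    if errors:
        raise errors[0]
    return results
```

With the first variant a caller cannot distinguish "no changes" from "the handler crashed", which matches the empty diff result described above.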

We're encountering this error with a .gitmodules like

[submodule "bin/miniasm"]
        path = bin/miniasm
        url = https://github.com/lh3/miniasm
[submodule "libs/miniasm"]
        path = libs/miniasm
        url = https://github.com/lh3/miniasm
[submodule "libs/JupiterPlot"]
        path = libs/JupiterPlot
        url = https://github.com/JustinChu/JupiterPlot.git
[submodule "libs/GenomeQC"]
        path = libs/GenomeQC
        url = https://github.com/HuffordLab/GenomeQC.git
[submodule "libs/SALSA"]
        path = libs/SALSA
        url = https://github.com/pintomollo/SALSA.git

and a .git/config like

[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        url = <origin url>
        fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
        remote = origin
        merge = refs/heads/master
[submodule "libs/JupiterPlot"]
        active = true
        url = https://github.com/JustinChu/JupiterPlot.git
[submodule "libs/SALSA"]
        active = true
        url = https://github.com/pintomollo/SALSA.git
[submodule "libs/miniasm"]
        active = true
        url = https://github.com/lh3/miniasm

Not sure why git only lists 3 submodules in .git/config but 4 in .gitmodules.
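
One common reason for such a mismatch, offered as an assumption and not as a diagnosis of this repository, is that git only copies a submodule's entry into .git/config when that submodule is initialized (for example with `git submodule init`), whereas .gitmodules lists everything that is declared. A short sketch to see which names differ, using only the section headers copied from the two listings above (`submodule_names` and `declared_but_not_initialized` are hypothetical helper names):

```python
import re

SUBMODULE_SECTION = re.compile(r'^\s*\[submodule "([^"]+)"\]\s*$', re.MULTILINE)


def submodule_names(config_text):
    """Collect the names of all [submodule "..."] sections in config-style text."""
    return set(SUBMODULE_SECTION.findall(config_text))


def declared_but_not_initialized(gitmodules_text, git_config_text):
    """Names present in .gitmodules but absent from .git/config, sorted."""
    return sorted(submodule_names(gitmodules_text) - submodule_names(git_config_text))


# Section headers only, copied from the listings in this comment.
GITMODULES = """
[submodule "bin/miniasm"]
[submodule "libs/miniasm"]
[submodule "libs/JupiterPlot"]
[submodule "libs/GenomeQC"]
[submodule "libs/SALSA"]
"""

GIT_CONFIG = """
[core]
[remote "origin"]
[branch "master"]
[submodule "libs/JupiterPlot"]
[submodule "libs/SALSA"]
[submodule "libs/miniasm"]
"""

if __name__ == "__main__":
    print(declared_but_not_initialized(GITMODULES, GIT_CONFIG))
```

For these inputs the difference is `bin/miniasm` and `libs/GenomeQC`.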

The stacktrace we see is

ERROR:git.cmd:Pumping 'stdout' of cmd(['git', 'diff', '-R', 'dc1f4920d1b877ad28327778b28f3a7e1500033d', '--cached', '--abbrev=40', '--full-index', '-M', '--raw', '-z', '--no-color']) failed due to: AttributeError("'Blob' object has no attribute '_name'")
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/cmd.py", line 109, in pump_stream
    handler(line)
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/diff.py", line 554, in <lambda>
    handle_process_output(proc, lambda byt: cls._handle_diff_line(byt, repo, index),
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/diff.py", line 543, in _handle_diff_line
    '', change_type, score)
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/diff.py", line 297, in __init__
    for submodule in repo.submodules:
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/repo/base.py", line 364, in submodules
    return Submodule.list_items(self)
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/util.py", line 1028, in list_items
    out_list.extend(cls.iter_items(repo, *args, **kwargs))
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/objects/submodule/base.py", line 1214, in iter_items
    sm._name = n
AttributeError: 'Blob' object has no attribute '_name'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.7.8/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/user/.pyenv/versions/3.7.8/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/user/.pyenv/versions/renku-python/lib/python3.7/site-packages/git/cmd.py", line 112, in pump_stream
    raise CommandError(['<%s-pump>' % name] + remove_password_if_present(cmdline), ex) from ex
git.exc.CommandError: Cmd('<stdout-pump>') failed due to: AttributeError(''Blob' object has no attribute '_name'')
  cmdline: <stdout-pump> git diff -R dc1f4920d1b877ad28327778b28f3a7e1500033d --cached --abbrev=40 --full-index -M --raw -z --no-color

Byron commented 3 years ago

> Not sure why git only lists 3 submodules in .git/config but 4 in .gitmodules.

Using GitPython's submodules should be avoided as it's not a complete implementation anymore.

With this issue in particular I would hope that the latest release improved the situation there as typing should make invalid accesses impossible. Apparently that's not the case here :/.
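
For anyone who wants to follow the advice to avoid GitPython's submodule support, one possible workaround is to ask git directly and parse the output of `git submodule status`. This is only a sketch under that assumption: `parse_submodule_status` and `submodule_status` are hypothetical helpers, and the parser relies on the documented line shape (a one-character state, the commit hash, the path, and an optional description in parentheses).

```python
import re
import subprocess

# State prefix: ' ' checked out, '-' not initialized, '+' different commit, 'U' conflicts.
STATUS_LINE = re.compile(r'^([ +\-U])([0-9a-f]{40,64}) (.+?)(?: \((.*)\))?$')


def parse_submodule_status(output):
    """Parse `git submodule status` output into a list of dicts."""
    entries = []
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if not match:
            continue  # skip blank or unexpected lines
        state, sha, path, describe = match.groups()
        entries.append(
            {"state": state, "sha": sha, "path": path, "describe": describe}
        )
    return entries


def submodule_status(repo_dir):
    """Run git in repo_dir and return the parsed status (requires git on PATH)."""
    completed = subprocess.run(
        ["git", "-C", repo_dir, "submodule", "status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_submodule_status(completed.stdout)
```

Calling `submodule_status(".")` in a working tree returns one entry per submodule that git itself reports, without going through `Repo.submodules`.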