Open V3RGANz opened 1 year ago
Duplicate of #6873 and #7523 (both of which are closed, however; there should probably be an open issue tracking this).
No-one has ever provided a way to reproduce this from a clean start; please do so if you can. A Dockerfile would be a good way to present that.
I just experienced this error myself trying to run poetry add with a git url. Seems like others in the old issues created this bug the same way. There should definitely be an open issue for this...
This was the command I ran, then I couldn't run any poetry commands until I went and deleted the src folder as specified in the old issues.
poetry add --group dev git+ssh://git@github.com:trialspark/graphene-stubs.git
still no-one has provided a way to reproduce
the error that you are reporting is completely different from what is described by this issue, please don't hijack it.
still no-one has provided a way to reproduce
I couldn't find a way to reproduce it, but I found the cause and a workaround:
rm all folders in .venv/src/
then re-run your command.
One or more git repos in this directory are broken.
for example:
/proj/test/.venv/src/httpx (master) # git rev-parse head
warning: ignoring dangling symref head
warning: ignoring dangling symref head
fatal: ambiguous argument 'head': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
head
So the solution would be to re-clone the broken git repo.
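To find which clones are affected before deleting anything, a check along the following lines may help. It is only a heuristic sketch, not something Poetry ships: it assumes the clones live directly under .venv/src, it reads .git/HEAD by hand instead of calling git, and the function names are made up for illustration. It does not verify that the commit object itself exists.

```python
from pathlib import Path


def head_resolves(repo_dir: Path) -> bool:
    """Heuristic: True if .git/HEAD names a ref that exists (or holds a hash)."""
    git_dir = repo_dir / ".git"
    head_file = git_dir / "HEAD"
    if not head_file.is_file():
        return False
    head = head_file.read_text().strip()
    if not head.startswith("ref: "):
        # Detached HEAD: the file holds a commit hash directly.
        return len(head) >= 40
    ref_name = head[len("ref: "):]
    if (git_dir / ref_name).is_file():
        return True  # loose ref exists
    packed = git_dir / "packed-refs"
    if packed.is_file():
        # Entries in packed-refs look like "<sha> <refname>".
        return any(
            line.split(" ", 1)[-1].strip() == ref_name
            for line in packed.read_text().splitlines()
            if line and not line.startswith(("#", "^"))
        )
    return False


def find_broken_clones(src_dir: Path) -> list[Path]:
    """List git clones directly under src_dir whose HEAD does not resolve."""
    if not src_dir.is_dir():
        return []
    return sorted(
        child
        for child in src_dir.iterdir()
        if (child / ".git").is_dir() and not head_resolves(child)
    )


if __name__ == "__main__":
    for broken in find_broken_clones(Path(".venv/src")):
        print(f"HEAD does not resolve, candidate for deletion: {broken}")
```

A freshly initialised repo with no commits (HEAD pointing at a branch that does not exist yet) is reported as broken, which matches the state shown in the git rev-parse output above.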
@dimbleby I finally managed to reproduce it. As mentioned above, it happens when something is broken in venv/src.
Specifically, it happens when there is an empty git repo with the same name as the package you are going to install from a git URL:
FROM ubuntu:jammy
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive
# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends git python3.10 python3.10-venv
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH
RUN pip install "poetry==1.6.1"
RUN poetry init -n
# make poetry break by creating a blank git repo at .venv/src/ for the package we are going to install
RUN mkdir -p .venv/src/requests \
&& cd .venv/src/requests \
&& git init \
&& git remote add origin "https://github.com/psf/requests.git"
# it should break with KeyError here
RUN poetry -vvv add "git+https://github.com/psf/requests.git"
How would that happen in a non-artificial way?
If this is the repro then the solution is: don't do that!
A network error may leave the git repo broken.
I'm guessing this happened because Poetry didn't remove the broken git repo in venv/src when the git command exited with a non-zero exit code.
You have been asking for a Dockerfile to reproduce the bug, and it's there. Have you at least run it?
If Poetry reads the venv/src folder, then the state of that folder is one of the possible states of the Poetry program. It's not "artificial". What's a bug if not the program entering an unexpected state and breaking? The filesystem is external to your program and can be written by anything, so you can't assume the folder won't be in an invalid state. As @trim21 said above, if the network connection breaks while Poetry is cloning the git repo, the folder will be left in a broken state. Poetry should be able to handle that, since it's a networked application.
The bug exists. Whether it should be fixed is another debate.
No-one should care about "bugs" that involve the user deliberately breaking things - sorry.
The desired repro would be eg a series of poetry commands resulting in a bad state.
Network problems is an interesting suggestion; though so far as I remember no-one in this or the linked issues has reported any such thing - my guess is that there is some more "regular" repro out there waiting to be found.
No-one should care about "bugs" that involve the user deliberately breaking things - sorry.
You are reading from the filesystem, the folder is accessible to all programs running under the owner. This is not private process memory.
Network problems is an interesting suggestion; though so far as I remember no-one in this or the linked issues has reported any such thing - my guess is that there is some more "regular" repro out there waiting to be found.
Some people have reported that network issues may cause this; see the comments in the related issues, e.g. https://github.com/python-poetry/poetry/issues/7523#issuecomment-1725657834
Here is another way to reproduce the confirmed bug. This time I kill the socket connection to github; I guess this is also artificial? lol
Save this bash script as testpoetry.sh in the same folder as the following Dockerfile:
#!/bin/bash
# simulate broken network connection, it will result in a blank git repo at .venv/src/requests, same as the previous reproduction
bash -c "sleep 2 && ss -K dst github.com" &
poetry -vvv add "git+https://github.com/psf/requests.git" > /dev/null 2>&1
# it should break with KeyError here
poetry -vvv add "git+https://github.com/psf/requests.git"
Build the docker image with docker build -t testpoetry .
using the following Dockerfile:
FROM ubuntu:jammy
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive
# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends iproute2 git python3.10 python3.10-venv
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH
RUN pip install "poetry==1.6.1"
RUN poetry init -n
ADD testpoetry.sh .
USER root
ENTRYPOINT ["/bin/bash", "/opt/test/testpoetry.sh"]
Run the docker container with docker run --privileged testpoetry. It needs root access to kill the socket with ss.
I think you can also reproduce that by killing poetry itself after it starts cloning the github repo. The bug is not directly caused by a network error; it is caused by Poetry trying to use a broken cached clone. The fix requires poetry to check the integrity of the local clone to decide whether it should be reused.
IMO, it's fair enough to suppose the git repo got broken somehow and if there is a good way to detect this state, I assume we will accept a PR that makes poetry more robust.
Multiple members of a team I am working with and I have had this happen because of connection issues and DNS caching. It's not artificial at all; the reproduction is just a way to test the behaviour.
As the error reporting of poetry is so bare-bones and often just shows a str representation of the exception, it was quite challenging to find the cause and workaround.
PS: It can also happen if a git server changes identity for example.
I think you can also reproduce that by killing poetry itself after it starts cloning the github repo
Here follows a reproduction of the same bug by terminating Poetry in the middle of the installation. Don't tell me a user pressing Ctrl + C is artificial.
testpoetry.sh
#!/bin/bash
poetry -vvv add "git+https://github.com/psf/requests.git" > /dev/null 2>&1 &
echo "Poetry launched in the background, terminating it after 3 secs..."
sleep 3
kill -TERM "$!" # SIGTERM stands in for Ctrl + C (SIGINT), which background jobs started from a script ignore
wait "$!"
echo "Poetry terminated, it should be broken now. Next command will fail with KeyError"
poetry -vvv add "git+https://github.com/psf/requests.git"
Dockerfile:
FROM ubuntu:jammy
ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive
# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends python3.10 python3.10-venv
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH
RUN pip install "poetry==1.6.1"
RUN poetry init -n
ADD testpoetry.sh .
USER root
ENTRYPOINT ["/bin/bash", "/opt/test/testpoetry.sh"]
IMO, it's fair enough to suppose the git repo got broken somehow and if there is a good way to detect this state, I assume we will accept a PR that makes poetry more robust.
This is the end of the stacktrace where execution leaves poetry and enters dulwich (as reported by OP):
2 .venv/lib/python3.10/site-packages/poetry/vcs/git/backend.py:171 in get_revision
169│ def get_revision(repo: Repo) -> str:
170│ with repo:
→ 171│ return repo.head().decode("utf-8")
172│
173│ @classmethod
1 .venv/lib/python3.10/site-packages/dulwich/repo.py:636 in head
634│ def head(self) -> bytes:
635│ """Return the SHA1 pointed at by HEAD."""
→ 636│ return self.refs[b"HEAD"]
637│
638│ def _get_object(self, sha, cls):
KeyError
b'HEAD'
at .venv/lib/python3.10/site-packages/dulwich/refs.py:325 in __getitem__
321│ This method follows all symbolic references.
322│ """
323│ _, sha = self.follow(name)
324│ if sha is None:
→ 325│ raise KeyError(name)
326│ return sha
327│
328│ def set_if_equals(
329│ self,
I think we need a way to check whether the HEAD ref exists; if it doesn't, we should drop the local clone. The repo might be broken in other ways, but so far all the issues report this KeyError, which happens when trying to access the missing HEAD ref.
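As a rough illustration of that idea, the sketch below drops the cached clone and clones again when the HEAD lookup raises KeyError. It is not Poetry's code: read_head and clone are hypothetical callables standing in for whatever Poetry does through dulwich (for example the repo.head() call in the traceback above), and the function name is made up.

```python
import shutil
from pathlib import Path
from typing import Callable


def revision_or_reclone(
    clone_dir: Path,
    read_head: Callable[[Path], str],
    clone: Callable[[Path], None],
) -> str:
    """Return the HEAD revision, re-cloning once if the cached clone has no HEAD."""
    try:
        return read_head(clone_dir)
    except KeyError:
        # Same symptom as in the traceback: HEAD cannot be resolved.
        # Treat the cached clone as unusable, remove it and start over.
        shutil.rmtree(clone_dir, ignore_errors=True)
        clone(clone_dir)
        # A second KeyError here is not caught and propagates to the caller.
        return read_head(clone_dir)
```

Retrying only once keeps a persistently failing clone from looping forever; the second failure surfaces as the original exception.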
Apparently "artificial" hit a nerve. Of course reproductions that involve network issues and (to a slightly lesser extent) Ctrl-C are more interesting than reproductions that involve self-sabotage! Thanks.
As radoering says, I expect a PR to make poetry more robust will be welcome. Even spotting this error - perhaps first asking dulwich to raise a more specific exception so that it can be more easily recognised - and printing a more helpful "here's what to do next" message would be a good start.
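A "here's what to do next" message could look roughly like the sketch below. BrokenCloneError and read_head_with_hint are hypothetical names, and read_head again stands in for the real HEAD lookup; until dulwich raises something more specific, the sketch simply assumes that a KeyError from that lookup means the cached clone is unusable.

```python
from pathlib import Path
from typing import Callable


class BrokenCloneError(RuntimeError):
    """Hypothetical error type for a cached git clone without a resolvable HEAD."""


def read_head_with_hint(clone_dir: Path, read_head: Callable[[Path], str]) -> str:
    """Call read_head and turn a bare KeyError into an actionable message."""
    try:
        return read_head(clone_dir)
    except KeyError as exc:
        raise BrokenCloneError(
            f"The cached git clone at {clone_dir} has no resolvable HEAD "
            f"(lookup failed for {exc}). It may have been left behind by an "
            "interrupted clone. Delete that directory and re-run the command."
        ) from exc
```

Chaining with "from exc" keeps the original KeyError in the traceback for anyone running with -vvv.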
It was never "self-sabotage": all three reproductions I made break Poetry by leaving a broken git clone at venv/src/requests. On the next installation of requests from git, it will try to reuse the broken local clone. The first reproduction just shows that the root cause is neither a network error nor an innocent Ctrl + C from the user.
But I agree: first dulwich needs to handle a missing HEAD. It could raise a well-defined exception and document it in its API. I also think the integrity of the local clone should be checked there.
This has happened to me a few times. I believe it happens when you install a package using poetry, and then switch it to a git repo. This is common if you are using a package, have a problem and want to contribute a fix, create a fork to contribute, and then change the pyproject.toml to the new fork.
When I have run into this problem, a reliable fix for me has been to delete the problem package from the python virtual environment on the filesystem, and then it will be created correctly when updating poetry deps again.
The error is common enough that the poetry team should add some more descriptive text or resolve it properly.
Has happened to me today
-vvv option) and have included the output below.
Issue
I am trying to add a git dependency, which works absolutely fine with
pip install git+https://github.com/Lightning-AI/lightning@master
But when I try
poetry add git+https://github.com/Lightning-AI/lightning.git#master
poetry fails. Adding
lightning = {git = "https://github.com/Lightning-AI/lightning.git", branch = "master"}
to pyproject.toml and invoking poetry update leads to the same result.
poetry stack trace: