python-poetry / poetry

Python packaging and dependency management made easy
https://python-poetry.org
MIT License

Poetry gives cryptic error on add git dependency attempt #8384

Open V3RGANz opened 1 year ago

V3RGANz commented 1 year ago

Issue

I am trying to add a git dependency, which works absolutely fine with pip install git+https://github.com/Lightning-AI/lightning@master

But when I try to run poetry add git+https://github.com/Lightning-AI/lightning.git#master, Poetry fails. Adding lightning = {git = "https://github.com/Lightning-AI/lightning.git", branch = "master"} to pyproject.toml and invoking poetry update leads to the same result.

poetry stack trace:

  Stack trace:

  19  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/application.py:327 in run
       325│ 
       326│             try:
     → 327│                 exit_code = self._run(io)
       328│             except BrokenPipeError:
       329│                 # If we are piped to another process, it may close early and send a

  18  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/console/application.py:190 in _run
       188│         self._load_plugins(io)
       189│ 
     → 190│         exit_code: int = super()._run(io)
       191│         return exit_code
       192│ 

  17  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/application.py:431 in _run
       429│             io.input.interactive(interactive)
       430│ 
     → 431│         exit_code = self._run_command(command, io)
       432│         self._running_command = None
       433│ 

  16  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/application.py:473 in _run_command
       471│ 
       472│         if error is not None:
     → 473│             raise error
       474│ 
       475│         return terminate_event.exit_code

  15  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/application.py:454 in _run_command
       452│ 
       453│         try:
     → 454│             self._event_dispatcher.dispatch(command_event, COMMAND)
       455│ 
       456│             if command_event.command_should_run():

  14  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/events/event_dispatcher.py:26 in dispatch
        24│ 
        25│         if listeners:
     →  26│             self._do_dispatch(listeners, event_name, event)
        27│ 
        28│         return event

  13  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/cleo/events/event_dispatcher.py:89 in _do_dispatch
        87│                 break
        88│ 
     →  89│             listener(event, event_name, self)
        90│ 
        91│     def _sort_listeners(self, event_name: str) -> None:

  12  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/console/application.py:322 in configure_installer_for_event
       320│             return
       321│ 
     → 322│         cls.configure_installer_for_command(command, event.io)
       323│ 
       324│     @staticmethod

  11  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/console/application.py:329 in configure_installer_for_command
       327│ 
       328│         poetry = command.poetry
     → 329│         installer = Installer(
       330│             io,
       331│             command.env,

  10  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/installation/installer.py:74 in __init__
        72│ 
        73│         if installed is None:
     →  74│             installed = self._get_installed()
        75│ 
        76│         self._installed_repository = installed

   9  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/installation/installer.py:443 in _get_installed
       441│ 
       442│     def _get_installed(self) -> InstalledRepository:
     → 443│         return InstalledRepository.load(self._env)
       444│ 

   8  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/repositories/installed_repository.py:274 in load
       272│                     continue
       273│ 
     → 274│                 package = cls.create_package_from_distribution(distribution, env)
       275│ 
       276│                 if with_dependencies:

   7  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/repositories/installed_repository.py:132 in create_package_from_distribution
       130│                                 source_url,
       131│                                 source_reference,
     → 132│                             ) = cls.get_package_vcs_properties_from_path(src)
       133│                             break
       134│ 

   6  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/repositories/installed_repository.py:80 in get_package_vcs_properties_from_path
        78│         from poetry.vcs.git import Git
        79│ 
     →  80│         info = Git.info(repo=src)
        81│         return "git", info.origin, info.revision
        82│ 

   5  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/vcs/git/backend.py:175 in info
       173│     @classmethod
       174│     def info(cls, repo: Repo | Path) -> GitRepoLocalInfo:
     → 175│         return GitRepoLocalInfo(repo=repo)
       176│ 
       177│     @staticmethod

   4  <string>:3 in __init__
         1│ 

   3  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/vcs/git/backend.py:147 in __post_init__
       145│         repo = Git.as_repo(repo=repo) if not isinstance(repo, Repo) else repo
       146│         self.origin = Git.get_remote_url(repo=repo, remote="origin")
     → 147│         self.revision = Git.get_revision(repo=repo)
       148│ 
       149│ 

   2  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/poetry/vcs/git/backend.py:171 in get_revision
       169│     def get_revision(repo: Repo) -> str:
       170│         with repo:
     → 171│             return repo.head().decode("utf-8")
       172│ 
       173│     @classmethod

   1  ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/dulwich/repo.py:585 in head
        583│     def head(self) -> bytes:
        584│         """Return the SHA1 pointed at by HEAD."""
     →  585│         return self.refs[b"HEAD"]
        586│ 
        587│     def _get_object(self, sha, cls):

  KeyError

  b'HEAD'

  at ~/Library/Application Support/pypoetry/venv/lib/python3.11/site-packages/dulwich/refs.py:326 in __getitem__
       322│         This method follows all symbolic references.
       323│         """
       324│         _, sha = self.follow(name)
       325│         if sha is None:
    →  326│             raise KeyError(name)
       327│         return sha
       328│ 
       329│     def set_if_equals(
       330│         self,

dimbleby commented 1 year ago

Duplicate of #6873 and #7523 (both of which are closed, however; there should probably be an open issue tracking this).

No-one has ever provided a way to reproduce this from a clean start; please do so if you can. A Dockerfile would be a good way to present that.

ryanovas commented 1 year ago

I just hit this error myself while running poetry add with a git URL. It seems others in the old issues triggered this bug the same way. There should definitely be an open issue for this...

This is the command I ran; afterwards I couldn't run any poetry commands until I deleted the src folder, as described in the old issues.

poetry add --group dev git+ssh://git@github.com:trialspark/graphene-stubs.git

dimbleby commented 1 year ago

still no-one has provided a way to reproduce

dimbleby commented 1 year ago

the error that you are reporting is completely different from what is described by this issue, please don't hijack it.

trim21 commented 1 year ago

> still no-one has provided a way to reproduce

I couldn't find a way to reproduce it, but I found the cause and a workaround:

Remove all folders in .venv/src/, then re-run your command.

One or more git repos in this directory are broken.

For example:

/proj/test/.venv/src/httpx (master) # git rev-parse head
warning: ignoring dangling symref head
warning: ignoring dangling symref head
fatal: ambiguous argument 'head': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
head

So the solution would be to re-clone the broken git repo.
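The workaround above can be narrowed down to the clones that are actually broken. The following is a minimal sketch, not part of Poetry: it assumes the clones live in directories under .venv/src with a regular .git directory, and the helper names (head_resolves, find_broken_clones) are made up for illustration. It reads the HEAD file and the refs directly, so it does not need git installed, and it treats a HEAD that does not resolve to a commit id as broken.

```python
from pathlib import Path

HEX_DIGITS = set("0123456789abcdef")


def _lookup_packed_ref(git_dir: Path, ref_name: str) -> str:
    """Return the commit id stored for ref_name in packed-refs, or ''."""
    packed = git_dir / "packed-refs"
    if not packed.is_file():
        return ""
    for line in packed.read_text().splitlines():
        # Skip the header comment and peeled-tag lines.
        if line.startswith(("#", "^")):
            continue
        sha, _, name = line.partition(" ")
        if name == ref_name:
            return sha
    return ""


def head_resolves(repo_dir: Path) -> bool:
    """Return True if HEAD in repo_dir/.git resolves to a commit id."""
    git_dir = repo_dir / ".git"
    head_file = git_dir / "HEAD"
    if not head_file.is_file():
        return False
    value = head_file.read_text().strip()
    # Follow symbolic refs such as "ref: refs/heads/master" (bounded depth).
    for _ in range(5):
        if not value.startswith("ref: "):
            break
        ref_name = value[len("ref: "):].strip()
        loose = git_dir / ref_name
        if loose.is_file():
            value = loose.read_text().strip()
        else:
            value = _lookup_packed_ref(git_dir, ref_name)
    # A usable HEAD ends in a SHA-1 (40) or SHA-256 (64) hex string.
    return len(value) in (40, 64) and set(value) <= HEX_DIGITS


def find_broken_clones(src_dir: Path) -> list[Path]:
    """List git clones directly under src_dir whose HEAD does not resolve."""
    if not src_dir.is_dir():
        return []
    return sorted(
        p for p in src_dir.iterdir()
        if (p / ".git").is_dir() and not head_resolves(p)
    )
```

Calling find_broken_clones(Path(".venv/src")) returns the directories that would need to be deleted and re-cloned. An empty repository created by git init with no commits is also reported, because its HEAD points at a branch that does not exist yet.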

HMaker commented 1 year ago

@dimbleby I finally managed to reproduce it. As noted above, it happens when something is broken in venv/src.

Specifically, it happens when there is an empty git repo with the same name as the package you are going to install from a git URL:

FROM ubuntu:jammy

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive

# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends git python3.10 python3.10-venv 
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH

RUN pip install "poetry==1.6.1"
RUN poetry init -n
# make poetry break by creating a blank git repo at .venv/src/ for the package we are going to install
RUN mkdir -p .venv/src/requests \
    && cd .venv/src/requests \
    && git init \
    && git remote add origin "https://github.com/psf/requests.git"
# it should break with KeyError here
RUN poetry -vvv add "git+https://github.com/psf/requests.git"

dimbleby commented 1 year ago

How would that happen in a non-artificial way?

If this is the repro then the solution is: don't do that!

trim21 commented 1 year ago

> How would that happen in a non-artificial way?
>
> If this is the repro then the solution is: don't do that!

A network error may leave the git repo broken.

trim21 commented 1 year ago

> How would that happen in a non-artificial way?
>
> If this is the repro then the solution is: don't do that!

I'm guessing this happened because Poetry didn't remove the broken git repo in venv/src when the git command exited with a non-zero exit code.

HMaker commented 1 year ago

> How would that happen in a non-artificial way?
>
> If this is the repro then the solution is: don't do that!

You have been asking for a Dockerfile to reproduce the bug; it's there. Have you at least run it?

If Poetry reads the venv/src folder, then the state of that folder is one of the possible states of the Poetry program. It's not "artificial". What is a bug if not the program entering an unexpected state and breaking? The filesystem is external to your program and can be written to by anything, so you can't assume the folder won't be in an invalid state. As @trim21 said above, if the network connection breaks while Poetry is cloning the git repo, the folder will be left in a broken state. Poetry should be able to handle that, since it's a networked application.

The bug exists. Whether it should be fixed is another debate.

dimbleby commented 1 year ago

No-one should care about "bugs" that involve the user deliberately breaking things - sorry.

The desired repro would be eg a series of poetry commands resulting in a bad state.

Network problems is an interesting suggestion; though so far as I remember no-one in this or the linked issues has reported any such thing - my guess is that there is some more "regular" repro out there waiting to be found.

HMaker commented 1 year ago

> No-one should care about "bugs" that involve the user deliberately breaking things - sorry.

You are reading from the filesystem; the folder is accessible to all programs running as its owner. This is not private process memory.

> Network problems is an interesting suggestion; though so far as I remember no-one in this or the linked issues has reported any such thing - my guess is that there is some more "regular" repro out there waiting to be found.

Some people have reported that network issues may cause this; see the comments in the related issues, e.g. https://github.com/python-poetry/poetry/issues/7523#issuecomment-1725657834

Here is another way to reproduce the confirmed bug. This time I kill the socket connection to GitHub. I guess this is also artificial? lol

Save this bash script as testpoetry.sh in the same folder as the Dockerfile below:

#!/bin/bash

# simulate broken network connection, it will result in a blank git repo at .venv/src/requests, same as the previous reproduction
bash -c "sleep 2 && ss -K dst github.com" &
poetry -vvv add "git+https://github.com/psf/requests.git" > /dev/null 2>&1

# it should break with KeyError here
poetry -vvv add "git+https://github.com/psf/requests.git"

Build the docker image with docker build -t testpoetry . using the following Dockerfile:

FROM ubuntu:jammy

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive

# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends iproute2 git python3.10 python3.10-venv 
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH

RUN pip install "poetry==1.6.1"
RUN poetry init -n

ADD testpoetry.sh .
USER root
ENTRYPOINT ["/bin/bash", "/opt/test/testpoetry.sh"]

Run the docker container with docker run --privileged testpoetry. It needs root access to kill the socket with ss.

I think you can also reproduce this by killing Poetry itself after it starts cloning the GitHub repo. The bug is not directly caused by a network error; it is caused by Poetry trying to use a broken cached clone. The fix requires Poetry to check the integrity of the local clone to decide whether it should be reused.

radoering commented 1 year ago

IMO, it's fair enough to suppose the git repo got broken somehow and if there is a good way to detect this state, I assume we will accept a PR that makes poetry more robust.

MrGreenTea commented 1 year ago

Several members of a team I work with and I have had this happen because of connection issues and DNS caching. It's not artificial at all; the reproduction is just a way to test the behaviour.

As Poetry's error reporting is so bare-bones and often just shows a str representation of the exception, it was quite challenging to find the cause and the workaround.

PS: It can also happen if a git server changes identity for example.

HMaker commented 1 year ago

> I think you can also reproduce that by killing poetry itself after it starts cloning the github repo

Here is a reproduction of the same bug by terminating Poetry in the middle of the installation. Don't tell me a user pressing Ctrl + C is artificial.

testpoetry.sh

#!/bin/bash

poetry -vvv add "git+https://github.com/psf/requests.git" > /dev/null 2>&1 &
echo "Poetry launched on background, terminating it after 3 secs..."
sleep 3
kill -TERM "$!" # terminate Poetry mid-clone; stands in for Ctrl + C (which would send SIGINT)
wait "$!"
echo "Poetry terminated, it should be broken now. Next command will fail with KeyError"

poetry -vvv add "git+https://github.com/psf/requests.git"

Dockerfile:

FROM ubuntu:jammy

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV DEBIAN_FRONTEND=noninteractive

# create virtual env and simulate activation
RUN mkdir /opt/test
WORKDIR /opt/test
RUN apt-get update && apt-get install -y --no-install-recommends python3.10 python3.10-venv 
RUN python3 -m venv .venv
ENV VIRTUAL_ENV /opt/test/.venv
ENV PATH /opt/test/.venv/bin:$PATH

RUN pip install "poetry==1.6.1"
RUN poetry init -n

ADD testpoetry.sh .
USER root
ENTRYPOINT ["/bin/bash", "/opt/test/testpoetry.sh"]

HMaker commented 1 year ago

> IMO, it's fair enough to suppose the git repo got broken somehow and if there is a good way to detect this state, I assume we will accept a PR that makes poetry more robust.

This is the end of the stack trace, where execution leaves Poetry and enters dulwich (as reported by OP):

   2  .venv/lib/python3.10/site-packages/poetry/vcs/git/backend.py:171 in get_revision
       169│     def get_revision(repo: Repo) -> str:
       170│         with repo:
     → 171│             return repo.head().decode("utf-8")
       172│ 
       173│     @classmethod

   1  .venv/lib/python3.10/site-packages/dulwich/repo.py:636 in head
        634│     def head(self) -> bytes:
        635│         """Return the SHA1 pointed at by HEAD."""
     →  636│         return self.refs[b"HEAD"]
        637│ 
        638│     def _get_object(self, sha, cls):

  KeyError

  b'HEAD'

  at .venv/lib/python3.10/site-packages/dulwich/refs.py:325 in __getitem__
       321│         This method follows all symbolic references.
       322│         """
       323│         _, sha = self.follow(name)
       324│         if sha is None:
    →  325│             raise KeyError(name)
       326│         return sha
       327│ 
       328│     def set_if_equals(
       329│         self,

I think we need a way to check whether the HEAD ref exists; if it doesn't, we should drop the local clone. The repo might be broken in other ways, but so far all issues report this KeyError, which happens when trying to access a missing HEAD ref.
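The check-and-drop idea could look roughly like the sketch below. It is not Poetry's code: the function names and the HasHead protocol are hypothetical, and the only behaviour taken from the stack trace is that a repo object's head() raises KeyError when the HEAD ref is missing. Any object with such a head() method fits, for example a dulwich Repo passed in through the open_repo callable.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional, Protocol


class HasHead(Protocol):
    """Anything exposing head() like the repo object in the stack trace."""

    def head(self) -> bytes: ...


def reusable_revision(repo: HasHead) -> Optional[str]:
    """Return the HEAD revision, or None when the clone has no HEAD ref."""
    try:
        return repo.head().decode("utf-8")
    except KeyError:
        # Same failure as in the trace: the HEAD ref cannot be resolved.
        return None


def ensure_usable_clone(path: Path, open_repo: Callable[[Path], HasHead]) -> bool:
    """Keep the clone at path if HEAD resolves, otherwise delete it.

    Returns True when an existing clone can be reused, False when the
    caller should clone from scratch.
    """
    if not path.exists():
        return False
    if reusable_revision(open_repo(path)) is None:
        shutil.rmtree(path)
        return False
    return True
```

Deleting the directory is the bluntest possible recovery; it mirrors the manual workaround reported in this thread and avoids guessing how else the clone might be damaged.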

dimbleby commented 1 year ago

Apparently "artificial" hit a nerve. Of course reproductions that involve network issues and (to a slightly lesser extent) Ctrl-C are more interesting than reproductions that involve self-sabotage! Thanks.

As radoering says, I expect a PR to make poetry more robust will be welcome. Even spotting this error - perhaps first asking dulwich to raise a more specific exception so that it can be more easily recognised - and printing a more helpful "here's what to do next" message would be a good start.
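As a rough illustration of the "here's what to do next" message, the sketch below wraps the failing call and re-raises with actionable text. Everything in it is an assumption made for illustration: BrokenCloneError and revision_or_explain are hypothetical names that exist in neither Poetry nor dulwich, and the only fact relied on is that head() raises KeyError for a missing HEAD, as the stack trace shows.

```python
from pathlib import Path


class BrokenCloneError(RuntimeError):
    """Hypothetical error type; not part of Poetry or dulwich."""


def revision_or_explain(repo, path: Path) -> str:
    """Return the HEAD revision, or raise an error that says what to do."""
    try:
        return repo.head().decode("utf-8")
    except KeyError as exc:
        # Translate the bare KeyError(b'HEAD') into something actionable.
        raise BrokenCloneError(
            f"The git clone at {path} has no usable HEAD; it was probably "
            "left behind by an interrupted or failed clone. Delete that "
            "directory and re-run the command."
        ) from exc
```

Chaining with "from exc" keeps the original KeyError in the traceback, so verbose output still shows where the failure came from.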

HMaker commented 1 year ago

It was never "self-sabotage": all three reproductions I made break Poetry by leaving a broken git clone at venv/src/requests. On the next installation of requests from git, Poetry will try to reuse the broken local clone. The first reproduction just shows that the root cause is neither a network error nor an innocent Ctrl + C from the user.

But I agree: first dulwich needs to handle a missing HEAD. It could raise a well-defined exception and document it in its API. I also think the local clone's integrity should be checked there.

jackklika commented 7 months ago

This has happened to me a few times. I believe it happens when you install a package using Poetry and then switch it to a git repo. This is common if you are using a package, hit a problem and want to contribute a fix, create a fork, and then change pyproject.toml to point at the new fork.

When I have run into this problem, a reliable fix for me has been to delete the problem package from the Python virtual environment on the filesystem; it is then created correctly when updating Poetry deps again.

The error is common enough that the Poetry team should add some more descriptive text or resolve it properly.

JavierLopezT commented 6 months ago

This happened to me today.