python-poetry / poetry

Python packaging and dependency management made easy
https://python-poetry.org
MIT License

Poetry cannot properly parse URL with Gitlab [deploy tokens] #2062

Closed xinbinhuang closed 2 years ago

xinbinhuang commented 4 years ago

Issue

Poetry cannot properly parse URLs that contain GitLab deploy tokens. The project is hosted on an internally hosted GitLab server.

The same git URL worked before, but I am not sure when it started failing.

Command I ran:

poetry add "git+https://<token-name>:<token-key>@<my-org-self-hosted-gitlab-url>/<repo-path>/<repo-name>.git" -vvv

Output

[ValueError]
Invalid git url "git+https://<token-name>:<token-key>@<my-org-self-hosted-gitlab-url>/<repo-path>/<repo-name>.git"

Traceback (most recent call last):
  File "/home/binbin/.poetry/lib/poetry/_vendor/py3.7/clikit/console_application.py", line 131, in run
    status_code = command.handle(parsed_args, io)
  File "/home/binbin/.poetry/lib/poetry/_vendor/py3.7/clikit/api/command/command.py", line 120, in handle
    status_code = self._do_handle(args, io)
  File "/home/binbin/.poetry/lib/poetry/_vendor/py3.7/clikit/api/command/command.py", line 171, in _do_handle
    return getattr(handler, handler_method)(args, io, self)
  File "/home/binbin/.poetry/lib/poetry/_vendor/py3.7/cleo/commands/command.py", line 92, in wrap_handle
    return self.handle()
  File "/home/binbin/.poetry/lib/poetry/console/commands/add.py", line 89, in handle
    packages, allow_prereleases=self.option('allow-prereleases')
  File "/home/binbin/.poetry/lib/poetry/console/commands/init.py", line 294, in _determine_requirements
    requires = self._parse_requirements(requires)
  File "/home/binbin/.poetry/lib/poetry/console/commands/init.py", line 371, in _parse_requirements
    parsed = ParsedUrl.parse(requirement)
  File "/home/binbin/.poetry/lib/poetry/vcs/git.py", line 118, in parse
    raise ValueError('Invalid git url "{}"'.format(url))
xinbinhuang commented 4 years ago

Just checked that it works when poetry is <=1.0.2. Maybe there is an incompatible change introduced in 1.0.3?

finswimmer commented 4 years ago

Hello @xinbinhuang,

thanks a lot for reporting. The git URL parsing changed significantly in the last release to make it more consistent. As I was not aware of deploy tokens, the regex probably no longer matches them.

I will take a closer look at this. Could you please give me an example of what <token-name>:<token-key> looks like?

fin swimmer

xinbinhuang commented 4 years ago

Hi @finswimmer ,

Thanks for the quick response.

The Gitlab's doc is here.

I tried to generate a few different tokens until hitting an error.

# This is Gitlab's format requirements: 
# Here Username == <token-name>
Username can contain only letters, digits, '_', '-', '+', and '.'

So something like [\w.+-] should work for <token-name>.

For the <token-key>, it is auto-generated. So far, I have only seen alphanumeric characters (letters and digits), but it may also contain the same characters as the <token-name>.
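The character rules above can be sketched as a Python regular expression. This is only an illustration under stated assumptions, not the pattern Poetry itself uses: the names `CRED`, `TOKEN_URL` and `split_credentials` are made up for this example, and the token key is assumed to use the same characters as the token name.

```python
import re

# Hypothetical pattern, NOT Poetry's own regex.
# Credentials: letters, digits, '_', '-', '+', '.' (GitLab's stated rule for the
# token name; assumed here for the token key as well). The '-' is placed last in
# the class so it is a literal and not a range.
CRED = r"[\w.+-]+"
TOKEN_URL = re.compile(
    rf"^(?:git\+)?https://(?P<user>{CRED}):(?P<password>{CRED})@"
    r"(?P<host>[^/@:]+)/(?P<path>.+?)(?:\.git)?$",
    re.ASCII,  # keep \w limited to [A-Za-z0-9_]
)


def split_credentials(url):
    """Return (user, password, host, path) or raise ValueError."""
    match = TOKEN_URL.match(url)
    if match is None:
        raise ValueError('Invalid git url "{}"'.format(url))
    return match.group("user", "password", "host", "path")
```

Calling `split_credentials` on a URL shaped like the one in the report should yield the token name and key as separate values, while a URL whose user part contains other characters (a space, for instance) is rejected.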

Let me know if you would need extra information or help from me.

Thanks Bin

jedie commented 4 years ago

I am also running into this issue :( Any update or workaround here?

xinbinhuang commented 4 years ago

@jedie, while waiting for the bug to be fixed, you can pin your poetry version in your pyproject.toml.

For example

[tool.poetry]
name = "new-package"
version = "0.1.0"
description = ""

...

[build-system]
requires = ["poetry<=1.0.2"]                # this line: pin poetry to be <= 1.0.2
build-backend = "poetry.masonry.api"
abn commented 4 years ago

@xinbinhuang would making use of gitcredentials be the better approach here? This would work in both your development and deployment scenarios. As an additional benefit, it would allow you to ensure separation of privileges by using different credentials.

As for respecting the git URL spec, I agree we should ensure that a URL of the form https://user:pass@example.com/repo.git is accepted. However, I would not consider using these URLs when adding dependencies to be best practice, because the credentials will end up in plain text in both your poetry.lock and pyproject.toml files. We should also consider logging a warning once this is fixed and a password is matched/detected.

A reasonable approach might be to extend the URL validation to also include a new group "password". The user group should still treat ":" as invalid.
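The "warn when a password is detected" idea could look roughly like the following. It is a minimal sketch using only the standard library's `urllib.parse`, not Poetry code; the function name `redact_password` and the warning text are assumptions made for illustration.

```python
import warnings
from urllib.parse import urlsplit, urlunsplit


def redact_password(url):
    """Return url with any embedded password replaced by '***'.

    Emits a warning when a password is found. Hypothetical helper, not part
    of Poetry.
    """
    parts = urlsplit(url)
    if parts.password is None:
        return url  # nothing to hide
    warnings.warn(
        "credentials embedded in a dependency URL are stored in plain text"
    )
    host = parts.hostname  # note: urlsplit lowercases the host name
    if parts.port is not None:
        host = "{}:{}".format(host, parts.port)
    netloc = "{}:***@{}".format(parts.username, host)
    return urlunsplit(parts._replace(netloc=netloc))
```

The redacted form is what one would want in log output; the original URL is left untouched for the actual git call.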

xinbinhuang commented 4 years ago

@abn Thanks for the suggestion, I have not used gitcredentials before.

While I think your suggestion is valid in the common scenario of user passwords, GitLab deploy tokens are a different case. These deploy tokens are not user credentials; GitLab designed them specifically with limited permission scopes (similar to a service account). So in the context of GitLab deploy tokens, I believe embedding them in the URL is valid.

While a security warning is good (with an option to disable it), I don't think Poetry, as a package management tool, should enforce this validation by any means; it should leave the choice to the users.

I haven't used Gitlab for a while now, so it doesn't affect me that much. I like poetry and hope it will be successful in the future, so I hope poetry can make good decisions along the way.

jedie commented 3 years ago

Any update here?

ansar-sa commented 3 years ago

Any update please. Pain point for gitlab users.

dagap commented 3 years ago

Same here. This is a major blocker now.

jedie commented 3 years ago

Still blocked because of https://github.com/python-poetry/poetry-core/pull/115 ?!?

carlos-lm commented 3 years ago

any solution or workaround? I have a deploy token similar to @xinbinhuang but currently poetry fails with invalid git url :grimacing:

lstolcman commented 3 years ago

@carlos-lm

Unfortunately, a GitLab deploy token URL does not work with the poetry add command. We solved the problem by adding the dependency straight to pyproject.toml and bypassing poetry add:

[tool.poetry.dependencies]
repo-name = {git = "https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git", tag = "1.0.0"}
# or
repo-name = {git = "https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git", rev = "aabbccdd"}
# or
repo-name = {git = "https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git", branch = "next"}


Error in our case is:


➜  poetry add git+https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git

  ValueError

  Invalid git url "git+https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git"

  at venv/lib/python3.8/site-packages/poetry/core/vcs/git.py:137 in parse
      133│                     groups.get("name"),
      134│                     groups.get("rev"),
      135│                 )
      136│
    → 137│         raise ValueError('Invalid git url "{}"'.format(url))
      138│
      139│     @property
      140│     def url(self):  # type: () -> str
      141│         return "{}{}{}{}{}".format(

➜ poetry -V
Poetry version 1.1.10

Installing using pip:

pip install git+https://gitlab+deploy-token-123:aabbddcceeff-ggh@git.example.com/repo_path/repo_name.git

works just fine

EnriqueSoria commented 3 years ago

For me it works (installing from private pypi) if I expose my token in pyproject.toml

[[tool.poetry.source]]
name = "<a_name>"
url = "https://<token_name>:<token>@gitlab.com/api/v4/projects/<project_id>/packages/pypi/simple"
jedie commented 3 years ago

I still run into the same error on gitlab CI:

[screenshot]

Locally it works. Both used the same versions: Python 3.8, poetry v1.1.10

@EnriqueSoria This is for PyPi packages and not git source installation, isn't it?

EnriqueSoria commented 3 years ago

I still run into the same error on gitlab CI:

[screenshot]

Locally it works. Both used the same versions: Python 3.8, poetry v1.1.10

@EnriqueSoria This is for PyPi packages and not git source installation, isn't it?

True, sorry

lstolcman commented 3 years ago

So I've looked into related PRs, and https://github.com/python-poetry/poetry-core/pull/115 resolves this issue.

One problem I encountered afterwards was this error message:

server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

This is because our git server is on a private intranet and its certificate is self-signed. It was resolved by setting:

git config http.sslverify false

(source: https://forum.gitlab.com/t/server-certificate-verification-failed/7825)

jedie commented 2 years ago

Any news here?!?

a1d4r commented 2 years ago

Same issue here. poetry install fails in Gitlab CI with a package added with Gitlab Deploy Token. Locally it works fine. I use Python 3.9 and Poetry version 1.1.11 in both environments.

chenseanxy commented 2 years ago

Experiencing the same issues, and even though it's possible to get around it by adding it to pyproject.toml manually, exporting to requirements.txt still fails.

maresb commented 2 years ago

My Poetry versions are poetry=1.2.0a2, poetry-core=1.1.0a6.

In my case, I was able to use @EnriqueSoria's workaround of using GitLab's private PyPI repository, which allows me to use deploy tokens with poetry commands. Full details are as follows.

In the "source" project's repository (the project which you want to install into one or more "destination" projects), put the following verbatim into .gitlab-ci.yml. (The CI job token exists automatically, and none of the variables need to be modified. Simply copy-paste! I assume that pyproject.toml is in the repo's root.)

# A pipeline for uploading the project as a package to GitLab's private PyPI repository.
# (This allows the project to be installed from Poetry as a URL dependency with a deploy token
# which has `read_package_registry` scope.)

# References:
#  <https://docs.gitlab.com/ee/user/packages/pypi_repository/index.html#authenticate-with-a-ci-job-token>
#  <https://github.com/python-poetry/poetry/issues/2062>

build-wheel:
  when: manual
  image: python:latest
  script:
    - pip install twine
    - pip wheel --no-deps .
    - TWINE_PASSWORD=${CI_JOB_TOKEN} TWINE_USERNAME=gitlab-ci-token python -m twine upload --repository-url ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi *.whl

Now you can publish a wheel to "Packages & Registries" → "Package Registry" by going to "CI/CD" → "Pipelines" and manually starting the build-wheel job.

To install this wheel to some "destination" project, in your destination project's pyproject.toml you should put

[[tool.poetry.source]]
name = "identifier-for-gitlab-repository"
url = "https://gitlab+deploy-token-123456:t0k3n@gitlab.com/api/v4/projects/PROJECTID/packages/pypi/simple"

To adapt this snippet to your project, you will need the following pieces of information...

Then run the command

poetry add project-name --source=identifier-for-gitlab-repository

where project-name is the "source" project's name, as it appears in pyproject.toml under [tool.poetry] as name = "...".

If all went well, this will install the package and create an entry in pyproject.toml of the form

[tool.poetry.dependencies]
project-name = {version = "^x.y.z", source = "identifier-for-gitlab-repository"}

Using poetry-core / pip install

I'm using Docker, and to keep the container slim, instead of poetry install which requires the full Poetry installation, I use pip install which uses the much slimmer poetry-core backend as per

[build-system]
requires = ["poetry-core>=1.0.8"]
build-backend = "poetry.core.masonry.api"

The downside is that pip is not aware of the tool.poetry.source section. But a repo can be configured with the PIP_EXTRA_INDEX_URL envvar. To keep a single source of truth, I read the URL from the pyproject.toml file using a program called dasel:

COPY --from=ghcr.io/tomwright/dasel:v1.24.1-alpine /usr/local/bin/dasel /usr/local/bin/dasel
RUN PIP_EXTRA_INDEX_URL=$( \
        dasel select -f pyproject.toml -m \
        "tool.poetry.source.(name=identifier-for-gitlab-repository).url" \
    ) \
    pip install --editable .  # (The editable flag is not essential here.)
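For readers who would rather not add dasel to the image, the same lookup can be sketched in Python. This is an alternative illustration, not what the comment above uses: it assumes Python 3.11+ (for the standard `tomllib` module) or an installed `tomli` backport, and `source_url` is a made-up helper name.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes 'tomli' is installed
    import tomli as tomllib


def source_url(pyproject_text, source_name):
    """Return the url of the [[tool.poetry.source]] entry named source_name.

    pyproject_text is the content of pyproject.toml as a string.
    Raises KeyError when no entry has that name.
    """
    data = tomllib.loads(pyproject_text)
    sources = data.get("tool", {}).get("poetry", {}).get("source", [])
    for source in sources:
        if source.get("name") == source_name:
            return source["url"]
    raise KeyError(source_name)
```

Reading pyproject.toml, passing its text together with "identifier-for-gitlab-repository" to `source_url`, and printing the result gives a value that can be assigned to PIP_EXTRA_INDEX_URL in the same way as the dasel output.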

Removing the deploy token from pyproject.toml

It's generally bad practice to keep secrets like deploy tokens in a file like pyproject.toml which is likely under version control. Poetry can manage the credentials for the repository. Simply run

poetry config repositories.identifier-for-gitlab-repository https://gitlab.com/api/v4/projects/PROJECTID/packages/pypi/simple
poetry config http-basic.identifier-for-gitlab-repository gitlab+deploy-token-123456 t0k3n

and then you can delete the gitlab+deploy-token-123456:t0k3n@ part of the url in pyproject.toml and everything will still work. (To test, delete the project-name = line from pyproject.toml and rerun the poetry add command.)
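If the token is already embedded in a URL, the two `poetry config` invocations above can be derived from it mechanically. The sketch below only builds the argument lists and runs nothing; `poetry_config_commands` is a hypothetical helper, and it assumes the credentials contain no percent-encoded characters.

```python
from urllib.parse import urlsplit, urlunsplit


def poetry_config_commands(name, url):
    """Split url into a credential-free URL plus the poetry config commands.

    Returns (clean_url, commands), where commands is a list of argument
    lists suitable for subprocess.run. Hypothetical helper, not part of Poetry.
    """
    parts = urlsplit(url)
    if parts.username is None or parts.password is None:
        raise ValueError("URL carries no user:password pair")
    host = parts.hostname
    if parts.port is not None:
        host = "{}:{}".format(host, parts.port)
    clean_url = urlunsplit(parts._replace(netloc=host))
    commands = [
        ["poetry", "config", "repositories." + name, clean_url],
        ["poetry", "config", "http-basic." + name, parts.username, parts.password],
    ]
    return clean_url, commands
```

The returned `clean_url` is the value that would remain in pyproject.toml once the token has moved into Poetry's own configuration.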

Troubleshooting

Make sure your deploy token has read_package_registry scope and not just read_registry.

abn commented 2 years ago

While not exactly allowing what @xinbinhuang asked for, I reckon https://github.com/python-poetry/poetry/pull/5567 should serve this use case. Can folks needing this feature please have a go at using that PR and provide any feedback?

phihag commented 8 months ago

While not exactly allowing what the @xinbinhuang asked for, I reckon #5567 should serve this use case.

@abn No, certainly not. #5567 requires every user to log in. This may be an option for some teams, but it defeats the whole purpose of poetry if I can't just run commands to get all packages installed, but need to do some manual steps before.

What is being requested here is to have an option for private-but-not-secret repositories, where the alternative would be a git submodule or checking it into the main project. To be frank, I'm not sure what the problem is – wouldn't you only have to accept these URLs and pass them on?

github-actions[bot] commented 7 months ago

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.