dependabot / dependabot-core

🤖 Dependabot's core logic for creating update PRs.
https://docs.github.com/en/code-security/dependabot
MIT License
4.7k stars 1.02k forks source link

Standard Python support #3290

Closed flying-sheep closed 2 years ago

flying-sheep commented 3 years ago

There’s exactly two standard ways of specifying dependencies, none of which dependabot supports.

  1. PEP 517’s prepare_metadata_for_build_wheel().

    The legacy way to specify Python dependencies is requirements.txt. The standard way is having a PEP 517 build system configured. That build system can optionally define a function called prepare_metadata_for_build_wheel. If that function exists, the dependencies can be obtained (from the metadata) without building a wheel, otherwise build_wheel() needs to be called.

    Code to just do the above would look like:

    import build
    deps = build.ProjectBuilder("path/to/project").get_dependencies("sdist")
    # or
    deps = build.ProjectBuilder("path/to/project").get_dependencies("wheel") - {"wheel"}
  2. PEP 631 a new standardized way to specify dependencies in a Python project (now merged into PEP 621).

    It’s supported by flit and a bunch of other new build backends. So if you want to continue your (incomplete) way of just parsing files and not calling into Python, this needs to be supported.

flying-sheep commented 2 years ago

This should be pretty straightforward (if one knows ruby, which I don’t). You already have

  1. a format preserving TOML parser (for poetry support)
  2. a parser and serializer for the PEP 508 dependency specifier language (for requirements.txt support)

So you just have to read the TOML lists project.dependencies and project.optional-dependencies.* instead of the toml tables tool.poetry.dependencies and tool.poetry.extras.*, and then use the PEP 508 handling logic instead of the poetry version specifier logic

dreinon commented 2 years ago

I just got a dependabot PR of my project using pdm! Hope this has been finally accepted :slightly_smiling_face: EDIT: False alarm, it's a dependency from github actions :slightly_frowning_face: , too good to be true haha

drewcassidy commented 2 years ago

now that setuptools supports PEP621, It would be nice to see this

drewcassidy commented 2 years ago

So you just have to read the TOML lists project.dependencies and project.optional-dependencies.* instead of the toml tables tool.poetry.dependencies and tool.poetry.extras.*, and then use the PEP 508 handling logic instead of the poetry version specifier logic

I would say to read build-system.requirements too. some of those things may be pinned too (like a project using pybind11).

ofek commented 2 years ago

I would say to read build-system.requirements too

Minor correction: build-system.requires

jorgepiloto commented 2 years ago

I am surprised after finding that dependabot was able to parse a pyproject.toml using Poetry as the build-system, see https://github.com/pyansys-testing/dependabot-pyproject/pull/3/files

jorgepiloto commented 2 years ago

Both required and optional dependencies were parsed

jorgepiloto commented 2 years ago

However, when using Flit as the build-system, dependabot did not opened any PR.

vnmabus commented 2 years ago

It seems that this issue also causes projects moving towards pyproject.toml adoption to lose their list of dependents, unless the name of the project is also repeated in setup.py.

See https://github.community/t/bug-project-not-showing-network-dependents/3161 for examples.

jenstroeger commented 2 years ago

@jorgepiloto @vnmabus exactly my problem after moving to pyproject.toml and Flit.

To be honest, I am surprised that @dependabot doesn’t support such a fundamental and standardized process, and quite concerned that this issue isn’t being addressed. We’re actually now exploring alternatives, e.g. @renovatebot (link).

jurre commented 2 years ago

Hey, thanks for the pings on this! It's definitely on our radar and we want to support this, but I can't share a timeline as to when we'll be able to ship support for this across both Dependabot and the Dependency Graph.

ofek commented 2 years ago

This will fix https://github.com/community/community/discussions/6456 too. It's really affecting many now that everyone is trying to use static metadata standards e.g.:

jorgepiloto commented 2 years ago

All the logic for parsing Python version specification is implemented in dependabot/python/file_parser/.

It looks like most of the logic is already implemented. It is just a matter of combining the "pyproject.toml" file finder and the "setup.py" requirements parser.

I wish I could be more familiar with Ruby to address this. Maybe I could try to open a PR to start pushing on this issue.

tiangolo commented 2 years ago

Just to add my two cents, I've been holding some project packaging migrations and updates waiting for this.

It would greatly benefit my projects at least: FastAPI, Typer, SQLModel, Asyncer and others. 🤓🚀

Some bugs in the past in some of these libraries would have been prevented/detected by this.

And just in case, some of these packages are used internally by GitHub, so, maybe it could help prevent internal problems too. 😅

deivid-rodriguez commented 2 years ago

Hello @flying-sheep! Thanks for this ticket, it's very informative.

I've been having a look at this. While it's clear to me what would be needed to support PEP 621 (basically what you describe at https://github.com/dependabot/dependabot-core/issues/3290#issuecomment-996694772), I'm not sure what the expectations would be for Dependabot in order to support the other standard way of specifying dependencies that you mention in this issue. Can you point at an opensource project that follows that standard, and what kind of PRs would be expected for Dependabot to create?

Thanks!

flying-sheep commented 2 years ago

Every single Python project that uses the [build-system] table to specify a build backend uses PEP 517.

I think so far, Poetry is the only build backend that doesn’t also implement PEP 621, so your example would be any project using Poetry.

Basically, if PEP 517’s hooks emit all the info you need and you’re happy to call Python code, it would be the most generic, future proof way to do things. If you rather parse static files yourself and are happy to add potential future ways build tools choose to specify dependencies in, PEP 621 + Poetry + … is the way to go for you.

deivid-rodriguez commented 2 years ago

Every single Python project that uses the [build-system] table to specify a build backend uses PEP 517.

Can you give me an example project that uses PEP 517 but does not use poetry? Unfortunately I know nothing about Python, so seeing some examples would be useful for me.

deivid-rodriguez commented 2 years ago

I know these are probably basic questions for you, but.. Do projects following PEP 517 always specify their dependencies through a pyproject.toml file, or are there any other files used to declare dependencies? Also, is there any concept of lock file in the standard, or is poetry the only tool that has this concept?

flying-sheep commented 2 years ago

Can you give me an example project that uses PEP 517 but does not use poetry?

Probably not. I think all build backends that have been created since PEP 621 exists support PEP 621, and all earlier build backends except for Poetry have added PEP 621 support in the meantime. A new build backend would need a really good reason for not supporting PEP 621.

Do projects following PEP 517 always specify their dependencies through a pyproject.toml file, or are there any other files used to declare dependencies?

Setuptools can either use setup.py, setup.cfg, or PEP 621 (pyproject.toml). Flit, Hatch, … all support PEP 621 (pyproject.toml) but also support their own config files flit.ini, hatch.toml, … and as said Poetry uses pyproject.toml but not the PEP 621 table structure.

I think most projects use their build tool’s PEP 621 support for consistency and simplicity, (except for people using setuptools since that’s been around so long). If you want a catch-all solution you need to go the PEP 517 route.

deivid-rodriguez commented 2 years ago

Probably not. I think all build backends that have been created since PEP 621 exists support PEP 621, and all earlier build backends except for Poetry have added PEP 621 support in the meantime. A new build backend would need a really good reason for not supporting PEP 621.

Alright! So that sounds like supporting PEP 621 + Poetry should be enough for this, nice!

Setuptools can either use setup.py, setup.cfg, or PEP 621 (pyproject.toml). Flit, Hatch, … all support PEP 621 (pyproject.toml) but also support their own config files flit.ini, hatch.toml, … and as said Poetry uses pyproject.toml but not the PEP 621 table structure.

That's helpful, thanks!

flying-sheep commented 2 years ago

If you want to go the 99%* route using static files instead of the 100% route using PEP 517, the correct sequence to determine which build backend is used is this:

  1. If pyproject.toml does not exist or has no build-system table or build-system.build-backend is 'setuptools.build_meta', the build backend is setuptools and depdendencies are specified in setup.py, setup.cfg, or pyproject.toml (PEP 621)
  2. If build-system.build-backend is 'poetry.core.masonry.api', dependencies are specified in pyproject.toml (poetry’s custom table)
  3. Else dependencies are probably in pyproject.toml (PEP 621)

The sequence for PEP 517 would be much simpler:

  1. If pyproject.toml exists and has a build-system table:

    1. create a Python virtual environment with the packages in build-system.requires installed
    2. run this: deps = build.ProjectBuilder("path/to/project").get_dependencies("wheel") - {"wheel"} to get the deps
  2. Else we’re dealing with legacy setuptools which you already support.

*Maybe that 99% will decline rapidly in the future. The PEP 517 route would be robustly 100%

deivid-rodriguez commented 2 years ago

Thanks!

I did conduct some experiments with the "100% route" you suggested, although I have no idea how to plug it into our current architecture. I feel more confident with the 99% route although I understand is not as bullet proof, but it works better with our current architecture.

For what it's worth, this is a small log of my experiments:

>>> deps = build.ProjectBuilder(".").get_dependencies("wheel") - {"wheel"}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'ProjectBuilder' object has no attribute 'get_dependencies'. Did you mean: 'check_dependencies'?

Turns out get_dependencies was renamed a bit over a year ago: https://github.com/pypa/build/blob/main/CHANGELOG.rst#040-2021-05-23.

Then I tried the updated code with https://github.com/encode/httpx, and I get the following:

>>> import build
>>> deps = build.ProjectBuilder("/tmp/httpx").get_requires_for_build("wheel")
>>> deps
set()

I'm not sure what to do with that, any ideas?

ofek commented 2 years ago

I think just supporting runtime deps for now would be best since it's what most of us care about and implementing it would take a day for someone competent in Ruby

deivid-rodriguez commented 2 years ago

Thanks @ofek!

Yes, for now I'm just parsing project.dependencies from pyproject.toml similarly to what we do for Poetry (I will add support for optional-dependencies later), but I was curious about the other approach @flying-sheep was mentioning. I get now that that's for build dependencies, and I agree that most users care about runtime deps, so that should get the most priority!

flying-sheep commented 2 years ago

Ah, I should have added that. The hook prepare_metadata_for_build_wheel will write a static file with runtime dependencies in it, so you can use the approach for runtime dependencies as well.

pradyunsg commented 2 years ago

I'd be fine if dependabot starts by respecting only the static dependency declarations in project.dependencies. We can extend that to support the other stuff (build, from build-system.requires and dynamic ones from the build backend, via metadata generation as well as the other hook).

I'll note that https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/#pyproject-toml is a thing, and the metadata generation step is how you'd get the "entire" set of runtime dependencies. That matches the project.dependencies for nearly every project though, so it'd be preferable to start with just statically reading things and we can extend this later. :)

deivid-rodriguez commented 2 years ago

Thanks, this is starting to make sense! @pradyunsg those docs are nice!

Which static file the hook prepare_metadata_for_build_wheel writes to?

flying-sheep commented 2 years ago

prepare_metadata_for_build_wheel('<dir>') creates (amongst 1-2 other files) the file <dir>/<package>-<version>.dist-info/METADATA, which contains Requires-Dist: ... lines (see here for format).

ofek commented 2 years ago

Just FYI, while prepare_metadata_for_build_wheel would produce results that account for dynamic metadata and therefore would be more accurate, it is an optional API method.

ofek commented 2 years ago

Hatchling might be the only backend that doesn't, for reasons, but if this is what you'll use (I hope it is) then I'll implement it as this use case is far more compelling than the others I've encountered 🙂

ofek commented 2 years ago

Hatchling now supports prepare_metadata_for_build_wheel

@deivid-rodriguez Has there been any progress on any implementation?

deivid-rodriguez commented 2 years ago

We started working on it, so yeah, there's been some progress, but I can't tell yet when it will be ready.

ofek commented 2 years ago

Awesome! Is there anything we can do to help?

jeffwidman commented 2 years ago

👋 I want to clarify expectations for everyone watching this issue.

There are two places that currently don't support pyproject.toml:

  1. Version update PRs - that version parser lives here in dependabot-core and will be fixed when this issue is resolved.
  2. Dependency Graph (DG) - this is what generates the "used by" list of dependents in the UI. That is an entirely separate service / team, and they have their own version parsing code. So unfortunately resolving this issue will not provide DG support for pyproject.toml. I do know that supporting this is on that team's radar, so hopefully they'll be able to get to it soon.

PS: @deivid-rodriguez went on a 🚴 trip this week, I'm sure he'll pick this back up when he returns.

pradyunsg commented 2 years ago

Thanks for the clarification and I'll say that I certainly appreciate the regular updates and responses here.

Do you know if it would be possible for 2 to get a public issue and potentially get tracked in GitHub's public roadmap, like https://github.com/github/roadmap/issues/467?

ofek commented 2 years ago

That shipped today fyi https://github.com/github/roadmap/issues/467#issuecomment-1259801516

davidism commented 2 years ago

If GitHub only considers project.dependencies, it will miss out on development dependencies.

Currently Flask pins all its development dependencies in different requirements/*.txt files, like tests.txt and docs.txt. I'm switching over to PDM, which allows specifying development dependency groups (which are distinct from optional dependency groups, dev groups aren't exposed in the build) in pyproject.toml. PDM also produces a pdm.lock file with every transitive dependency name and version.

pyproject.toml:

[tool.pdm.dev-dependencies]
pytest = [
    "pytest",
    "blinker",
]
coverage = [
    "pytest-cov",
    "coverage[toml]",
]
mypy = [
    "mypy",
    "pytest",
    "sqlalchemy[mypy]",
]
sphinx = [
    "sphinx",
    "pallets-sphinx-themes",
    "sphinx-issues",
    "sphinxcontrib-log-cabinet",
]
pre-commit = [
    "pre-commit",
]
tox = [
    "tox",
    "tox-pdm",
]

pdm.lock:

[[package]]
name = "alabaster"
version = "0.7.12"
summary = "A configurable sidebar-enabled Sphinx theme"

[[package]]
name = "attrs"
version = "22.1.0"
requires_python = ">=3.5"
summary = "Classes Without Boilerplate"

[[package]]
name = "babel"
version = "2.10.3"
requires_python = ">=3.6"
summary = "Internationalization utilities"
dependencies = [
    "pytz>=2015.7",
]

# continues for every library installed in the env

At least detecting the projects listed in the tool.pdm.dev-dependencies section of pyproject.toml would be nice. It's the same format, lists of project names, as the standard project.optional-dependencies. Reading pdm.lock would allow building an even more accurate dependency graph.

ofek commented 2 years ago

The solution would be a PR for every tool (that is in widespread). I plan to open one for Hatch.

davidism commented 2 years ago

@frostming not sure if you're already following this, but this will be relevant for PDM.

davidism commented 2 years ago

@jeffwidman is there a public repo for the dependency graph like this one for dependabot?

ofek commented 2 years ago

@davidism The PRs would go here, dep graph would now have multiple Actions https://github.com/github/dependency-submission-toolkit#writing-your-own-dependency-submission-action

courtneycl commented 2 years ago

👋 Hi from the dependency graph team. We don't have a public repo, but we do have this work prioritised and I can update here once it's fixed.

pradyunsg commented 2 years ago

Thanks @courtneycl for chipping in! Looking forward to the improvements in the area of GitHub + modern Python Packaging stuff.

davidism commented 2 years ago

I've left some comments on the PR that's about to get merged. https://github.com/dependabot/dependabot-core/pull/5661 I'm worried that only supporting pins in pyproject.toml will encourage users to put pins there, which will be fine for applications but really bad if libraries pin like that. Would really like to see support for lock files, even though they're not standard yet, since there's already non-standard Poetry support.

deivid-rodriguez commented 2 years ago

We've shipped preliminary support for updating dependencies following PEP621 in pyproject.toml files with #5661.

Of course there may be bugs and missing features, but I'm closing this main issue now and will track any future enhancements as separate issues.

Thanks!

zhan9san commented 2 years ago

I've left some comments on the PR that's about to get merged. https://github.com/dependabot/dependabot-core/pull/5661 I'm worried that only supporting pins in pyproject.toml will encourage users to put pins there, which will be fine for applications but really bad if libraries pin like that. Would really like to see support for lock files, even though they're not standard yet, since there's already non-standard Poetry support.

@davidism Thanks for your time Please allow me to ask a stupid question.

Why it is bad if libraries pin like that? Why it's better to support for lock files?

ofek commented 2 years ago

@zhan9san https://iscinumpy.dev/post/bound-version-constraints/

But there's no cause for concern:

[...] we not only support full pins, but also other perfectly valid ways of restricting dependencies. Just in case there's some misunderstanding there.

tiangolo commented 1 year ago

I just came back here to say THANK YOU @deivid-rodriguez and everyone involved! :hugs:

I'm now getting Dependabot upgrades in my projects with pyproject.toml including FastAPI and others :tada:

deivid-rodriguez commented 1 year ago

Thank you for the kind words @tiangolo, we're glad this is turning out useful for your projects 🎉.

annarosenthal commented 1 year ago

👋 Hi from the dependency graph team. We don't have a public repo, but we do have this work prioritised and I can update here once it's fixed.

Despite the issue being closed, still wanted to share that the changed needed from the Dependency Graph side has been shipped.