scikit-build / ninja-python-distributions

This project provides the infrastructure to build Ninja Python wheels.

Avoid downloading ninja source #127

Open - FRidh opened this issue 2 years ago

FRidh commented 2 years ago

I'm trying to package this project, as it is a dependency of meson-python. ninja-python-distributions tries to download the ninja source from the web. How can we avoid that and have it use a local version instead?

eli-schwartz commented 2 years ago

If you know that you have a ninja installed globally as a C++ command-line utility, maybe you can just package dummy metadata for it to satisfy meson-python?

Alternatively patch meson-python to not depend on ninja if you know it's already satisfiable...

FRidh commented 2 years ago

If you know that you have a ninja installed globally as a C++ command-line utility, maybe you can just package dummy metadata for it to satisfy meson-python?

I was hoping there was a way to generate the metadata or let this package use an existing build.

Alternatively patch meson-python to not depend on ninja if you know it's already satisfiable...

That's what I've done for now. I doubt there will be many other packages that will use it aside from meson-python.

eli-schwartz commented 2 years ago

The meson[ninja] extra does depend on this ninja package, but as it's only an extra that doesn't really matter. In theory projects could depend on that extra, but the real reason I merged that extra into meson's own metadata is because the PR author pointed out that pipx run meson[ninja] works (and you can't do pipx run meson ninja, of course).

pipx run isn't really relevant to distro packaging, obviously.

henryiii commented 2 years ago

FYI, see https://github.com/scikit-build/scikit-build/pull/731 - A build backend can specify dynamic dependencies; scikit-build-core will probably do this to only add cmake and/or ninja if they are not present on the system and of a high enough version. meson-python could do the same thing (but not meson, as it's part of the PEP 517 interface, and it will be important to add to [[extensions]] too, CC @ofek). FYI, this comes up because there are platforms where wheels can't be shipped, like the BSDs, Android, WebAssembly, etc., but those systems may have cmake and ninja (or just ninja for meson's case) installed already via some other package manager.

mofeing commented 1 year ago

FYI, this comes up because there are platforms where wheels can't be shipped, like the BSDs, Android, WebAssembly, etc., but those systems may have cmake and ninja (or just ninja for meson's case) installed already via some other package manager.

Any news on this? We are trying to install scipy in an offline supercomputer with POWER9 architecture and we are running into problems because of this.

henryiii commented 1 year ago

Mesonpy should list ninja via the get_requires_for_build_wheel hook and only include it if it's not already installed. CC @FFY00.

https://github.com/FFY00/meson-python/blob/1a4882e9f6d973934185a0f5fec052750983b960/mesonpy/__init__.py#L965

Should have something like this:

        ninja = shutil.which("ninja")
        # (could check version here)
        if ninja is None:
            packages.append("ninja")

(Also assuming it can get ninja via shutil.which("ninja") instead of only using import ninja). And ninja then wouldn't be a hard dependency.
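Spelled out as a complete hook, the idea is roughly this (a minimal sketch, not mesonpy's actual code; the version check is left out):

    import shutil

    def get_requires_for_build_wheel(config_settings=None):
        # Only request the PyPI ninja wheel when no usable ninja
        # is already available on PATH.
        packages = []
        if shutil.which("ninja") is None:
            packages.append("ninja")
        return packages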

FRidh commented 1 year ago

Mesonpy should list ninja via the get_requires_for_build_wheel hook and only include it if it's not already installed. CC @FFY00.

https://github.com/FFY00/meson-python/blob/1a4882e9f6d973934185a0f5fec052750983b960/mesonpy/__init__.py#L965

Should have something like this:

        ninja = shutil.which("ninja")
        # (could check version here)
        if ninja is None:
            packages.append("ninja")

(Also assuming it can get ninja via shutil.which("ninja") instead of only using import ninja). And ninja then wouldn't be a hard dependency.

I disagree with this approach. Listing requirements should not be impure, that is, depend on the actual build machine. These requirements are actual requirements, it is just that they can get fulfilled in different ways.

In my opinion the right approach here is to have a build system flag for this ninja package such as USE_SYSTEM_NINJA. And I also think it should be the default, since build systems fetching from the web is trouble.

If you know that you have a ninja installed globally as a C++ command-line utility, maybe you can just package dummy metadata for it to satisfy meson-python?

Exactly, that is what I think this package should provide.

henryiii commented 1 year ago

I disagree with this approach. Listing requirements should not be impure, that is, depend on the actual build machine.

The actual requirement is not a python package. It's the command line tool "ninja". That's why there's a complaint that "it's already installed". The Python package is just one way to get it.

If you know that you have a ninja installed globally as a C++ command-line utility, maybe you can just package dummy metadata for it to satisfy meson-python?

No. If someone installs this version, they should get this version. It is up to the build tool to specify what it needs (via get_requires_for_build_wheel), this package should always install exactly the version it claims to install. If pip install ninja==<this_version> triggers an SDist and that SDist just installs a dummy version that doesn't really install this version, that's a disaster.

The build tool can decide if it wants to allow configuration - it can, via this dynamic hook, support things like USE_SYSTEM_NINJA or tool.x.min_ninja_version, etc. - but the sdist of this package should not.

henryiii commented 1 year ago

A compiled build system in python is never pure. You’re calling a compiler with ninja. We don’t pip install GCC or Clang.

Your proposed dummy package would not be pure, either; it would depend on the system-installed version of ninja. Only now, instead of being controlled by the build system, which should know exactly what it requires, it's just happening as part of installing an SDist. Two similar systems should produce the same wheel!

Accessing the Internet during the build process (by default) is probably something we should fix. I'm also OK to have opt-in workarounds for some things if needed - though I'd say this one isn't needed: mesonpy should declare exactly what it needs - it should check whether there's an installed ninja and verify that its version is adequate; if it is, it should not add the dependency. Or it might check whether it's on a system that has wheels, and always add it if so. Etc.
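A version check along those lines might look something like this (a sketch only; the 1.8.2 minimum is a placeholder for whatever the build tool actually requires):

    import re
    import shutil
    import subprocess

    def system_ninja_is_adequate(minimum=(1, 8, 2)):
        # Look for a ninja executable and parse `ninja --version`,
        # which prints e.g. "1.11.1" or "1.11.1.git.kitware.jobserver-1".
        ninja = shutil.which("ninja")
        if ninja is None:
            return False
        try:
            out = subprocess.run([ninja, "--version"], capture_output=True,
                                 text=True, check=True).stdout
        except (OSError, subprocess.CalledProcessError):
            return False
        match = re.match(r"(\d+)\.(\d+)\.(\d+)", out)
        return bool(match) and tuple(map(int, match.groups())) >= minimum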

eli-schwartz commented 1 year ago

No. If someone installs this version, they should get this version. It is up to the build tool to specify what it needs (via get_requires_for_build_wheel), this package should always install exactly the version it claims to install. If pip install ninja==<this_version> triggers an SDist and that SDist just installs a dummy version that doesn't really install this version, that's a disaster.

My apologies, I'm having an exceedingly hard time parsing this paragraph.

I proposed that @FRidh could solve distro integration concerns by performing third-party patching upon this repository and installing dummy "ninja" metadata for python via an OS package manager that uses the OS package manager metadata to record a dependency on an OS package manager version of ninja.

Can you clarify how one might pip install to trigger an SDist, whereby that SDist comes from the .dist-info subdirectory of the current environment's global site-packages? My naive understanding of python packaging was that SDist ("source dist") per definition comes from, well, sources. Not installed binary artifacts.

I would also be interested to know, in the context of "this package should always install exactly the version it claims to install", what possible errors you are afraid might occur if someone tries to pip install ninja and nothing happens because it is already recorded as installed, but users get the "wrong" OS ninja instead of the one lovingly compiled and authoritatively codesigned for authenticity under the aegis of the ninja-python-distributions project. Perhaps there are people who are concerned that the OS distribution contains shady malware and they only trust the ninja-python-distributions developer team?

henryiii commented 1 year ago

I proposed that @FRidh could solve distro integration concerns by performing third-party patching upon this repository

I think that was what the issue was about originally, though I read things like:

In my opinion the right approach here is to have a build system flag for this ninja package such as USE_SYSTEM_NINJA. And I also think it should be the default, since build systems fetching from the web is trouble.

This is suggesting that pip install ninja would revert to a dummy wrapper if ninja was already present. There are multiple problems with this. For one, this means that pip install ninja==1.10.2.4 would not install 1.10.2.4, but would simply do nothing if ninja 1.9 was already installed[^1]. Also, a valid reason to pip install ninja is that we provide a fork of ninja that adds a feature, jobserver support, that is missing from the ninja you likely already have installed. You might be trying to get that, and instead you get the system ninja. But mostly it's the version issue - you should be able to specify a version, and you can't specify a version if you are just taking whatever is installed on your system.

[^1]: A package can't tell if you pinned it, requested a range, or just pip install ninja'd it. "lovingly compiled and authoritatively codesigned for authenticity under the aegis" is a bit overkill, don't you think?

I'm not against a distributor providing the "ninja" package by third party patching of this repo. In fact, I can't be, as I'm not a third party. :) Though having talked with conda-forge (I'm a ninja maintainer there too, actually - maybe I'm also a third party in that case), I agree that conda install ninja should not install the ninja python package, because the 'ninja' formula doesn't and shouldn't require Python. It's better to have a python-ninja package, for example (though I can't think of a case where it's useful, see below). If there's a way for me to make that third party patching easier, I'm happy to help, though I'd expect it to be the same ninja version we provide. I'd love to have jobserver support merged or to have this package split into ninja and ninja-jobserver or something, since I can't update to 1.11 because of the fork being behind.

Distributions don't/shouldn't be using isolated builds anyway (for one, isolated builds use the internet during install; for another, they ignore distribution packaging), so this doesn't really matter for distributions, though - just don't pip install ninja.


But there's an underlying problem that's actually easily fixable with meson-python. In that case, the exact version of ninja and the jobserver support aren't a problem at all. In fact, if ninja's already on the system, you can save a tiny bit of time & storage by simply checking for the presence of a valid version of ninja, and only adding ninja to the get_requires_for_build_wheel hook when it's missing.

I've thought about this extensively for the last couple of months, and have come to the conclusion that it's the right path for common command-line tools needed for building. And later I found the completely unrelated example https://github.com/tttapa/py-build-cmake/blob/6e583bb7f6813794abefb1ef990fb8d20e47bd8f/src/py_build_cmake/build.py#L39, so it's not just me. Command-line build requirements should be declared in the build hooks.


Can you clarify how one might pip install to trigger an SDist, whereby that SDist comes from the .dist-info subdirectory of the current environment's global site-packages? My naive understanding of python packaging was that SDist ("source dist") per definition comes from, well, sources. Not installed binary artifacts.

I'm not sure what this refers to. pip install package or pipx run build package can trigger one of two paths if only an SDist of package exists. The "normal" one is an isolated build. This will make a virtual environment, install everything in pyproject.toml: build-system.requires, and call the hooks of pyproject.toml: build-system.backend - which includes installing dependencies the build system requests via get_requires_for_build_wheel or get_requires_for_build_sdist (this is how setuptools only requires wheel for building wheels) - then build the sdist or wheel inside this temporary environment. Those dependencies are usually available in wheel form. For packages without universal wheels (like ninja), on a platform with no matching wheel the tool will download the ninja SDist and try to build it - that's what I've been discussing.

The other path is if you deactivate isolated installs, in which case, it simply tries to build in the existing environment and both the build-system.requires and the hooks are ignored. You can install or do whatever you want manually yourself before starting the build.

It should be noted, in the default "isolated" build case, adding a ninja "python package" to the host environment will not fix the problem at all, since it's building an isolated environment and using pip, not your package manager. So I'd worry you might not actually be fixing the problem I think you are trying to fix with this dummy package approach.
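To make the isolated path concrete, the front-end side of that hook exchange can be reproduced with the pyproject-hooks package (a sketch; it assumes mesonpy is importable in the current environment rather than installed into a fresh isolated one):

    from pyproject_hooks import BuildBackendHookCaller

    # Point the caller at a source tree whose pyproject.toml declares
    # build-backend = 'mesonpy'.
    caller = BuildBackendHookCaller(".", build_backend="mesonpy")

    # This is the hook a front-end like pip calls after installing
    # build-system.requires; the backend may answer e.g. ["ninja"]
    # only when no adequate system ninja is found.
    print("backend requested:", caller.get_requires_for_build_wheel())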

henryiii commented 1 year ago

Do you know the minimum version of ninja required for meson? I can make a PR to meson-python that fixes this to show what I mean.

mofeing commented 1 year ago

@henryiii According to meson's Release Notes, from version 0.54 it requires ninja 1.7 at least.

henryiii commented 1 year ago

Thanks! Turns out it's 1.8.2, nicely listed in the Meson error message. :) (though maybe that's a more recent version - guessing this should track whatever's required in the latest meson - that should be good enough I think)

PR's here https://github.com/FFY00/meson-python/pull/175

eli-schwartz commented 1 year ago

This is suggestion that pip install ninja would revert to a dummy wrapper if ninja was already present.

Ah, I overlooked that -- but it also has nothing to do with the quote of mine that you replied to. :)

There are multiple problems with this. For one, this means that pip install ninja==1.10.2.4 would not install 1.10.2.4, but would simply do nothing if ninja 1.9 was already installed

Well, I would argue that in such a case, the dummy python metadata should be created with a version matching the ninja version that it depends on via an OS package manager.

I'm ambivalent about whether installing the "ninja" sdist should detect a system ninja and declare that version but not install it, though I do agree that it would be convenient if the sdist could be built for a distro without patching, only by passing an option. It doesn't really concern me, as I'd always just pass the option, personally.

Also, a valid reason to pip install ninja is that we provide a fork of ninja that adds a feature, jobserver support, that is missing from ninja you likely already have installed. You might be trying to get that, and instead you get the system ninja.

I don't really like this, never have, and prefer that such things be done under the aegis of a forked name. Heck, samurai exists along similar lines -- it implements a subset of ninja's tools, but also goes in the other direction by supporting a MAKEFLAGS-alike (SAMUFLAGS), supporting transparent forwarding of color (even when the output is not a console, but the compiler has been told to force color on), and is explicitly open to adding jobserver client support although the work hasn't actually landed.

It's valuable to people to know when they are actually running samurai, so, samurai has its own name.

Distributions don't/shouldn't be using isolated builds anyway (for one, isolated builds use internet during install, for a second, they ignore distribution packaging), so this doesn't really matter for distributions, though - just don't pip install ninja.

and

The other path is if you deactivate isolated installs, in which case, it simply tries to build in the existing environment and both the build-system.requires and the hooks are ignored. You can install or do whatever you want manually yourself before starting the build.

Isolated builds aren't actually the issue here. Running a non-isolated build with network disabled will often still verify that build-system.requires are installed, but error out instead of attempting to download them if they cannot be met. So if projects build-depend on ninja, @FRidh likely still gets errors trying to build those projects, even when using a non-isolated build. It's just that the error is "build dependency not available" rather than "error installing XXX: network unavailable".

For example, python -m build --no-isolation --skip-dependency-check involves two different flags, and it's reasonable that people might want to use only the first flag, so that the build immediately fails if there are other missing build dependencies, rather than waiting until halfway through the build and possibly throwing unusual tracebacks or compile errors.

In fact, if ninja's already on the system, you can save a tiny bit of time & storage by simply checking for the presence of a valid version of ninja, and only adding ninja to the get_requires_for_build_wheel hook.

I agree that that's probably the best solution, but in the interest of discussing how to work around projects that don't do that (which may include projects using ninja-generating build systems other than meson and cmake, as well as projects that hand-roll their own cmake/meson bindings by running subprocess.run() in custom setuptools build steps, then copy files around :frowning_face:) it might still be interesting to generate dummy metadata...

I'm not sure what this refers to.

It referred to my assumption that the only sdist which builds dummy metadata is one that exists in the working directory of a distro package builder! I was still somehow under the impression that the relevant context was just my post where I suggested that @FRidh could patch this sdist to do so, meaning that pip install cannot ever replicate that. The only thing that would publicly exist is an installed wheel.

PR's here FFY00/meson-python#175

Very nice! :+1:

henryiii commented 1 year ago

I'm ambivalent about whether installing the "ninja" sdist should detect a system ninja and declare that version but not install it,

There's no way for the ninja 1.10.2.4 SDist to detect how it was specified; it cannot tell if it was pinned with a hash, or just installed as pip install ninja, or pip wheel ninja for that matter. So it would need to error out unless exactly 1.10.2 was installed (the final number is our build number), and at the moment, the jobserver fork of 1.10.2. It should not dynamically turn itself into a different version of itself.

I'm okay to support preparing a distribution version of this (and cmake, too, see https://github.com/scikit-build/cmake-python-distributions/issues/227 - just no one has had time to work on this).

For example, python -m build --no-isolation --skip-dependency-check is two different flags, and it's reasonable that people might want to only use the first flag, so that the build immediately fails if there are other missing build dependencies, rather than waiting until halfway through the build and possibly throwing unusual tracebacks or compile errors.

It's a historical oddity that build has an opt-out flag and pip has an opt-in flag. But distros need to opt out; that's exactly why this opt-out exists for build (and why it was not possible to change the default for pip - it broke distro packaging and conda-forge).

As I've said, making your "ninja" package depend on "python" simply so you can install a dummy package into "site-packages" is something no distro I'm aware of will do - most users installing ninja do not want Python as a dependency. So then you'd need a special ninja-python package, in which case you could either provide the custom copy of Ninja too, or you could just add the ignore flag (which is exactly what everyone does today, including conda-forge). And, as I've pointed out, this doesn't fix isolated builds at all, so the only thing you buy is being able to run this check - which is opt-in for pip, which is what most distros use, and they haven't opted in.

FFY00 commented 1 year ago

I was going to reply to this earlier, but got a bit overwhelmed and did not manage to finish my reply.

I agree with Henry, checking if ninja is available on the system and skipping asking for that dependency at runtime is the best approach. The issue here stems from ninja being a native dependency, which cannot be properly declared in Python metadata. The ninja Python package is just a hack to be able to provide this native dependency via the Python packaging ecosystem, but we do not need the ninja package itself, just a ninja binary.

We currently have a hard dependency on the ninja package (https://github.com/FFY00/meson-python/blob/d18930f11b1a08a718905d5263eb9e32c4eb5a1d/pyproject.toml#L30). I thought this was the best way, since people packaging outside the Python ecosystem can safely ignore it and depend on their native ninja package, but if that is causing issues, we can replace it with a comment noting that ninja is a native dependency and ask for it at runtime if ninja is missing.

@FRidh I'd like to hear exactly how this is affecting you as a downstream, so I can better understand your issue and try to find an alternative that works better for everyone.

band-a-prend commented 1 year ago

I have the same issue.

I tried to package the ninja Python wrapper to test SCons' experimental ninja support. The main problem is that after the build it wants to replace the system-installed ninja binary under /usr/bin with the provided Python wrapper file, and adds its own ninja binary under another path.

It's OK for a PyPI-distributed package installed via pip to ship the ninja binary into the user's environment.

But for a system-wide installation of the ninja Python wrapper it's better to rely on the system ninja as a runtime dependency. There is no problem specifying an exact or minimal system ninja version as a dependency when packaging.

henryiii commented 1 year ago

Then don't install the ninja package system-wide. pip install ninja==1.11.0 should install 1.11.0 (with the added jobserver support), not do nothing if some ninja (especially one not exactly equal to 1.11.0) is already installed. And you can't do that with the wheel format - there's no logic you can inject during the install process. The same is true for cmake, clang-format, and every other package of a binary that might be obtainable some other way.

The correct solution is to use PEP 517 requirements to decide if ninja is required during a build and only add it if a sufficient version is not present. scikit-build-core and meson-python do this. For things other than building other Python packages, it's harder to do, but the same basic idea should apply - only add the dependency if a system dependency is not present. And generally don't install packages with pip system wide, use pipx or virtualenvs.

ppfeister commented 1 month ago

Does this issue relate at all to the download of Kitware/ninja tarballs?

I'm working to package this for distribution (as a required dependency of another project), and it's serving as a blocker.

On build, it tries to fetch https://github.com/Kitware/ninja/archive/v1.11.1.g95dee.kitware.jobserver-1.tar.gz

If this is unrelated, I can raise a separate issue. If it is related, any suggestions from anyone who knows this project well?

eli-schwartz commented 1 month ago

I'm working to package this for distribution on Fedora (as a required dependency of another project), and it's serving as a blocker.

Can you clarify why this is a required dependency of another project? It should not be.

ppfeister commented 1 month ago

I thought it was odd myself.

The dependency chain is as follows...

spaCy -> thinc -> ml-datasets -> scipy, which has a test dependency on ninja (this project)

'required' was not the right word to use here, as it's a test dependency, but since tests are almost always required to be run during builds when the upstream makes them available, it's required in the current context.

eli-schwartz commented 1 month ago

scipy, which has a test dependency on ninja (this project)

Incorrect, it depends on https://ninja-build.org/ which is not this project at all.

'required' was not the right word to use here as it's a test dependency, but as tests are almost always required to be ran on builds when they are made available by the upstream, it's required in the current context.

It uses it at build time, not test time?

ppfeister commented 1 month ago

Incorrect, it depends on https://ninja-build.org/ which is not this project at all.

Scipy's pyproject.toml lists ninja as a dependency on line 88. The PyPI project for ninja links to this repository. The version number of the PyPI distribution matches this repository (1.11.1.1 to 1.11.1.1; this version doesn't exist on the linked project), and the tarball found on the PyPI distribution matches the files found in this repository. None of the files in that tarball are found in the repository for the project you linked. Countless issues and pull requests on this repository reference this PyPI distribution, including in this same thread. Nothing that I've seen indicates anything other than this repository, but I'm open to correction if I've missed some detail. It's very possible.

It uses it at build time, not test time?

Scipy's pyproject.toml lists ninja as a member of the dependency group test. In either case, though, if it is used at build time despite being in group test (which is technically possible), that would just make this package even more of a hard requirement.

Edit: Seems that it requires ninja at BOTH test and build time, via generated requires files. It still pulls from the PyPI distribution, however.

eli-schwartz commented 1 month ago

Scipy's pyproject.toml lists ninja as a member of the dependency group test. In either case, though, if it is used at build time despite being in group test (which is technically possible), that would just make this package even more of a hard requirement.

This is... not how the python ecosystem works.

The dependency group "test" does not exist for any purpose other than to serve as a developer convenience, whereby one can install .[test] and pull in "the kitchen sink" of developer tooling. This applies broadly to all python software, but for scipy specifically, they use pyproject.toml "optional dependency groups" as machine input for generating a directory of requirements.txt files that are then discussed in their developer docs under "Building from source for SciPy development".

They do this because they want to make it extremely easy to onboard people trying to hack on the code, and installing dozens of linting tools is better than having them be confused when their PRs fail a CI linter.

You'll notice it also installs pytest-cov. I promise you, you are not required to generate coverage reports of scipy in order to run its testsuite.
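(The generation step itself is simple enough. A minimal sketch - with a hypothetical requirements/ output directory, not SciPy's actual generator - would be:)

    import tomllib  # Python 3.11+
    from pathlib import Path

    with open("pyproject.toml", "rb") as f:
        pyproject = tomllib.load(f)

    # Emit one pip-installable requirements file per optional
    # dependency group ("test", "dev", ...).
    outdir = Path("requirements")
    outdir.mkdir(exist_ok=True)
    for group, deps in pyproject["project"]["optional-dependencies"].items():
        (outdir / f"{group}.txt").write_text("\n".join(deps) + "\n")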

ppfeister commented 1 month ago

Yes, I already conceded in my edit, after reviewing the pyproject.toml, that this project uses generated files. It's possible that the edit was made after you started replying, however, in which case you probably wouldn't have seen it.

This is... not how the python ecosystem works.

ehm..it kinda does

using the pyproject file to generate extraneous requirements files isn't the standard and expected use of the pyproject file. that's one project wanting to do things differently. and yes, it's for developer convenience. but there's a reason why a group would be called test, just like there's a reason why a group would be called dev. Including devel depends in the test group is also against "python ecosystem" norms.

neither of the above changes the fact that this repository is in fact the dependency in question and not the project that you linked above. everything else is just bikeshedding and pedantry

The question still stands for those that are reading.

eli-schwartz commented 1 month ago

using the pyproject file to generate extraneous requirements files isn't the standard and expected use of the pyproject file. that's one project wanting to do things differently. and yes, it's for developer convenience. but there's a reason why a group would be called test, just like there's a reason why a group would be called dev. Including devel depends in the test group is also against "python ecosystem" norms.

No, it quite literally is the ecosystem norm for the "test" optional dependency group to contain items extraneous to requirements, which exist solely to ensure the convenience of project developers. Generating extraneous requirements files isn't necessarily the standard, but having completely unnecessary items in pip install -e ".[test]", because it is a developer shorthand for initializing a comfortable developer workflow, is very standard.

The python ecosystem has never had a real standard for running tests at all. You cannot even find out what command to run to execute a testsuite without consulting documentation and/or checking the contents of tox/nox/github-workflows configuration files. There is certainly no standard for what the dependencies are. And people frequently install linter plugins in CI, and enable them by default in test harness config files (mainly pytest) so you have to sed those out in order to reduce the fragility of testsuite dependencies and stop checking whether the codebase is compliant with the latest of many black standards, or whether the bugfix patch you applied maintains consistent 100% branch coverage even though you don't aggregate branch coverage for 9 different python versions and 5 different operating systems and therefore it's meaningless anyway.

neither of the above changes the fact that this repository is in fact the dependency in question and not the project that you linked above. everything else is just bikeshedding and pedantry

The question still stands for those that are reading.

I am saying you do not need the dependency in question. It is not bikeshedding, nor pedantry.

In fact, the testsuite doesn't even run ninja at all AFAIK -- it is only installed by pip install -e ".[test]" because the scipy developers expect that people running the testsuite will frequently iterate and every time they do, scipy needs to rerun ninja to ensure that all the C/C++/Fortran code is up to date. And with an editable install you are NOT guaranteed to have ninja installed, even if it was installed at the time of compiling the wheel, because it might have been detected as unavailable and pip MIGHT have fallen back to downloading a copy of ninja from the internet (via PyPI), and if that happens then build isolation will result in ninja being deleted afterwards, which is confusing to contributors and therefore it is best for contributors to simply have python-ninja installed just in case.

ppfeister commented 1 month ago

I am saying you do not need the dependency in question.

Maybe not to build outright, but in order to meet standards for certain packaging guidelines, you need to at least make an effort to have them available. You could just as easily skip every single not-absolutely-required dependency and have a successful build, but then dependents that rely on these optionals can be unexpectedly encumbered or broken.

Your notes about the C++ build system as a source will be investigated, as I'm following what you're saying much more closely now. That is not how it read originally; rather, it read as a correction that the PyPI dependency wasn't this repository.

On everything else, I think a lot of it is just us getting wires crossed, because I'm with you otherwise, particularly after the added detail. A very long few days over here, about to call it for the night, so it's probably on my end tbh

eli-schwartz commented 1 month ago

scipy uses, as its build system, meson (driven by the meson-python build backend). This is listed in pyproject.toml as:

build-backend = 'mesonpy'
requires = [
    "meson-python>=0.15.0",
]

meson-python in turn depends on meson (I'm an upstream maintainer of this); meson depends on ninja - the command-line tool from https://ninja-build.org/.

This project is a PyPI repackaging of https://packages.debian.org/sid/ninja-build for people that cannot apt install ninja-build and have to pip install ninja. That is all.

There's a lot of software you can get from pip.

There are many things one can get from PyPI or from dpkg, but as a distro packager you only want to depend on the version from the distro package manager.

Maybe not to build outright, but in order to meet standards for certain packaging guidelines, you need to at least make an effort to have them available. You could just as easily skip every single not-absolutely-required dependency and have a successful build, but then dependents that rely on these optionals can be unexpectedly encumbered or broken.

I, too, am a distro package maintainer. I have been down this road before and decided that it is not worth trying to get dependencies from PyPI when they are the same dependencies already available in the distro. The dependencies are there either way, so this isn't about standards.

ninja is available (and is what scipy needs), regardless of whether pip show ninja reports a pip package manager entry for ninja.

If pyproject.toml files list the PyPI version as a mandatory dependency anyway, then it's better to simply patch the software to use the system copy. Every distro has rules about bundled code copies and this is pretty similar.

The PyPI redistribution still provides functionality for Windows and macOS users, as well as Linux users of workstations who don't have root access and need to quickly build their data science project without asking management for packages to be installed. We don't need that for ourselves.

rgommers commented 1 month ago

This project is a PyPI repackaging of https://packages.debian.org/sid/ninja-build for people that cannot apt install ninja-build and have to pip install ninja. That is all.

Yes, this is the essence of it. No distro packager should repackage this repo ever - you already have a ninja package in your distro, use that instead.

Fedora has some tooling to parse dependencies from pyproject.toml. That tooling probably needs a fix so that if it sees ninja, it should translate it to the Fedora package name for ninja (something like python3-ninja -> ninja).


Some context for the SciPy question: we recently added Cython, meson and ninja as .[test] dependencies (see scipy#20422) because in the latest release (1.14.0) we added tests that build new extension modules for the first time. The tests are skipped if cython isn't found, but if it is found then meson/ninja are also required. And since we cannot express a dependency on "system ninja" until PEP 725 lands, we therefore added "ninja from PyPI" as an optional test dependency.

Hence, if you're adding a SciPy build recipe to a distro and you want to express test dependencies, depend on the distro's ninja package there.

ppfeister commented 1 month ago

Having revisited this a day later, I'm with y'all. mb. What you're saying about the ninja vs. ninja situation is 100% correct on all counts.

After a fairly long week I needed a bit of a reset, I guess. Things weren't adding up.

Probably going to take the weekend off and go fishing, come back fresh

LecrisUT commented 1 month ago

I have been thinking about this lately w.r.t. building and packaging a pure CMake project. I will come around to how this relates to ninja at the end, but first consider this setup.

These packages are set up so that dependent projects can use them.

The key point here is having a python3-cmake-extra-utils package with the Python metadata, so that it can satisfy build-system.requires, while redirecting the files as needed to the real cmake-extra-utils package files.


So how does this fit in with ninja or cmake? Basically, it is possible to create python3-ninja and python3-cmake wrapper packages within Fedora, patching up the implementation to make sure the system files are used. It is preferable not to include the dependencies within build-system.requires at all, but various projects use their own setup.py implementation, and it would be easier for the packagers to have a way to support that. To achieve this, I basically instruct scikit-build-core to not run cmake at all, and just install the (patched) python files:

%build
# Do not actually build CMake
%{pyproject_wheel -C wheel.cmake=false}

In order to support version parity, there are a few points to consider here.

This may not apply to all packaging environments, but in cases like RPM-based environments that have tight integration with Python (e.g. through %pyproject_buildrequires, which automates tracking of dependencies), we could investigate more about how, and when not, to support any of these designs. This would also be informative for new projects as to how they can package for both system packages and PyPI to best satisfy both environments' packaging guidelines.

eli-schwartz commented 1 month ago

There's an additional complication for ninja, that @henryiii has previously raised as a concern. The pypi package contains different patched functionality not present in the official ninja releases, and it's argued that projects depending on the pypi "ninja" package should be able to assume those patches are there as it's part of the pypi package's documented functionality.

The argument is that projects which don't care which ninja they get should check for the C++ program and projects which do care, shouldn't be tricked into thinking they have the patched edition when they don't -- so, packaging "just metadata" would then trick them.

LecrisUT commented 1 month ago

There's an additional complication for ninja, that @henryiii has previously raised as a concern. The pypi package contains different patched functionality not present in the official ninja releases, and it's argued that projects depending on the pypi "ninja" package should be able to assume those patches are there as it's part of the pypi package's documented functionality.

Fair point - could you point to some of those so I can check the nature of the patches? I couldn't see anything in the current master branch using FetchContent_Declare.

In this case I would say it is an issue for the packagers, because similarly they would also have patched versions on their side for relevant CVEs and more. If we limit the usage of python3-ninja to package-building purposes only, i.e. the user would not be using python3 -m venv --system-site-packages, then packagers are already responsible for making sure a project foo is compatible with whatever state the distro is in.

eli-schwartz commented 1 month ago

Fair point - could you point to some of those so I can check the nature of the patches? I couldn't see anything in the current master branch using FetchContent_Declare.

It's right there? https://github.com/scikit-build/ninja-python-distributions/blob/6694535d0d435e699189c8cd3c01b51d51212747/CMakeLists.txt#L50

It downloads a fork, not a series of patch files.

In this case I would say it is an issue for the packagers, because similarly they would also have patched versions on their side for relevant CVEs and more.

Patching a CVE isn't a new feature. People don't usually worry about bug-for-bug compatibility.

henryiii commented 1 month ago

The missing jobserver functionality looks like it's getting added to Ninja in https://github.com/ninja-build/ninja/pull/2474, but currently for Unix only. So the fork likely will live on for a while longer until Windows support is also added.

I feel like the whole point of this (to make adding PyPI wrappers to build-system.requires pass metadata checks for third-party builds) is wrong, though? Third-party packagers should not package another third-party packager's package, but should package the original instead, right? That's like Fedora repackaging a Debian package, or conda providing a homebrew recipe.

Ninja is a bit of a special case due to the fact it's using a fork, but a) if you want to provide the fork, you still don't need to provide the Python files, and b) the fork should eventually go away once jobserver support is added upstream and supports Windows.

jcfr commented 1 month ago

So the fork likely will live on for a while longer until Windows support is also added

I confirm that we will keep the fork until job server support is integrated upstream for both Linux and Windows.

LecrisUT commented 1 month ago

The missing jobserver functionality looks like it's getting added to Ninja in https://github.com/ninja-build/ninja/pull/2474, but currently for Unix only. So the fork likely will live on for a while longer until Windows support is also added.

Hmm, iiuc the python files do not make use of this feature, right? In which case, even if we patch the dependencies, the build itself would not succeed.

I feel like the whole point of this (to make adding PyPI wrappers to build-system.requires pass metadata checks for third-party builds) is wrong, though?

Wait, maybe I misspoke at some point. It's not about making pip install pass metadata checks - Fedora already uses --no-build-isolation, so that check is not performed. Instead it's about %pyproject_buildrequires, which reads the metadata for build-system.requires and whatever optional dependencies are requested, and then adds the dependencies to the .spec file as BuildRequires: python3dist(foo), from which Fedora searches for which package provides that; it can be python3-foo or even a completely different package baz. It is determined by which package contains the foo.dist-info metadata folder. This is a tool to simplify packaging maintenance, and it's a similar case for rust, go, etc. For this case the goal is to point it to the ninja-build package (directly or indirectly).
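For illustration, the "which package contains the .dist-info" resolution can be seen from Python itself (a sketch using importlib.metadata, not the actual macro implementation):

    from importlib import metadata

    # Whichever installed distribution ships a ninja-*.dist-info
    # directory is what python3dist(ninja) ultimately resolves to.
    try:
        dist = metadata.distribution("ninja")
        print(dist.metadata["Name"], dist.version)
    except metadata.PackageNotFoundError:
        print("no installed package provides ninja metadata")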

Third party packagers should not package another third party packager's package, but should package the original instead, right? That's like Fedora repackaging a Debian package or conda providing a homebrew recipe.

Good analogy; I also didn't check that it points to a Kitware fork instead. But in this case it's slightly different, because this part is about packaging the python files. It relates more to two different aspects: debundling dependencies, like how a bundled libxz source is stripped out and replaced with linkage to the system package, and aligning with other package managers, like if an optional dependency is available we try to make it available as well, or if Fortran bindings are built on conda we would try to provide those as well. The Fedora python packaging guidelines encourage alignment with the PyPI environment as much as possible.

eli-schwartz commented 1 month ago

Wait, maybe I misspoke at some point. It's not about making pip install pass metadata checks - Fedora already uses --no-build-isolation, so that check is not performed. Instead it's about %pyproject_buildrequires, which reads the metadata for build-system.requires and whatever optional dependencies are requested, and then adds the dependencies to the .spec file as BuildRequires: python3dist(foo), from which Fedora searches for which package provides that; it can be python3-foo or even a completely different package baz. It is determined by which package contains the foo.dist-info metadata folder. This is a tool to simplify packaging maintenance, and it's a similar case for rust, go, etc. For this case the goal is to point it to the ninja-build package (directly or indirectly).

It is an unfortunate fact of life that the Python developers ecosystem is pretty bad about truthfully describing dependencies. The best-known example of this is any project that uses python-poetry: https://iscinumpy.dev/post/bound-version-constraints/ although there are lots of other examples.

It is generally wise to have a mechanism for overriding automatic parsing of dependencies for cases where the dependency is wrong, such as here. The simplest approach, maybe, is to apply a patch to their pyproject.toml deleting that line. :)

LecrisUT commented 1 month ago

It is an unfortunate fact of life that the Python developers ecosystem is pretty bad about truthfully describing dependencies

Yeah, but on the other side of the spectrum, with golang it's a nightmare for distro packagers.

The best-known example of this is any project that uses python-poetry

Ooph yeah, whenever I see poetry I run.

It is generally wise to have a mechanism for overriding automatic parsing of dependencies for cases where the dependency is wrong, such as here. The simplest approach, maybe, is to apply a patch to their pyproject.toml deleting that line. :)

Indeed, and it can be opted out of, and there are a few clever ways of patching it without maintaining a manual patch file, using tomcli. But there are two other workflows that are a bit harder to deal with. copr supports building a python package spec file directly from PyPI, which can greatly simplify handling chains of packages. The other workflow is for new packagers, where we want to lower the barrier of entry as much as possible. From my own experience I can say this abstraction has helped a lot to get a gradual foothold into the build systems, until I was able to grasp each part of the build process and packaging.

eli-schwartz commented 1 month ago

Yeah, but on the other side of the spectrum, with golang it's a nightmare for distro packagers.

That's not the other side of the spectrum, it's the same side. ;) "We only support vendored dependencies from a lockfile downloaded at build time" vs "we only support virtualenvs from a lockfile downloaded at build time".