LecrisUT opened 2 months ago
I am aware of the `scikit-build-core` project and long term I will want to migrate to it. However, I wanted to wait until it supports all the features required to migrate https://github.com/rapidfuzz/RapidFuzz.
I did have a look at this last year when development on `scikit-build-core` started. I think the only required feature that's still missing is https://github.com/scikit-build/scikit-build-core/issues/112, which is required for the pure Python fallback.
Thanks for clarifying the blocking feature. I am not sure I understand what is meant with `run_setup(true/false)`. Is it to disable the cmake build part? Can you walk me through how you use that?
Have you also checked the Overrides feature? It might be close to what you need.
I can also share my project for design reference. It's basically split into CMake subprojects so that it can build as a monolith, or as individual components, like the Python interface only.
https://github.com/rapidfuzz/RapidFuzz/blob/main/setup.py is the setup script
This behaves in the following way:
1) When packaging, it should always build the C extension and fail hard if the build fails, since that's either a bug or an error in the build environment. This is done by detecting the build environment based on some environment variables. As mentioned in the `scikit-build-core` issue, this part could be achieved using overrides.
2) When building outside of a packaging environment, it should attempt to build the C extension but fall back to a pure Python implementation if the installation fails, and print a warning. This doesn't affect most users, since I do provide wheels for most platforms.
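The detection in 1) can be sketched as a small helper. This is a condensed illustration rather than RapidFuzz's actual setup.py; the function name is made up, and the environment-variable list is an assumption based on the overrides discussed later in this thread:

```python
import os

# Environment variables that mark a packaging build; this list is an
# assumption based on the override examples later in this thread.
PACKAGING_ENV_VARS = (
    "CIBUILDWHEEL",
    "CONDA_BUILD",
    "PIWHEELS_BUILD",
    "RAPIDFUZZ_BUILD_EXTENSION",
)

def must_build_extension(environ=None):
    """Return True when a C-extension build failure should abort the
    install instead of falling back to the pure Python implementation."""
    environ = os.environ if environ is None else environ
    return any(environ.get(var) == "1" for var in PACKAGING_ENV_VARS)
```

In a packaging environment the helper returns True and the build fails hard; otherwise the fallback path is taken.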
I don't believe 2) can be implemented using `scikit-build-core` so far.
In addition, one pre-requirement for 2) is that cmake and ninja are only installed using pip if:
- there is no sufficient version installed on the system
- there is a wheel available
I currently implement this pre-requirement in https://github.com/rapidfuzz/RapidFuzz/blob/main/_custom_build/backend.py. Not sure whether this is already supported by `scikit-build-core`. I know @henryiii did add at least some detection logic for available wheels at some point as well: https://github.com/scikit-build/scikit-build-core/blob/main/src/scikit_build_core/resources/known_wheels.toml.
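The version-check half of that pre-requirement amounts to something like the following. This is a sketch of the idea only, not the code in backend.py: the function name and minimum version are made up, and the wheel-availability check (e.g. against known_wheels.toml) is omitted:

```python
def needs_cmake_from_pip(system_version, min_version="3.15"):
    """Return True when cmake should be added to the build requirements,
    i.e. when there is no sufficient system copy.

    system_version is the version string of the cmake found on PATH,
    or None when no cmake is installed at all."""
    if system_version is None:
        return True
    parse = lambda v: tuple(int(part) for part in v.split("."))
    return parse(system_version) < parse(min_version)
```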
> I don't believe 2) can be implemented using `scikit-build-core` so far.
Couldn't this be controlled within CMake, e.g. by adding an advanced option `ALLOW_FALLBACK`? (Normally I insist that projects add namespaced options, but for pip-installed-only projects I concur.) This can then be controlled via overrides, although if it's strictly limited to PyPI packaging, I would just add an appropriate `--config-settings` option to pip. One great point of the `scikit-build-core` design is that it is very close to native CMake and Python, so anything can be designed with enough CMake grease.
> In addition, one pre-requirement for 2) is that cmake and ninja are only installed using pip if:
> - there is no sufficient version installed on the system
> - there is a wheel available
The default is to try the system first and then fall back to pip. Version checking is implemented. Changing the preference order would be tricky for packagers.
> The default is to try the system first and then fall back to pip. Version checking is implemented. Changing the preference order would be tricky for packagers.
Yes, but this breaks in the following case:
In this case I still want the fallback version to be installed. This pretty much rules out any solutions that are implemented in cmake.
> Couldn't this be controlled within CMake, e.g. by adding an advanced option `ALLOW_FALLBACK`? (Normally I insist that projects add namespaced options, but for pip-installed-only projects I concur.)
Maybe that would be possible. I don't know how I can allow `python_add_library` to fail without an error, though.
> although if it's strictly limited to PyPI packaging, I would just add an appropriate `--config-settings` option to pip

I am not familiar with this option, so I am not sure how it would help me in my particular case.
> In this case I still want the fallback version to be installed.
Aaah, now I understand. Wow, that's a tricky set of conditions to support. Kudos for the extra consideration put into this. In that case I am truly lost. How are the pure-python files supposed to work in that case? Maybe there is a design using a multi-project repository like hatch + hatchling have. I was considering adding that for my project as well, but didn't get around to it, so I don't have a clear design to recommend.
> how I can allow `python_add_library` to fail without an error, though
Why would it fail? This is defined in `FindPython` and it's just a wrapper of `add_library`. If you have an example of that failure, can you share it? I'll try to forward it to upstream CMake.
> although if it's strictly limited to PyPI packaging, I would just add an appropriate `--config-settings` option to pip

> I am not familiar with this option, so I am not sure how it would help me in my particular case.
Here's an example. Hopefully the naming is enough to indicate what it does in that case.
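For illustration, with scikit-build-core a CMake variable can be set from pip's command line via `--config-settings`; the option name below is hypothetical, only the `cmake.define.*` mechanism is real:

```shell
# Define a CMake cache variable at install time; RAPIDFUZZ_ALLOW_FALLBACK
# is a made-up option name standing in for whatever the project defines.
pip install . --config-settings=cmake.define.RAPIDFUZZ_ALLOW_FALLBACK=OFF
```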
> How are the pure-python files supposed to work in that case? Maybe there is a design using a multi-project repository like hatch + hatchling have.
I install the Python and C implementations side by side and import the fastest version available. You can see the implementation here: https://github.com/rapidfuzz/RapidFuzz/blob/8af875c1227ef0a034f46a1d6d46ab35c188d899/src/rapidfuzz/_utils.py#L109 and https://github.com/rapidfuzz/RapidFuzz/blob/main/src/rapidfuzz/fuzz.py
Essentially when the library is imported I try the following imports:
If the user specifically wants the C++ or Python variant, they can override the behaviour by setting the `RAPIDFUZZ_IMPLEMENTATION` environment variable before importing the library.
This allows everyone to get as fast of an implementation as possible without losing the compatibility provided by a pure Python implementation. Most users will get the C++ implementation through binary wheels or their package manager anyway. So this really only affects more exotic targets.
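The selection logic described above amounts to something like the following. This is a simplified sketch, not RapidFuzz's actual code: the loader is injected for illustration, and the module names `fuzz_cpp`/`fuzz_py` are hypothetical (see the linked `_utils.py` for the real logic):

```python
def select_impl(load, environ):
    """Pick an implementation: honor an explicit RAPIDFUZZ_IMPLEMENTATION
    override, otherwise try the C++ module and fall back to pure Python."""
    preferred = environ.get("RAPIDFUZZ_IMPLEMENTATION")
    if preferred == "cpp":
        return load("fuzz_cpp")   # fail hard if the user insisted on C++
    if preferred == "python":
        return load("fuzz_py")
    try:
        return load("fuzz_cpp")
    except ImportError:
        return load("fuzz_py")    # pure Python fallback
```

With `load = importlib.import_module` this mirrors the behaviour: everyone gets the fastest available implementation, and the environment variable forces a specific one.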
The downsides of the implementation are:
> Why would it fail? This is defined in `FindPython` and it's just a wrapper of `add_library`. If you have an example of that failure, can you share it? I'll try to forward it to upstream CMake.
I just assumed that's what registers it to be compiled as well. All I really wanted to say is that I don't know how I would mark this using an `ALLOW_FALLBACK` option so it doesn't fail on a compilation failure. It really doesn't matter too much though, since the fact that this requires cmake already rules this out.
> I install the Python and C implementations side by side and import the fastest version available.
Interesting, it's different from my situation, where I want to link to either the system or the Python-bundled library. In your case I would definitely split it into 2 packages, making the pure Python one an optional dependency but preferred in the import workflow (just to give the user maximum control). Dealing with the `AVX2` preference would be tricky though; I didn't think far enough around that option. But even without changing the order, I think it would be a good design to split into 3 packages: interface, compiled, pure-python (or combine interface and pure-python if your fallback allows it). It would also make it clearer to the user which version they want to use. I don't know if there is an appropriate way to use markers?
> packaging tools that create a single binary with CPython and all dependencies injected, like pyinstaller, won't find what they have to include
Didn't know about pyinstaller, and indeed it seems fairly dangerous. That seems like a nightmare to support even in pure-python.
> I just assumed that's what registers it to be compiled as well
Ah, I thought you were concerned about configure-stage failures. If it's the build stage, you have a bit more control in scikit-build-core if you filter specific build targets.
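If I read the settings correctly, restricting the built targets could look something like this; the `cmake.targets` setting is scikit-build-core's, while the target name is hypothetical:

```toml
[tool.scikit-build]
# Build only the extension target, so unrelated targets cannot fail the wheel.
cmake.targets = ["rapidfuzz_cpp"]
```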
I think this can be supported with a few more override settings. We could provide:

- `if.failed`: If the build fails, the config settings get rerun, and if any match this, the new configuration is used. This only happens once.
- `if.no-system-cmake`: Would take two options, `"known-wheel"` or `"always"`.
- `error-message`: This would be override-only, and if it's present, the error message is printed.

Should we allow `failed` to take the condition of failure? I think the conditions would be `"configure"`, `"build"`, and `"install"`. Though maybe it's better to just wait until someone asks for the ability to differentiate?
The cmake one would trigger if you were not on a platform with a known cmake wheel and there's no sufficient CMake present, or always if there's no system CMake. If this matches, then it affects whether cmake is requested, as well.
This is what a pure-python build on a system without CMake support would look like:

```toml
[[tool.scikit-build.overrides]]
if.no-system-cmake = "known-wheel"
wheel.cmake = false
```
Or if the build fails:

```toml
[[tool.scikit-build.overrides]]
if.failed = true
wheel.cmake = false
```
Custom message on failure:

```toml
[[tool.scikit-build.overrides]]
if.failed = true
error-message = "The build failed!"
```
I think this would add a lot of flexibility; for an unrelated example, you could fail on unsupported Pythons:

```toml
[[tool.scikit-build.overrides]]
if.python-version = ">=3.13"
error-message = "Python 3.13+ is not supported yet!"
```
I'm looking at a way to allow the cmake wheel to trigger the bootstrapping feature, and I think this would be a step in that direction too.
This is a very rough draft of what I'm thinking.
I assume with this I could implement my handling as:

```toml
[tool.scikit-build]
wheel.cmake = true

[[tool.scikit-build.overrides]]
if.no-system-cmake = "known-wheel"
wheel.cmake = false

[[tool.scikit-build.overrides]]
if.failed = true
wheel.cmake = false

[[tool.scikit-build.overrides]]
if.any.env.CIBUILDWHEEL = "1"
if.any.env.CONDA_BUILD = "1"
if.any.env.PIWHEELS_BUILD = "1"
if.any.env.RAPIDFUZZ_BUILD_EXTENSION = "1"
wheel.cmake = true

# Whether this is required would probably depend on:
# 1) is the override above triggered both for the failure and non-failure case?
# 2) what does the failure handling do if the same config is set both for
#    failure and non-failure?
[[tool.scikit-build.overrides]]
if.failed = true
if.any.env.CIBUILDWHEEL = "1"
if.any.env.CONDA_BUILD = "1"
if.any.env.PIWHEELS_BUILD = "1"
if.any.env.RAPIDFUZZ_BUILD_EXTENSION = "1"
error-message = "failed to build C++ Extension in a packaged build"
```
I think we should make the `error-message` only appear if the build fails, and add a `fail = true` option. Otherwise, looks good, I think?
Yes, I think this would cover all of my needs and would be a lot more concise than my current implementation with `scikit-build` :+1:

I will start a branch porting over the existing set of features and extend it as things get added to `scikit-build-core`.
FYI, `if.any.env.CIBUILDWHEEL = true` will handle truthy/falsey values.
Any ideas for better naming than `if.no-system-cmake = "known-wheel"`? I don't think that reads very well.
No clue from my side. To me `if.system-cmake` reads like "did we use the cmake from the system". Maybe `if.cmake-vendor` could be a more general form of that, although `vendor` sounds wrong to me here. And maybe `if.have-cmake` as a catch-all for whether any cmake is found?
> To me `if.system-cmake` reads like "did we use the cmake from the system".

That's great, that's what it is supposed to be, except I'm not sure how to indicate the "and there's no known wheel for this platform" condition. `if.no-system-cmake = "and-no-known-wheel"`?
To me `if.(no-)system-cmake` reads more like it would be a `bool` and would evaluate to `true` (or `false` for the `no-` form) when a wheel is used. `if.have-cmake` would be more explicit about there being no system cmake and no wheel.
It is recommended that projects migrate from `scikit-build` to `scikit-build-core` when possible. Most development is focused there and it is a very CMake-friendly and pythonic build system. For now I only wanted to create an issue to get your opinion on this and track the progress of such a migration. If you have questions or need help, feel free to ask.