scikit-build / scikit-build-core

A next generation Python CMake adaptor and Python API for plugins
https://scikit-build-core.readthedocs.io
Apache License 2.0
243 stars 51 forks source link

Question installing cmake-built shared library to fixed destination in editable mode #374

Open abetlen opened 1 year ago

abetlen commented 1 year ago

Hi there, first of all thank you for all the amazing work.

I'm currently maintaining some python projects that bind to c++ code in a shared library via the python ctypes module. I have been using scikit-build with a setup.py file to build the c++ shared library and then place it relative to the python source files where the package can find it during import.

I've been working on migrating from scikit-build to scikit-build-core and have been able to get the project installing correctly by setting the output destination of the shared library file relative to ${SKBUILD_PLATLIB_DIR}. However when the package is installed in editable mode such as during development this does not copy to shared library file to the same location relative to the python files.

Is it possible to detect in cmake when the project is built in editable mode and change the output destination accordingly? Alternatively, is there some better way for me to place and find this shared library file that I'm completely missing (tried importlib.resources but no luck)?

LecrisUT commented 1 year ago

In spglib, I simply install directly to the relative path, and they I just setup wheel.install-dir. I am not sure if this is the intended installation method, but it does work with editable mode.

Is it possible to detect in cmake when the project is built in editable mode and change the output destination accordingly?

Iirc, editable.rebuild will trigger a cmake rebuild whenever you import the python package. One way to ensure you are always pulling the correct version of the python bindings is to link only to static libraries (the python binding should still be a MODULE as far as I understand now). Otherwise scikit-build-core should add an appropriate value for CMAKE_INSTALL_RPATH. (In spglib I implemented an unconventional approach to prioritize linking to the system library. Maybe there's a cleaner way to do that though).

Hope the tips help.

PS: can you link me to the project? I am looking for some inspiration for C++ python binding that has proper usage of classes and std library.

abetlen commented 1 year ago

Hey @LecrisUT thanks for the suggestions, I took a look but no immediate luck, I'll try some of them again tomorrow. Maybe I'm doing something non-standard because I'm not building a binding interface like you are in _spglib.c.

The two projects I'm trying to migrate to scikit-build-core are llama-cpp-python (currenty scikit-build with a setup.py file) and ggml-python (scikit-build-core with a workaround that copies shared libary to both valid locations). In both projects trying to build a shared library in a git submodule where submodule already has a build target I can reference from the CMakeLists file. I have interfaces written using ctypes so I don't need any seperate C / C++ source files in my own projects.

LecrisUT commented 1 year ago

Ok, I kinda get the gist of what you do there with llama-cpp-python. That one is similar to how I load the spglib C library. I see that you are installing to the llama_cpp package folder, so you can try with an empty wheel.install-dir. If that doesn't work, put scikit-build-core in debug mode and see where cmake installs to and where sk-core moves them to.

Edit: I should mention this, in editable mode, sk-core does install the libraries in a fixed location, but it makes redirection calls for python binding libraries that redirects calls to import _spglib. I am not sure how it does with other libraries, but you need an import ... interface so that it is redirected properly in editable mode.

After you install in editable mode, there is a file related to the name of the package installed that tells where everyty is redirected, including the installed cpp libraries.

LecrisUT commented 1 year ago

I hope you don't mind me commenting, but after some thought I think it's dangerous to use a direct call to import the libraries and directly call and assume the structure of the cpp library. Mostly because there is only linking to the file name and it is not linked to the library version. When the libraries are linked between each other, the soname, version etc. are baked in the linkage (not sure how to do version specific linkage in cmake though).

If you are using a python binding module file, then you are guaranteed to:

henryiii commented 1 year ago

The downside to making a normal module instead is that then you have to build one wheel per python version, while the ctypes version doesn't care about what Python version you are using.

Does your method (loading the so directly) work when run through auditwheel? I'm guessing it doesn't mangle the names of the shared SO, and instead you'd just not be able to bundle your lib into two different wheels?

In general, you should never rely on the file layout. Python happens to default to laying out things on a file system, but it's not required. It could be inside a zip file. I've even seen people use a database instead of a real filesystem (common in banking, IIRC). Instead, you should always use importlib.resources, and I think that should still work in editable mode. I think. I'll try to look at what you are doing soon.

henryiii commented 1 year ago

Do you have a PR or branch with your attempt that mostly works except for editable mode on one of the projects? (whichever one is simpler :) )

LecrisUT commented 1 year ago

@abetlen, here is a snippet from spglib when I run pip install -e .

install({'spglib.spglib': '/home/lecris/mpsd/Projects/spglib/python/spglib/spglib.py',
         'spglib': '/home/lecris/mpsd/Projects/spglib/python/spglib/__init__.py'},
        {'spglib._spglib': 'spglib/_spglib.cpython-311-x86_64-linux-gnu.so',
           'spglib.spglib': 'spglib/spglib.h',
         'spglib.libsymspg': 'spglib/libsymspg.so.2.1.0'}, None, False, True, [], [])

Notice how spglib.libsymspg is being aliased. I believe that if you use importlib.resources it will use the alias when running in editable mode.

But the the question is how do you use importlib.resources to prioritize the system installed one over the bundled one?

Also this use-case will be nice to add as an example @henryiii once all the nuances are understood.

henryiii commented 1 year ago

Ah, we might not support importlib.resources properly yet. I tried something that should work and it's not loading the correct file, so I think we need to implement get_resource_reader.

abetlen commented 1 year ago

Thaks @LecrisUT and @henryiii

I aggree that the current solution is probably brittle and having the additional resolution logic included in imports is prefferable, I might consider using pybind11 here if there's an easy way to re-export the existing C APIs without too much boilerplate.

Currently the libraries I'm linking against (llama.cpp and ggml) don't have stable releases / APIs so it doesn't make sense really to search LD_LIBRARY_PATH. In terms of wheels, I'm actually only releasing source distributions, this is because the libraries make use of very specific performance optimizatoins which I can't build individual wheels for.

The ggml-python main branch has my best attempt at migrating to scikit-build-core there's currently a hack workaround where I copy the ggml shared library to both the SKBUILD_PLATLIB_DIR and CMAKE_CURRENT_SOURCE_DIR (to get it working for editable model).

LecrisUT commented 1 year ago

if there's an easy way to re-export the existing C APIs without too much boilerplate.

That would be swig and there is an example provided ;). Just make an appropriate input/header file that contains all of the apis you want to export to python.

In terms of wheels, I'm actually only releasing source distributions, this is because the libraries make use of very specific performance optimizatoins which I can't build individual wheels for.

Indeed that seems like a reasonable implementation that you had there. Hopefully my spglib example is useful to show how to achieve that with python modules since my goal there is the same. If you find anything along those lines let me know about it as well.

henryiii commented 1 year ago

Yes, if you are not using the one binary for all Pythons feature, I'd probably go for something like swig, pybind11, or nanobind.

Though I'll try to work on making importlib.metadata work soon(ish).

abetlen commented 1 year ago

@henryiii I appreciate it, for now I'll keep my ctypes approach with the workaround of copying the shared library to both locations but I have moved over to using importlib.resources to get the full file path instead of expecting it to be relative to the python files.

I've also started experimenting with swig (early) and I'll give pybind11 / nanobind a try as well.

ajay9438 commented 11 months ago

(privateGPT) C:\Users\AJAY\Desktop\PrivateGpt>$env:CMAKE_ARGS = "-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" The filename, directory name, or volume label syntax is incorrect.

(privateGPT) C:\Users\AJAY\Desktop\PrivateGpt>pip install llama-cpp-python Collecting llama-cpp-python Using cached llama_cpp_python-0.2.20.tar.gz (8.7 MB) Installing build dependencies ... done Getting requirements to build wheel ... done Installing backend dependencies ... done Preparing metadata (pyproject.toml) ... done Requirement already satisfied: typing-extensions>=4.5.0 in c:\users\ajay\anaconda3\envs\privategpt\lib\site-packages (from llama-cpp-python) (4.8.0) Requirement already satisfied: numpy>=1.20.0 in c:\users\ajay\anaconda3\envs\privategpt\lib\site-packages (from llama-cpp-python) (1.26.0) Collecting diskcache>=5.6.1 (from llama-cpp-python) Using cached diskcache-5.6.3-py3-none-any.whl.metadata (20 kB) Using cached diskcache-5.6.3-py3-none-any.whl (45 kB) Building wheels for collected packages: llama-cpp-python Building wheel for llama-cpp-python (pyproject.toml) ... error error: subprocess-exited-with-error

× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully. │ exit code: 1 ╰─> [20 lines of output] scikit-build-core 0.6.1 using CMake 3.27.7 (wheel) Configuring CMake... 2023-11-30 18:40:30,114 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None loading initial cache file C:\Users\AJAY\AppData\Local\Temp\tmpkh6aqsh2\build\CMakeInit.txt -- Building for: NMake Makefiles CMake Error at CMakeLists.txt:3 (project): Running

     'nmake' '-?'

    failed with:

     The system cannot find the file specified

  CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
  CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
  -- Configuring incomplete, errors occurred!

  *** CMake configuration failed
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for llama-cpp-python Failed to build llama-cpp-python ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects