valboz / RTModel

Microlensing modeling: fast and efficient exploration of the parameter space
GNU Lesser General Public License v3.0
10 stars 4 forks source link

Cross platform build #1

Closed golmschenk closed 7 months ago

golmschenk commented 8 months ago

Hi Valerio! Stela Ishitani and Jon Brashear came to me to get RTModel working on macOS. I quickly threw together a working solution for them during our meeting, but it was somewhat system specific. So I took a little time after the meeting to put together a more general cross-platform solution. This pull request has those changes. However, there is a notable change to the build setup, so feel free to reject it if you are opposed to the change. The major difference is switching the build system from setuptools to scikit-build-core. setuptools is usually pretty finicky about cross platform builds, since it mostly has to be manually specified how to preform the build (compiler flags, etc). scikit-build-core uses CMake as the backend build system, which is able to abstract and automate away most cross platform compiling concerns. There are also some minor changes to the codebase (mostly updating deprecated headers and function calls).

To ensure I didn't break anything for other systems I also added a GitHub Actions script to spin up a CI runner for the major OSes to try compiling/installing on each of them and running a quick test to check that the executables were built in each case. The output of that will look like something similar to this result on my fork of RTModel.

In the future, it might be worth migrating these tests to use cibuildwheel, which isolates the builds and tests a bit more. It also provides the benefit that it will compile wheels for each OS which can be uploaded to PyPI. This will make is so that pip will give users pre-compiled wheels and the users will need to compile nothing locally, which usually helps prevent system specific issues. However, using scikit-build-core with cibuildwheel works better with a src layout project structure, and I didn't want to take the more drastic step of restructuring the project since it appears some things are somewhat hardcoded into the flat layout currently.

Let me know if you would prefer some modifications before accepting the pull request. Or if the changes are not to your preference, feel free to reject it as well.

valboz commented 7 months ago

Hi Greg

I am amazed by the work you did in such a short time. Indeed this is a difference between a professional and an astrophysicist. I have installed your version in my Windows laptop and it works!

I have tried to fully understand all your steps and everything looks great. I just complain that the documentation at https://scikit-build-core.readthedocs.io/ is quite incomplete.

For example why in the CMakeLists.txt you install the executables to ${SKBUILD_PLATLIB_DIR} and ${CMAKE_CURRENT_LIST_DIR} ? Is this double installation needed?

Valerio

golmschenk commented 7 months ago

The double install and lack of documentation on it is more my fault than scikit-build-core's. This was a bit of a hack I added to handle how RTModel looks for the executables and to try to make sure I wasn't breaking any of your workflows. Generally you would only need to install the compiled binaries to a single location.

The more "correct" option is to only install to the "${SKBUILD_PLATLIB_DIR}/RTModel/bin" . In this case, scikit-build-core provides the documentation of the important directories here. Note that scikit-build-core is mostly a glue between the main Python packaging system and CMake (that is, the platlib is part of Python and these install commands are part of CMake), and so most other relevant documentation can be found in those two places respectively. The main reason this install directory is the "correct" solution is that several other build tools will expect files to be installed to the platlib (site-packages) directory, and in many cases this might not always be the same as the source directory during different methods of building. This solution will mostly work already, except if you use editable installs during development (e.g., pip install -e .). Then the binary will be put into the site-packages directory, but RTModel is currently searching for the binary relative to the source file, which in an editable install will not be in the site-packages, resulting in it not being found during an editable install. One easy change to fix this, would be to have the Python code that searches for the binaries search the site-packages directory directly, instead of relative to the source files. Yet again, I didn't know if you were using these binaries manually not through Python, and are expecting to find them in the source directory (the site-packages directory is hidden away in the Python installation), so I added the quick hack of just copying putting them in both locations. If you are not using them directly, then installing only to the site-packages might be the better choice. Even if you are using them directly during development, they can also still be directly compiled calling CMake without going through pip. This would probably provide more clear debugging information and integrate with IDEs more easily anyway. This is what I was doing to work on the changes needed to make the C++ code more cross platform.

I added a commit removes the duplicated install, and instead only uses the site-packages directory (with the above noted adjustment to fix editable installs) to see if that solution looks better to you. I'm happy to help tweak it a bit further to better fit your use case if you need something different.

I chose scikit-build-core because CMake is the industry standard for C/C++ build systems, and so, regardless of what you want to do with it, it's easy to find examples of others having done it previously. That said, while CMake is extremely robust and capable, it is a bit of a monstrosity if you've never used it before. There are some cleaner, up and coming build systems (such as Meson via meson-python), but they don't have the same level of an existing knowledge base. Just as a bit of additional background, the Python developers relatively recently opened the build system of Python up, so that distutils/setuptools would no longer be the only option as they have significant limitations (such as easy cross platform distribution of C++ extensions). Though somewhat new, scikit-build-core is led by one of the Python Packaging Authority members, is NSF grant funded, and is the official successor to a well-established build system, so it's future should be quite stable. That said, if you'd prefer a non-CMake backend, there are other options that should provide easy cross platform builds.

valboz commented 7 months ago

Thanks for this very complete explanation Greg. In fact, I am quite happy about scikit-build, since it seems to solve platform issues with very little effort. I would be happy to fully understand all steps so that I will be able to reproduce them myself when needed.

Coming back to executables, I am not planning to make them directly available from outside the RTModel package. In fact, they can be launched by dedicated functions within RTModel, if needed.

One last concern may arise in the executables, because they expect to have access to the ./data directory starting from their launch directory, which is the bindir you modified. Can you confirm that they would always find the ./data directory with this configuration?

If everything works with sysconfig, I think we may also remove the import inspect library and the inspect.getfile line in RTModel.py.

Then everything should be ready for the pull request to be accepted!

Thanks again

Valerio

golmschenk commented 7 months ago

Indeed, I hadn't considered the C++ code using their relative imports to the data files. I believe this should only affect editable install cases though.

I added a test on the CI runners for a PS fitting (which should use the data files). Everything still seems to be passing with that on non-editable runs.

To fix the editable run case, I make the CMake build copy the data files to the platlib directory. This seems like a bit of a hack, but the last I saw, with the reworked Python build system structure, editable installs with compiled extensions are still a topic the Python developers are working on. And I don't believe there is much consensus on a "correct" solution for that yet. So until there is a consensus, this is probably a fine solution.

So, I believe this should all be fine to distribute now. But with the various relative imports and the ways different systems structure things, it's certainly possible I might not have considered something. Please let me know if you or anyone else seems to encounter an issue with this setup.

valboz commented 7 months ago

Hi Greg

I have now accepted the pull request, but there is a problem. I have cloned the GitHub repository and installed it on my laptop.

Unfortunately, RTModel does not find the executables. Apparently, I find that platdir is C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\site-packages

while the package was installed as simple user in c:\users\valboz\appdata\local\packages\pythonsoftwarefoundation.python.3.11_qbz5n2kfra8p0\localcache\local-packages\python311\site-packages

Can you check please? Note that I made some clean-up after the pull request. Maybe you need to fork again?

Valerio

golmschenk commented 7 months ago

Ah, sorry for the issue! This is something I should have expected. I'm guessing this is being installed to the user site-packages instead of the global site-packages (e.g., using pip install --user or some similar condition). I had forgotten there would be multiple site-packages directories. The quick workaround is just to check each of them and then use the correct one. I can add this shortly. I was traveling today, and will be trying to find a spot to view the eclipse without clouds tomorrow. But hopefully I'll have time after that. The more "correct" way to do this is through importlib, but I think this might require some restructuring of the project as well. And it's probably not that important, so just searching all the site-package locations will probably make sense for now. Sorry again for the issue!

valboz commented 7 months ago

Oh, clear skies then!

In the main branch I have just uncommented the inspect.getfile(RTModel) line for the moment. This works fine for me. Is there a reason to avoid the inspect library? It didn't work for MacOS?

Have a nice eclipse and take pictures!

Valerio

golmschenk commented 7 months ago

Using inspect.getfile(RTModel) should work fine for most purposes (including across different OSes). This method does work on macOS from the user perspective. The place that it will encounter issues is with some developer setups and other CI tools. However, if you're not using any of these, you can postpone worrying about this until it actually becomes a problem in some way.

For example, editable installs (via something like pip install --editable .) will place the built files into site-packages, but inspect.getfile(RTModel) will still refer to the original source file location, so the relative pathing won't work. Similarly, I suspect there might be complications with tools like cibuildwheel when it creates the precompiled wheels, as it might have a slightly different packaging of the source files relative to the compiled wheel files. These issues might be inconsistent across different systems. However, if you're not using any of these setups at the moment, maybe it's best to only fix them when they actually show themselves to be a problem.

valboz commented 7 months ago

Hi Greg, it seems that one user opened an issue regarding the installation on a macOS. I realize that the pip install version cannot work even on Colab, since it only works under Windows or Linux. But I do not understand why the GitHub version does not work on his macOS. Do you have time to check?

golmschenk commented 7 months ago

I’ll take a look!