arbor-sim / arbor

The Arbor multi-compartment neural network simulation library.
https://arbor-sim.org
BSD 3-Clause "New" or "Revised" License
108 stars 60 forks source link

Merge a-b-c into modcc #2063

Open llandsmeer opened 1 year ago

llandsmeer commented 1 year ago

Currently, arbor-build-catalogue (a-b-c) is a python script. It uses the python API of arbor, the arbor cmake config file and then cmake, to find the right compiler to call and where the right modcc binary is installed. Due to different ways these imports / prefixes etc works, this often goes wrong.

Thus, these could all point to different arbor installs. Not if you use these tools correctly, but it's very easy to do something wrong. For example, if you modify sys.path before your import arbor, then arbor-build-catalogue will import a different arbor library. Or if you install python from a wheel package, then the compiler and prefix from the build environment will stay in the config() result, which then requires a fallback to c++ or user configuration.

To mitigate, the proposal is to

Make arbor-build-catalogue a part of modcc

In that way, we don't have to search for modcc from arbor-build-catalogue. Ideally, the right flags (there aren't that much of them) are defined here as well, removing the cmake dependency

Integrate modcc into the python library

In this way, we don't have to find the arbor-build-catalogue script or modcc binaries from a python script, which is very error prone. Currently, most python scripts that shell out to a-b-c perform some kind of error prone do-i-need-to-recompile-check by iterating over files and checking against timestamps (eg these often don't check for arbor version updates). These could all be captured into a single arbor.compile_catalogue(dir, force_recompile=False) call

Adding an catalogue API version field to the builded catalogues and load the catalogue.so lazily

Currently, when you load a catalogue .so file for a different arbor version this would fail hard in strange ways. Eg with the addition of SDEs in arbor, you'll get errors about missing symbols if you load an older catalogue file compiled by a wrongly found modcc. A solution could be to add a function / ELF header field / append a magic string at the file's end that tells the API version and load the catalogue using RTLD_LAZY. It's already there in some form of course, but the current implementation doesn't catch additions of new functions

thorstenhater commented 1 year ago

Hi @llandsmeer,

this pretty much aligns with my ideas I've outlined in the relevant issues before (eg #1990). A few comments

RTLD_LAZY

ELF magic will not work on MacOS; use a static function at catalogue level that returns that salient information, cf our private conversation on the topic. I'd suggest these fields at minimum

By extension, it's probably useful to also use a suffix string to the file as a first line of defense and extend load_catalogue to transparently select over these, eg default-catalogue-darwin-arm64.so.

modcc in python

Yes, as said before, but only this toplevel call. However, I'd suggest making a C++ binding first.

Make arbor-build-catalogue a part of modcc

This is the most ambitious of these goals. I put it up in #1990 as almost a side note, but be aware that if this is to be robust, it's a ton of work.

All of this needs to work even if itself (=modcc) or its install tree was moved. Both happen during wheel building and pip installing (via skbuild). CMake will at least move modcc. None of these cases are guaranteed to be stable across OS, tool version, and platform (see homebrew going from X86_64 to ARM64 MacOS).

Honestly, I am not even sure any longer that this is worth the hassle just to get rid of CMake.

About the original problem

Due to different ways these imports / prefixes etc works, this often goes wrong.

  • import arbor looks in $PYTHONPATH and sys.path
  • arbor-build-catalogue is run from $PATH
  • modcc is run from arbor.config()['PREFIX']

The original idea was that one arbor-build-catalogue in $PATH could service multiple installations of Arbor. Of course, it should have been consistent. Again, our bane are the relocations enacted by wheel/skbuild.