**Open** · llandsmeer opened this issue 1 year ago
Hi @llandsmeer,
this pretty much aligns with the ideas I've outlined in the relevant issues before (e.g. #1990). A few comments:
ELF magic will not work on macOS; use a static function at catalogue level that returns the salient information, cf. our private conversation on the topic. At minimum, I'd suggest a small set of identifying fields.
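For concreteness, a sketch of how a-b-c could bake such a function into each generated catalogue. The function name (`arb_catalogue_info`) and the field set (Arbor version, platform, SIMD/GPU) are guesses based on this thread, not an existing interface:

```python
def catalogue_info_source(version, platform, simd, gpu):
    """Emit C++ for a static info function compiled into each catalogue.

    Function name and fields are assumptions drawn from this thread,
    not an existing Arbor interface.
    """
    info = f"arbor={version};platform={platform};simd={simd};gpu={gpu}"
    return f'extern "C" const char* arb_catalogue_info() {{ return "{info}"; }}\n'
```

A loader could then resolve just this one symbol and decide whether the catalogue matches before touching anything else.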
By extension, it's probably useful to also add a suffix string to the file name as a first line of defense and extend `load_catalogue` to transparently select over these, e.g. `default-catalogue-darwin-arm64.so`.
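A minimal sketch of such transparent selection as a Python wrapper; the helper name and fallback order are illustrative, while `arbor.load_catalogue` is the existing entry point:

```python
import platform
import sys
from pathlib import Path

import arbor

def load_platform_catalogue(stem: str, directory: str = "."):
    """Prefer a platform-suffixed catalogue, fall back to the plain name."""
    # Follows the naming suggested above, e.g. default-catalogue-darwin-arm64.so.
    tag = f"{sys.platform}-{platform.machine().lower()}"
    for name in (f"{stem}-{tag}.so", f"{stem}.so"):
        candidate = Path(directory) / name
        if candidate.exists():
            return arbor.load_catalogue(str(candidate))
    raise FileNotFoundError(f"no catalogue {stem!r} for {tag} in {directory}")
```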
Yes, as said before, but only this top-level call (i.e. the proposed `arbor.compile_catalogue`). However, I'd suggest making a C++ binding first.
This is the most ambitious of these goals. I put it up in #1990 as almost a side note, but be aware that if this is to be robust, it's a ton of work: the `#define`s for SIMD/GPU, for one. All of this needs to work even if `modcc` itself or its install tree has been moved. Both happen during wheel building and pip installing (via `skbuild`), and CMake will at least move `modcc`. None of these cases is guaranteed to be stable across OS, tool version, and platform (see Homebrew going from x86_64 to ARM64 macOS).
Honestly, I am not even sure any longer that this is worth the hassle just to get rid of CMake.
> Due to the different ways these imports/prefixes etc. work, this often goes wrong:
>
> - `import arbor` looks in `$PYTHONPATH` and `sys.path`
> - `arbor-build-catalogue` is run from `$PATH`
> - `modcc` is run from `arbor.config()['PREFIX']`
The original idea was that one `arbor-build-catalogue` in `$PATH` could service multiple installations of Arbor. Of course, it should have been consistent. Again, the relocations enacted by wheel/skbuild are our bane.
---

Currently, `arbor-build-catalogue` (a-b-c) is a Python script. It uses the Python API of Arbor, the Arbor CMake config file, and then CMake to find the right compiler to call and where the right `modcc` binary is installed. Due to the different ways these imports/prefixes etc. work, this often goes wrong:
- `import arbor` looks in `$PYTHONPATH` and `sys.path`
- `arbor-build-catalogue` is run from `$PATH`
- `modcc` is run from `arbor.config()['PREFIX']`
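For concreteness, the three lookups can be printed side by side; when they disagree, the tools are mixing installs (the `PREFIX` key is spelled as quoted above):

```python
import shutil
import arbor

print(arbor.__file__)                         # what `import arbor` picked up
print(shutil.which("arbor-build-catalogue"))  # what $PATH would run
print(arbor.config()["PREFIX"])               # where modcc is searched for
```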
Thus, these could all point to different Arbor installs. Not if you use these tools correctly, but it's very easy to do something wrong. For example, if you modify `sys.path` before your `import arbor`, then `arbor-build-catalogue` will import a different arbor library. Or, if you install Arbor from a wheel, the compiler and prefix from the build environment remain in the `config()` result, which then requires a fallback to `c++` or user configuration.

To mitigate, the proposal is to:
**Make `arbor-build-catalogue` part of `modcc`.** That way, we don't have to search for `modcc` from `arbor-build-catalogue`. Ideally, the right flags (there aren't that many of them) are defined there as well, removing the CMake dependency.
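For scale, this is roughly the compile step that Python plus CMake currently assemble; if `modcc` owned a-b-c, it could emit this invocation itself. The flag list here is illustrative, not Arbor's exact set:

```python
import subprocess

def link_catalogue(name, cpp_sources, arbor_prefix, cxx="c++"):
    """Compile modcc's generated C++ into <name>-catalogue.so (illustrative)."""
    subprocess.run(
        [cxx, "-O2", "-std=c++17", "-shared", "-fPIC",
         f"-I{arbor_prefix}/include",  # mechanism ABI headers
         *cpp_sources,
         "-o", f"{name}-catalogue.so"],
        check=True,
    )
```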
**Integrate `modcc` into the Python library.** This way, we don't have to find the `arbor-build-catalogue` script or the `modcc` binary from a Python script, which is very error-prone. Currently, most Python scripts that shell out to a-b-c perform some kind of error-prone do-I-need-to-recompile check by iterating over files and comparing timestamps (and, for instance, often don't check for Arbor version updates). These could all be captured in a single `arbor.compile_catalogue(dir, force_recompile=False)` call.
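A sketch of what that call could encapsulate, shelling out to today's a-b-c underneath; the output location and rebuild rules are illustrative:

```python
import subprocess
from pathlib import Path

import arbor

def compile_catalogue(mech_dir, force_recompile=False):
    """Sketch of the proposed arbor.compile_catalogue() call.

    Rebuilds when the catalogue is missing or older than any .mod file; a
    robust version would also compare Arbor versions (see the next item).
    """
    mech_dir = Path(mech_dir)
    out = mech_dir / f"{mech_dir.name}-catalogue.so"  # output path illustrative
    mods = sorted(mech_dir.glob("*.mod"))
    stale = (
        force_recompile
        or not out.exists()
        or any(m.stat().st_mtime > out.stat().st_mtime for m in mods)
    )
    if stale:
        # Today this shells out to a-b-c; the proposal folds it into the library.
        subprocess.run(
            ["arbor-build-catalogue", mech_dir.name, str(mech_dir)],
            check=True,
        )
    return arbor.load_catalogue(str(out))
```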
**Add a catalogue API version field to the built catalogues and load the catalogue `.so` lazily.** Currently, loading a catalogue `.so` built for a different Arbor version fails hard in strange ways. E.g., with the addition of SDEs in Arbor, you'll get errors about missing symbols if you load an older catalogue file compiled by a wrongly found `modcc`. A solution could be to add a function, an ELF header field, or a magic string appended at the file's end that tells the API version, and to load the catalogue using `RTLD_LAZY`. Something like this exists already, of course, but the current implementation doesn't catch additions of new functions.
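A minimal sketch of the magic-string variant, assuming a-b-c appends a trailer like `ARB-CAT-API:<version>\0` to the finished `.so`; the trailer name and layout are invented here, and per the reply above, anything tied to the on-disk file format needs extra care on macOS:

```python
import os

# Hypothetical trailer appended by a-b-c; not an existing Arbor format.
MAGIC = b"ARB-CAT-API:"

def catalogue_api_version(path, trailer_bytes=128):
    """Return the API version recorded at the end of a catalogue, or None."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        f.seek(max(0, f.tell() - trailer_bytes))
        tail = f.read()
    pos = tail.rfind(MAGIC)
    if pos < 0:
        return None  # pre-scheme catalogue: assume incompatible
    return tail[pos + len(MAGIC):].split(b"\0", 1)[0].decode()
```

Checked before `dlopen`, this turns the missing-symbol crash into an actionable "catalogue was built for Arbor X, you are running Y, please recompile" error.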