evaleev / libint

Libint: high-performance library for computing Gaussian integrals in quantum mechanics
Other
218 stars 97 forks source link

new cmake harness, round 3 #233

Open loriab opened 2 years ago

loriab commented 2 years ago

~At present, this PR is so EFV can oversee proposed changes to #205. It configures, but I haven't tried so much as building the compiler.~

Added 19 Jan

WIP c. 19 Jan

Added 20 Jan

Added 21 Jan

Added 24 Jan

Added 28 Jan

Added 12 Feb

WIP

evaleev commented 2 years ago

@loriab CI passes!

how painful would be to revert to cmake 3.16?

loriab commented 2 years ago

@loriab CI passes!

Thanks!

How painful would be to revert to cmake 3.16?

It won't be too bad. I figured I'd keep it at 3.18 for int_am readability for now (2 lines to compute each max instead of ~7) then revert to 3.16 at the end. I have updated some Python and Boost things that could be used only >=3.15.

loriab commented 2 years ago

Sorry to leave CI broken after you so kindly fixed it. This shows a strategy for separating machine-path-dependent targets (that you described the need for but are unsuitable for distribution) from the usual redistributable targets. This also fixes the full paths in the installed target tree for the latter. I'm still working on shifting eigen3 detection, so CI has no chance at the moment.

evaleev commented 2 years ago

@loriab no worries

BTW I am not sure I understand why we need to have the "nonlocal" (path-independent) targets. If you allow the user to discover Eigen3 on their own or provide their own Eigen3 target there is no way to make sure that it is configured correctly (e.g. the one used by Libint to pre-instantiate Engine used Eigen3 configured to need BLAS/LAPACK ... there is no way to test Eigen3's configuration without much try_compile code).

This is a common scenario for C++ and even for C code. E.g. CMake's standard MPI targets are ABI-specific, so it is not possible to say find_package(MPI COMPONENT mpich_abi ... ) or find_package(MPI COMPONENT sgi_abi ... ) ... the only way to avoid ABI problems in general is to hardwire to the particular instance of a target.

Is this a requirement motivated by package maintenance??

loriab commented 2 years ago

BTW I am not sure I understand why we need to have the "nonlocal" (path-independent) targets. If you allow the user to discover Eigen3 on their own or provide their own Eigen3 target there is no way to make sure that it is configured correctly (e.g. the one used by Libint to pre-instantiate Engine used Eigen3 configured to need BLAS/LAPACK ... there is no way to test Eigen3's configuration without much try_compile code).

This is a common scenario for C++ and even for C code. E.g. CMake's standard MPI targets are ABI-specific, so it is not possible to say find_package(MPI COMPONENT mpich_abi ... ) or find_package(MPI COMPONENT sgi_abi ... ) ... the only way to avoid ABI problems in general is to hardwire to the particular instance of a target.

Is this a requirement motivated by package maintenance??

Yes, I'm thinking of packaging and accommodating the up to three computers that build library source, build library, and build library consumer. All I really want in the end is:

I think I'm reading code right that the Libint2 cxx interface at present is header only. So the detection at library-build time in src/lib/libint/CMakeLists.txt.export is perfunctory so that the eigen target can be recorded in libint2-config.cmake as a dependency of Libint2::cxx interface library target. The eigen code only gets compiled within the libint repo for the tests that use the cxx interface. If I have all that right, I feel on pretty firm cmake footing that there shouldn't be full paths in the config file and one doesn't generally install external dependency targets (https://cmake.org/cmake/help/latest/guide/importing-exporting/index.html#creating-relocatable-packages). The latter is unusual enough that there's a new v3.21 capability to allow it (https://cmake.org/cmake/help/latest/command/install.html#installing-runtime-dependencies). I'd expect the eigen detection to happen in a find_dependency(eigen-like) in the libint-config file. I'm working on wrapping the call so it allows plain header location and avoiding the build tree, not just the discoverability of Eigen3Config.cmake . In this libint2-cxx-interface-header-only case, this eigen code detected at built-libint detection time is the only eigen code involved, so no need to match up exact installations.

Your plan of non-header-only Libint2 cxx interface surprised me. I gather that the int-cxx-shared target becomes a tangible library with compiled code, not just an interface library. I accept that the same eigen code and offloading defines ought to be used for both libint-building and libint-consumer-building. This is fairly easy when packagers control both the libint2-building and the e.g., psi4-building steps because they'll use common eigen conditions. I'm not sure how it should work for the more general cmake case, though. Just vendorizing the eigen code into include/libint2/eigen/ doesn't work because it's not just the eigen code you want shared between libint2 and the qc program but also the offloading defines and attendant library linking. Since you seemed to assume that libint2 and QC consumer were build on the same computer anyways by the full paths to eigen, I figured a workaround to preserve normal cmake relocatability for the header-only cxx interface and accommodate the common-computer non-header-only cxx interface you describe would be to separate the targets so packagers can decline to install the latter while users can request the latter.

I hope I'm understanding your points and plans better in these posts. Mostly I've been focusing on the existing header-only state. Thanks for you patience. Pinging @susilehtola in case there's a packaging aspect/constraint that I'm missing, being mostly conda-focused.

evaleev commented 2 years ago

@loriab due to the header-only inefficiencies the plan to make C++ API a proper (compiled) library has been around for awhile (for as long as engine.impl.h existed). There will be no impact on current API users (like Psi), and switching to the compiled version will be trivial: cxx to be a header-only C++API and cxx_{shared,static} a compiled version (for now only including Engine, eventually might include more).

I agree that for careful users the ABI issues will not appear. But I'm just giving you a heads-up for what's likely to happen (and has happened to me multiple times before ... so experts are not immune). Since ABI mismatches are impossible to detect at build-time and in general extremely painful to diagnose the ABI mismatches in my book are more evil than hardwiring to the particular target instance ... perhaps the hardwiring should be the default with an opt-out for experts? OR the other way around? (the latter makes less sense to me: the only reason to hardwire search paths is for an average user, not an expert).

P.S. Eigen is just an fn pain. I'd love to be able to just restrict to config-equipped Eigen but the amount of pushback I get along the lines of "but Eigen is a header-only library!!!! I should not need to install it" is tiresome.

evaleev commented 2 years ago

@loriab btw I made the int-cxx-{static,shared} a compiled library. The header-only (interface) version is int-cxx-headeronly-{static,shared}. Should probably not be used by anyone since cmake is now the only way the C++ interface can be used by mortals

loriab commented 2 years ago

@loriab btw I made the int-cxx-{static,shared} a compiled library. The header-only (interface) version is int-cxx-headeronly-{static,shared}.

Good, thanks, it'll be nice to look at something concrete.

If both int-cxx-headeronly-{static,shared} and int-cxx-{static,shared} targets are available for downstream to choose, that'll work well for me. Psi (for now) will probably stick with the headeronly since that can be distributed safely, eigen-wise. (Thanks for the heads-up on the ways eigen can go wrong when spread across two build-times.) And libint consumers building libint and downstream on the same machine can take advantage of the compiled int-cxx-{static,shared} targets where the full path to local eigen is exported.

Should probably not be used by anyone since cmake is now the only way the C++ interface can be used by mortals

... unless this sentence refers to the header-only target, in which case I'm not following.


I'll get your #205 changes rebased through and work from there.

susilehtola commented 2 years ago

Fedora wise I am not aware of any issues other than the issues of incompatible orderings, normalizations etc. Any dependencies like Eigen3 will be pulled from the Fedora repositories, so everything is built in a consistent manner. MPI versions are compiled separately for MPICH and OpenMPI, although I'm not sure why Libint would need that? ABI incompatibilities (changing compile time options) are handled by a ABI version tag in the RPM spec file, so if the options need to be changed then all dependent packages are rebuilt to satisfy the broken dependencies.

loriab commented 2 years ago

@evaleev, this PR isn't ready for review yet, but if you have a chance, I'd appreciate if you could look over this docs table https://github.com/loriab/libint/blob/new-cmake-harness-lab-rb1/INSTALL.md#prerequisites and footnotes for correctness. I'm sorry that it's a little hard to edit -- GH web interface is probably easiest or just let me know the corrections. Thanks!

loriab commented 2 years ago

@evaleev, I think this is ready for review. I'm still working on the INSTALL.md docs, but I'm satisfied with the technical situation. If you approve of the general structure and solutions, I'd like to advise packagers that it's ready to test soon, as the problems they find are probably in my court rather than yours. Let me know if you'd like to Zoom to talk over reasoning and concerns in real time for efficiency.