AMReX-Codes / amrex

AMReX: Software Framework for Block Structured AMR
https://amrex-codes.github.io/amrex
Other
515 stars 338 forks source link

Windows (Shared): `omp_locks` failed Symbol #3795

Open ax3l opened 4 months ago

ax3l commented 4 months ago

This is a Windows (shared/DLL) specific build regression to #3696.

Since this PR / the 24.02 release, we cannot build WarpX and ImpactX on Conda-Forge with a shared AMReX dependency anymore, due to this error:

[364/744] C:\Windows\system32\cmd.exe /C "cd . && %BUILD_PREFIX%\Library\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\app_2d.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\mt.exe --manifests  -- %BUILD_PREFIX%\Library\bin\lld-link.exe  CMakeFiles\app_2d.dir\Source\main.cpp.obj  /out:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe /implib:lib\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib /pdb:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.pdb /version:0.0 /machine:x64 /INCREMENTAL:NO /subsystem:console  lib\libwarpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib  lib\ablastr_2d.lib  %PREFIX%\Library\lib\amrex_2d.lib  %PREFIX%\Library\lib\libomp.lib  %PREFIX%\Library\lib\fftw3.lib  %PREFIX%\Library\lib\openPMD.lib  %PREFIX%\Library\lib\adios2_cxx11.lib  lib\buildInfopyWarpX_2d.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && C:\Windows\system32\cmd.exe /C "cd /D %SRC_DIR%\build && %BUILD_PREFIX%\Library\bin\cmake.exe -E create_symlink warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe D:/bld/warpx_1706926672986/work/build/bin/warpx.2d && cd /D %SRC_DIR%\build && %BUILD_PREFIX%\Library\bin\cmake.exe -E create_symlink warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe D:/bld/warpx_1706926672986/work/build/bin/warpx.2d""
FAILED: bin/warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe 
C:\Windows\system32\cmd.exe /C "cd . && %BUILD_PREFIX%\Library\bin\cmake.exe -E vs_link_exe --intdir=CMakeFiles\app_2d.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x64\mt.exe --manifests  -- %BUILD_PREFIX%\Library\bin\lld-link.exe  CMakeFiles\app_2d.dir\Source\main.cpp.obj  /out:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe /implib:lib\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib /pdb:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.pdb /version:0.0 /machine:x64 /INCREMENTAL:NO /subsystem:console  lib\libwarpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib  lib\ablastr_2d.lib  %PREFIX%\Library\lib\amrex_2d.lib  %PREFIX%\Library\lib\libomp.lib  %PREFIX%\Library\lib\fftw3.lib  %PREFIX%\Library\lib\openPMD.lib  %PREFIX%\Library\lib\adios2_cxx11.lib  lib\buildInfopyWarpX_2d.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && C:\Windows\system32\cmd.exe /C "cd /D %SRC_DIR%\build && %BUILD_PREFIX%\Library\bin\cmake.exe -E create_symlink warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe D:/bld/warpx_1706926672986/work/build/bin/warpx.2d && cd /D %SRC_DIR%\build && %BUILD_PREFIX%\Library\bin\cmake.exe -E create_symlink warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe D:/bld/warpx_1706926672986/work/build/bin/warpx.2d""
LINK: command "%BUILD_PREFIX%\Library\bin\lld-link.exe CMakeFiles\app_2d.dir\Source\main.cpp.obj /out:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.exe /implib:lib\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib /pdb:bin\warpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.pdb /version:0.0 /machine:x64 /INCREMENTAL:NO /subsystem:console lib\libwarpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib lib\ablastr_2d.lib %PREFIX%\Library\lib\amrex_2d.lib %PREFIX%\Library\lib\libomp.lib %PREFIX%\Library\lib\fftw3.lib %PREFIX%\Library\lib\openPMD.lib %PREFIX%\Library\lib\adios2_cxx11.lib lib\buildInfopyWarpX_2d.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST:EMBED,ID=1" failed (exit code 1) with the following output:
lld-link: error: undefined symbol: __declspec(dllimport) class std::array<struct omp_lock_t, 128> amrex::OpenMP::omp_locks
>>> referenced by libwarpx.2d.NOMPI.OMP.DP.PDP.OPMD.PSATD.QED.lib(WarpXParticleContainer.cpp.obj):(public: class amrex::BaseFab<double> & __cdecl amrex::BaseFab<double>::lockAdd<1>(class amrex::BaseFab<double> const &, class amrex::Box const &, class amrex::Box const &, int, int, int))

It is not clear to me why this happens. I checked:

Maybe we need to try a standard C array instead?

ax3l commented 4 months ago

I think in practice, omp_lock_t is always used as a pointer omp_lock_t*, with not much details what the type implementation is used for.

ChatGPT has this opinion:

One problem might be that omp_lock_t are not guaranteed to be copyable or movable, which are usually requirements for types to be stored in standard C++ containers like std::array, std::vector, etc. Like a raw C array, std::array requires its elements to be default constructible, copyable, and assignable because it relies on these operations for initialization and element access. [...] Since omp_lock_t lacks these operations (as duplicating a lock could lead to undefined behavior), it cannot be directly stored in a std::array.

It suggests to use a std::vector<std::unique_ptr<omp_lock_t>> omp_locks or omp_lock_t omp_locks[nlocks];.

ax3l commented 3 months ago

WarpX does not add template lockAdd<runOn...> in its usage of lockAdd... https://github.com/AMReX-Codes/amrex/blob/2cd72a3bbaeb9e44475225e4acb02b35084e19da/Tests/OpenMP/atomicAdd/main.cpp#L37 https://github.com/search?q=repo%3AECP-WarpX%2FWarpX%20lockAdd&type=code

Even though it has a default and should not need it, maybe that confuses MSVC...

We could try that as a patch to https://github.com/conda-forge/warpx-feedstock/pull/82#pullrequestreview-1985133315

ax3l commented 3 months ago

With the new test added in AMReX via #3805, I suspect a toolchain issue in conda-forge causes this....

ax3l commented 3 months ago

@isuruf noticed: On Conda-Forge's builds, the AMReX DLL has void* * __cdecl amrex::OpenMP::get_lock(int), but not struct omp_lock_t * __cdecl amrex::OpenMP::get_lock(int).

Might be technically an alias, but is weird: https://github.com/llvm/llvm-project/blob/llvmorg-18.1.4/openmp/runtime/src/include/omp.h.var#L82-L85

I can reproduce this: if I inspect the latest .dll with radare2 I also see:

$ rabin2 -s amrex_3d.dll
...
3352 0x00054390 0x180054f90 GLOBAL FUNC 0    amrex_3d.dll
    ?get_lock@OpenMP@amrex@@YAPEAPEAXH@Z
    void * __ptr64 * __ptr64 __cdecl amrex::OpenMP::get_lock(int)
...
19   0x003e97c0 0x1803ea3c0 NONE   FUNC 0    libomp.dll                            imp.omp_destroy_lock
20   0x003e97c8 0x1803ea3c8 NONE   FUNC 0    libomp.dll                            imp.omp_get_max_threads
21   0x003e97d0 0x1803ea3d0 NONE   FUNC 0    libomp.dll                            imp.omp_get_num_threads
22   0x003e97d8 0x1803ea3d8 NONE   FUNC 0    libomp.dll                            imp.omp_get_thread_num
23   0x003e97e0 0x1803ea3e0 NONE   FUNC 0    libomp.dll                            imp.omp_in_parallel
24   0x003e97e8 0x1803ea3e8 NONE   FUNC 0    libomp.dll                            imp.omp_init_lock
25   0x003e97f0 0x1803ea3f0 NONE   FUNC 0    libomp.dll                            imp.omp_set_num_threads
...
ax3l commented 3 months ago

We could try to compile with /NODEFAULTLIB:VCOMP or so, maybe there is some confusion with the LLVM OpenMP lib that we use instead.

https://learn.microsoft.com/en-us/cpp/parallel/openmp/reference/openmp-library-reference?view=msvc-170

But we use the same in AMReX CI without issues... Update: no wait, we are compiling with Clang-CL not MSVC in both cases.

ax3l commented 3 months ago

We can also try to switch from Clang-CL back to MVSC and the new LLVM support in it: https://learn.microsoft.com/en-us/cpp/build/reference/openmp-enable-openmp-2-0-support?view=msvc-170 https://gitlab.kitware.com/cmake/cmake/-/issues/25570

WeiqunZhang commented 3 months ago

@ax3l Could we try this? #3910