NixOS / nixpkgs

Build failure: cross compilation onnxruntime #306042

Open Cryolitia opened 5 months ago

Cryolitia commented 5 months ago

Steps To Reproduce

Run `nix build nixpkgs#pkgsCross.aarch64-multiplatform.onnxruntime -v -L` on x86_64-linux.

Build log

onnxruntime-aarch64-unknown-linux-gnu> CMake Warning at CMakeLists.txt:1559 (message):
onnxruntime-aarch64-unknown-linux-gnu>   MPI and NCCL disabled on Win build.
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> -- Looking for clock_gettime in rt
onnxruntime-aarch64-unknown-linux-gnu> -- Looking for clock_gettime in rt - found
onnxruntime-aarch64-unknown-linux-gnu> -- Python Build is enabled
onnxruntime-aarch64-unknown-linux-gnu> -- Found pybind11: /nix/store/6awycaxs0a0q53yxlnwbq90vn4n3ggfr-python3.11-pybind11-2.12.0-aarch64-unknown-linux-gnu/include (found version "")
onnxruntime-aarch64-unknown-linux-gnu> -- Configuring done (7.7s)
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:61 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_test_all" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   onnxruntime_unittests.cmake:820 (AddTest)
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:57 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_shared_lib_test" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   onnxruntime_unittests.cmake:1258 (AddTest)
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:57 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_global_thread_pools_test" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   onnxruntime_unittests.cmake:1289 (AddTest)
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:1366 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_mlas_test" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:57 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_customopregistration_test" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   onnxruntime_unittests.cmake:1549 (AddTest)
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Error at onnxruntime_unittests.cmake:57 (target_link_libraries):
onnxruntime-aarch64-unknown-linux-gnu>   Target "onnxruntime_logging_apis_test" links to:
onnxruntime-aarch64-unknown-linux-gnu>     GTest::gtest
onnxruntime-aarch64-unknown-linux-gnu>   but the target was not found.  Possible reasons include:
onnxruntime-aarch64-unknown-linux-gnu>     * There is a typo in the target name.
onnxruntime-aarch64-unknown-linux-gnu>     * A find_package call is missing for an IMPORTED target.
onnxruntime-aarch64-unknown-linux-gnu>     * An ALIAS target is missing.
onnxruntime-aarch64-unknown-linux-gnu> Call Stack (most recent call first):
onnxruntime-aarch64-unknown-linux-gnu>   onnxruntime_unittests.cmake:1638 (AddTest)
onnxruntime-aarch64-unknown-linux-gnu>   CMakeLists.txt:1693 (include)
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> -- Generating done (0.2s)
onnxruntime-aarch64-unknown-linux-gnu> CMake Warning:
onnxruntime-aarch64-unknown-linux-gnu>   Manually-specified variables were not used by the project:
onnxruntime-aarch64-unknown-linux-gnu>     CMAKE_EXPORT_NO_PACKAGE_REGISTRY
onnxruntime-aarch64-unknown-linux-gnu>     PYBIND11_PYTHONLIBS_OVERWRITE
onnxruntime-aarch64-unknown-linux-gnu>     PYTHON_INCLUDE_DIR
onnxruntime-aarch64-unknown-linux-gnu>     PYTHON_SITE_PACKAGES
onnxruntime-aarch64-unknown-linux-gnu>
onnxruntime-aarch64-unknown-linux-gnu> CMake Generate step failed.  Build files cannot be regenerated correctly.

Notify maintainers

@jonringer @puffnfresh @ck3d @cbourjau

Also CCing the most recent committer and reviewer (sorry to bother you): @autrimpo @SomeoneSerge

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 6.8.6-zen1, NixOS, 24.05 (Uakari), 24.05.20240419.5c24cf2`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.2`
 - channels(root): `"nixos, nixpkgs"`
 - nixpkgs: `/home/cryolitia/.nix-defexpr/channels/nixpkgs`

SomeoneSerge commented 5 months ago

Seems like we either need to find a way to disable tests at the cmake level, or we'll have to move gtest from nativeCheckInputs to buildInputs?

autrimpo commented 5 months ago

We already disable tests at the cmake level with onnxruntime_BUILD_UNIT_TESTS for CUDA builds. Changing its value from doCheck to false lets the cross-compile proceed for me on the current master (80368c5), but it then fails at the end:

onnxruntime-aarch64-unknown-linux-gnu> [100%] Linking CXX shared library libonnxruntime.so
onnxruntime-aarch64-unknown-linux-gnu> /build/source/onnxruntime/python/onnxruntime_pybind_ortvalue.cc:11:10: fatal error: numpy/arrayobject.h: No such file or directory
onnxruntime-aarch64-unknown-linux-gnu>    11 | #include <numpy/arrayobject.h>
onnxruntime-aarch64-unknown-linux-gnu>       |          ^~~~~~~~~~~~~~~~~~~~~
onnxruntime-aarch64-unknown-linux-gnu> compilation terminated.
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:132: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_ortvalue.cc.o] Error 1
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** Waiting for unfinished jobs....
onnxruntime-aarch64-unknown-linux-gnu> /build/source/onnxruntime/python/onnxruntime_pybind_mlvalue.cc:11:10: fatal error: numpy/arrayobject.h: No such file or directory
onnxruntime-aarch64-unknown-linux-gnu>    11 | #include <numpy/arrayobject.h>
onnxruntime-aarch64-unknown-linux-gnu>       |          ^~~~~~~~~~~~~~~~~~~~~
onnxruntime-aarch64-unknown-linux-gnu> compilation terminated.
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:104: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_mlvalue.cc.o] Error 1
onnxruntime-aarch64-unknown-linux-gnu> /build/source/onnxruntime/python/onnxruntime_pybind_iobinding.cc:11:10: fatal error: numpy/arrayobject.h: No such file or directory
onnxruntime-aarch64-unknown-linux-gnu>    11 | #include <numpy/arrayobject.h>
onnxruntime-aarch64-unknown-linux-gnu>       |          ^~~~~~~~~~~~~~~~~~~~~
onnxruntime-aarch64-unknown-linux-gnu> compilation terminated.
onnxruntime-aarch64-unknown-linux-gnu> /build/source/onnxruntime/python/onnxruntime_pybind_state.cc:10:10: fatal error: numpy/arrayobject.h: No such file or directory
onnxruntime-aarch64-unknown-linux-gnu>    10 | #include <numpy/arrayobject.h>
onnxruntime-aarch64-unknown-linux-gnu>       |          ^~~~~~~~~~~~~~~~~~~~~
onnxruntime-aarch64-unknown-linux-gnu> compilation terminated.
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:90: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_iobinding.cc.o] Error 1
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:188: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_state.cc.o] Error 1
onnxruntime-aarch64-unknown-linux-gnu> /build/source/onnxruntime/python/onnxruntime_pybind_sparse_tensor.cc:11:10: fatal error: numpy/arrayobject.h: No such file or directory
onnxruntime-aarch64-unknown-linux-gnu>    11 | #include <numpy/arrayobject.h>
onnxruntime-aarch64-unknown-linux-gnu>       |          ^~~~~~~~~~~~~~~~~~~~~
onnxruntime-aarch64-unknown-linux-gnu> compilation terminated.
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:174: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_sparse_tensor.cc.o] Error 1
onnxruntime-aarch64-unknown-linux-gnu> /nix/store/hmjkyxmnr2nc6mbq3r8cwwg552j1vqaf-aarch64-unknown-linux-gnu-binutils-2.41/bin/aarch64-unknown-linux-gnu-ld: /nix/store/pl0jivzg5w84j8pbkpg9dmc3gg4rz7rd-protobuf-21.12/lib/libprotobuf-lite.so.3.21.12.0: error adding symbols: file in wrong format
onnxruntime-aarch64-unknown-linux-gnu> collect2: error: ld returned 1 exit status
onnxruntime-aarch64-unknown-linux-gnu> make[2]: *** [CMakeFiles/onnxruntime.dir/build.make:168: libonnxruntime.so.1.16.3] Error 1
onnxruntime-aarch64-unknown-linux-gnu> make[1]: *** [CMakeFiles/Makefile2:1277: CMakeFiles/onnxruntime.dir/all] Error 2
onnxruntime-aarch64-unknown-linux-gnu> make[1]: *** Waiting for unfinished jobs....
onnxruntime-aarch64-unknown-linux-gnu> make[1]: *** [CMakeFiles/Makefile2:1365: CMakeFiles/onnxruntime_pybind11_state.dir/all] Error 2
onnxruntime-aarch64-unknown-linux-gnu> make: *** [Makefile:166: all] Error 2
error: builder for '/nix/store/91h4j7fz7m8qknsfh841hrc9g2m3vcg7-onnxruntime-aarch64-unknown-linux-gnu-1.16.3.drv' failed with exit code 2;
       last 10 log lines:
       >       |          ^~~~~~~~~~~~~~~~~~~~~
       > compilation terminated.
       > make[2]: *** [CMakeFiles/onnxruntime_pybind11_state.dir/build.make:174: CMakeFiles/onnxruntime_pybind11_state.dir/build/source/onnxruntime/python/onnxruntime_pybind_sparse_tensor.cc.o] Error 1
       > /nix/store/hmjkyxmnr2nc6mbq3r8cwwg552j1vqaf-aarch64-unknown-linux-gnu-binutils-2.41/bin/aarch64-unknown-linux-gnu-ld: /nix/store/pl0jivzg5w84j8pbkpg9dmc3gg4rz7rd-protobuf-21.12/lib/libprotobuf-lite.so.3.21.12.0: error adding symbols: file in wrong format
       > collect2: error: ld returned 1 exit status
       > make[2]: *** [CMakeFiles/onnxruntime.dir/build.make:168: libonnxruntime.so.1.16.3] Error 1
       > make[1]: *** [CMakeFiles/Makefile2:1277: CMakeFiles/onnxruntime.dir/all] Error 2
       > make[1]: *** Waiting for unfinished jobs....
       > make[1]: *** [CMakeFiles/Makefile2:1365: CMakeFiles/onnxruntime_pybind11_state.dir/all] Error 2
       > make: *** [Makefile:166: all] Error 2
       For full logs, run 'nix log /nix/store/91h4j7fz7m8qknsfh841hrc9g2m3vcg7-onnxruntime-aarch64-unknown-linux-gnu-1.16.3.drv'.

I'll try playing around with the inputs a bit.
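
For anyone who wants to reproduce the first step without editing nixpkgs, the change described above (forcing onnxruntime_BUILD_UNIT_TESTS off instead of tying it to doCheck) could probably be expressed as an override. This is only a sketch: the `cross` binding and the use of `<nixpkgs>` are illustrative assumptions, and it relies on a later `-D` flag taking precedence over the one the package already passes. It does not address the numpy header or protobuf linking errors in the log above.

```nix
# Sketch only: an out-of-tree override, not the change made in nixpkgs.
let
  pkgs = import <nixpkgs> { };
  # Hypothetical shorthand for the cross package set used in this issue.
  cross = pkgs.pkgsCross.aarch64-multiplatform;
in
cross.onnxruntime.overrideAttrs (old: {
  # Skip the check phase entirely.
  doCheck = false;
  # Append the flag so it comes after whatever the package already sets;
  # the assumption is that the last -D for a variable wins.
  cmakeFlags = (old.cmakeFlags or [ ]) ++ [
    "-Donnxruntime_BUILD_UNIT_TESTS=OFF"
  ];
})
```

Saved as, say, override.nix (a hypothetical file name), this could be built with `nix-build override.nix`.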

autrimpo commented 5 months ago

Just a quick update on this: I tried moving numpy and protobuf around to no avail, and their current location is probably where they should be according to the docs.

SomeoneSerge commented 5 months ago

How about explicitly adding the host's `-I${numpy}/${python.sitePackages}/numpy/core/include/` (et cetera) to `NIX_CFLAGS_COMPILE`?
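
A rough sketch of what that could look like as an override, untested. The `cross` and `py` bindings are illustrative assumptions, as is the guess that the package uses the default `python3Packages` set. If the derivation sets NIX_CFLAGS_COMPILE at the top level rather than under `env`, this would need adjusting, and the include path is the one quoted above, which may differ between numpy versions.

```nix
# Sketch only: pass the host platform's numpy headers to the compiler wrapper.
let
  pkgs = import <nixpkgs> { };
  # Hypothetical shorthands for the cross package set and its Python packages.
  cross = pkgs.pkgsCross.aarch64-multiplatform;
  py = cross.python3Packages;
in
cross.onnxruntime.overrideAttrs (old: {
  env = (old.env or { }) // {
    NIX_CFLAGS_COMPILE = toString [
      # Keep any flags the package already sets (assumed to live under env).
      (old.env.NIX_CFLAGS_COMPILE or "")
      "-I${py.numpy}/${py.python.sitePackages}/numpy/core/include"
    ];
  };
})
```

This only targets the missing numpy/arrayobject.h header. The "file in wrong format" error for libprotobuf-lite.so is a separate problem.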

jonringer commented 3 months ago

Sorry, I was banned for 6 weeks, unable to respond.

This is a bit tricky, as Python packages have a foot in two worlds: one in depsBuildHost and the other in depsHostHost. I'm not sure how Python packages should look when we are both compiling against them and using them for C-style linking.

jonringer commented 3 months ago

One initial thought would be to add an additional dev output, which should then get picked up by the default build and added to NIX_CFLAGS_COMPILE, so the headers would at least be available. Linking pre-built artifacts is a bit trickier (e.g. libnumpy.so, if it exists).