Closed astrojuanlu closed 1 year ago
@astrojuanlu Thanks for taking the lead on this!
The error is a bit baffling. Could you tell me which compiler version is being used to compile pagmo?
@astrojuanlu The tbb version is also probably very relevant.
@astrojuanlu also, using yum to satisfy the dependencies is gonna end up using very old versions of the packages
Thanks for the quick response @bluescarni! From the logs, the versions seem to be
* gcc 4.8.5 (`gcc-4.8.5-44.el7.x86_64`)
* tbb 4.1 (`tbb-devel-4.1-9.20130314.el7.x86_64`)

> @astrojuanlu also, using yum to satisfy the dependencies is gonna end up using very old versions of the packages

Yes. But, to my knowledge, to produce manylinux-compatible wheels, one has to stick to a given version of libc. I could potentially recompile newer versions of the dependencies, but for example using conda (as you do in your CI) is not an option.
> Thanks for the quick response @bluescarni! From the logs, the versions seem to be
> * gcc 4.8.5 (`gcc-4.8.5-44.el7.x86_64`)
> * tbb 4.1 (`tbb-devel-4.1-9.20130314.el7.x86_64`)

I don't think it's possible that GCC 4.8 is being used. pagmo uses C++17, which is supported since GCC 7. As far as I remember, in the manylinux 2014 image, an updated version of GCC is pre-installed.
> > @astrojuanlu also, using yum to satisfy the dependencies is gonna end up using very old versions of the packages
>
> Yes. But, to my knowledge, to produce manylinux-compatible wheels, one has to stick to a given version of libc. I could potentially recompile newer versions of the dependencies, but for example using conda (as you do in your CI) is not an option.

Yes, when we were supporting pip we used to build all the dependencies on our own via the manylinux toolchain. @darioizzo created a new docker image based on the manylinux one with the dependencies pre-compiled in order to reduce the load on the CI, but as far as I know this custom image has not been maintained for years and it is likely unusable at this time.
I was making a mistake: the yum-installed gcc is indeed 4.8.5, but the one in the path is 10.2.1:
[root@5d0f067706a1 build]# which g++
/opt/rh/devtoolset-10/root/usr/bin/g++
[root@5d0f067706a1 build]# g++ --version
g++ (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11)
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
I tried compiling pagmo2 locally inside a `quay.io/pypa/manylinux2014_x86_64:2022-10-25-fbea779` Docker image without `-j4` to see the traceback more clearly, and here it is:
[ 10%] Building CXX object CMakeFiles/pagmo.dir/src/island.cpp.o
/opt/rh/devtoolset-10/root/usr/bin/c++ -DBOOST_ALLOW_DEPRECATED_HEADERS -DNLOPT_DLL -Dpagmo_EXPORTS -I/pagmo2/include -I/pagmo2/build/include -isystem /usr/include/boost169 -isystem /usr/include/eigen3 -isystem /usr/include/coin -g -flto=auto -fno-fat-lto-objects -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -fdiagnostics-color=auto -ftemplate-depth=1024 -fdiagnostics-show-template-tree -Wno-attributes -Waddress-of-packed-member -Wall -Wextra -Wnon-virtual-dtor -Wlogical-op -Wconversion -Wdeprecated -Wold-style-cast -Wdisabled-optimization -ftemplate-backtrace-limit=0 -fstack-protector-all -Wodr -Wsuggest-final-types -Wsuggest-final-methods -Wsuggest-override -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -Wrestrict -Waligned-new -Wcast-align=strict -Wno-maybe-uninitialized -Wmismatched-tags -Wredundant-tags -pthread -std=c++17 -MD -MT CMakeFiles/pagmo.dir/src/island.cpp.o -MF CMakeFiles/pagmo.dir/src/island.cpp.o.d -o CMakeFiles/pagmo.dir/src/island.cpp.o -c /pagmo2/src/island.cpp
In file included from /usr/include/tbb/concurrent_queue.h:32,
from /pagmo2/src/island.cpp:57:
/usr/include/tbb/internal/_concurrent_queue_impl.h: In instantiation of ‘void tbb::strict_ppl::internal::micro_queue<T>::assign_and_destroy_item(void*, tbb::strict_ppl::internal::micro_queue<T>::page&, std::size_t) [with T = std::unique_ptr<pagmo::detail::task_queue>; tbb::strict_ppl::internal::micro_queue<T>::page = tbb::strict_ppl::internal::concurrent_queue_rep_base::page; std::size_t = long unsigned int]’:
/usr/include/tbb/internal/_concurrent_queue_impl.h:289:13: required from ‘bool tbb::strict_ppl::internal::micro_queue<T>::pop(void*, tbb::strict_ppl::internal::ticket, tbb::strict_ppl::internal::concurrent_queue_base_v3<T>&) [with T = std::unique_ptr<pagmo::detail::task_queue>; tbb::strict_ppl::internal::ticket = long unsigned int]’
/usr/include/tbb/internal/_concurrent_queue_impl.h:547:32: required from ‘bool tbb::strict_ppl::internal::concurrent_queue_base_v3<T>::internal_try_pop(void*) [with T = std::unique_ptr<pagmo::detail::task_queue>]’
/usr/include/tbb/concurrent_queue.h:116:38: required from ‘bool tbb::strict_ppl::concurrent_queue<T, A>::try_pop(T&) [with T = std::unique_ptr<pagmo::detail::task_queue>; A = tbb::cache_aligned_allocator<std::unique_ptr<pagmo::detail::task_queue> >]’
/pagmo2/src/island.cpp:256:29: required from here
/usr/include/tbb/internal/_concurrent_queue_impl.h:177:31: error: use of deleted function ‘std::unique_ptr<_Tp, _Dp>& std::unique_ptr<_Tp, _Dp>::operator=(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = pagmo::detail::task_queue; _Dp = std::default_delete<pagmo::detail::task_queue>]’
177 | *static_cast<T*>(dst) = from;
| ~~~~~~~~~~~~~~~~~~~~~~^~~~~~
In file included from /opt/rh/devtoolset-10/root/usr/include/c++/10/memory:83,
from /opt/rh/devtoolset-10/root/usr/include/c++/10/thread:44,
from /opt/rh/devtoolset-10/root/usr/include/c++/10/future:39,
from /pagmo2/src/island.cpp:34:
/opt/rh/devtoolset-10/root/usr/include/c++/10/bits/unique_ptr.h:469:19: note: declared here
469 | unique_ptr& operator=(const unique_ptr&) = delete;
| ^~~~~~~~
In file included from /usr/include/tbb/concurrent_queue.h:32,
from /pagmo2/src/island.cpp:57:
/usr/include/tbb/internal/_concurrent_queue_impl.h: In instantiation of ‘void tbb::strict_ppl::internal::micro_queue<T>::copy_item(tbb::strict_ppl::internal::micro_queue<T>::page&, std::size_t, const void*) [with T = std::unique_ptr<pagmo::detail::task_queue>; tbb::strict_ppl::internal::micro_queue<T>::page = tbb::strict_ppl::internal::concurrent_queue_rep_base::page; std::size_t = long unsigned int]’:
/usr/include/tbb/internal/_concurrent_queue_impl.h:261:18: required from ‘void tbb::strict_ppl::internal::micro_queue<T>::push(const void*, tbb::strict_ppl::internal::ticket, tbb::strict_ppl::internal::concurrent_queue_base_v3<T>&) [with T = std::unique_ptr<pagmo::detail::task_queue>; tbb::strict_ppl::internal::ticket = long unsigned int]’
/usr/include/tbb/internal/_concurrent_queue_impl.h:478:25: required from ‘void tbb::strict_ppl::internal::concurrent_queue_base_v3<T>::internal_push(const void*) [with T = std::unique_ptr<pagmo::detail::task_queue>]’
/usr/include/tbb/concurrent_queue.h:109:28: required from ‘void tbb::strict_ppl::concurrent_queue<T, A>::push(const T&) [with T = std::unique_ptr<pagmo::detail::task_queue>; A = tbb::cache_aligned_allocator<std::unique_ptr<pagmo::detail::task_queue> >]’
/pagmo2/src/island.cpp:288:50: required from here
/usr/include/tbb/internal/_concurrent_queue_impl.h:167:9: error: use of deleted function ‘std::unique_ptr<_Tp, _Dp>::unique_ptr(const std::unique_ptr<_Tp, _Dp>&) [with _Tp = pagmo::detail::task_queue; _Dp = std::default_delete<pagmo::detail::task_queue>]’
167 | new( &get_ref(dst,index) ) T(*static_cast<const T*>(src));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /opt/rh/devtoolset-10/root/usr/include/c++/10/memory:83,
from /opt/rh/devtoolset-10/root/usr/include/c++/10/thread:44,
from /opt/rh/devtoolset-10/root/usr/include/c++/10/future:39,
from /pagmo2/src/island.cpp:34:
/opt/rh/devtoolset-10/root/usr/include/c++/10/bits/unique_ptr.h:468:7: note: declared here
468 | unique_ptr(const unique_ptr&) = delete;
| ^~~~~~~~~~
make[2]: *** [CMakeFiles/pagmo.dir/src/island.cpp.o] Error 1
make[2]: Leaving directory `/pagmo2/build'
make[1]: *** [CMakeFiles/pagmo.dir/all] Error 2
make[1]: Leaving directory `/pagmo2/build'
make: *** [all] Error 2
(pasting screenshot for coloring)
(Happy to move this conversation to https://github.com/esa/pagmo2/issues if it helps)
Interestingly, the error seems to come from here https://github.com/esa/pagmo2/blob/e120b853592c13fdb40145633bfa22896058c3e7/src/island.cpp#L34 ... puzzling indeed.
Scratch that, it's in tbb.
@astrojuanlu I think this error is due to the old TBB version in use. Can you try to install a new version manually in your environment?
Fails: https://github.com/oneapi-src/oneTBB/issues/950 getting deeper into the rabbit hole, it seems...
@astrojuanlu sorry posted on the TBB issue while I meant to post here... Copying the message below.
Can you try perhaps with a stable version rather than the git head?
The conda package for pagmo compiled fine with this TBB version:
(TBB version 2021.6.0)
@astrojuanlu you may still find use for our old docker file https://github.com/esa/manylinux_x86_64_with_deps/blob/master/Dockerfile2010 with the specific instructions to compile all necessary deps (as of a few years back)
Looks like `-DCMAKE_CXX_FLAGS=-DTBB_ALLOCATOR_TRAITS_BROKEN` did the trick for TBB (https://github.com/oneapi-src/oneTBB/issues/950#issuecomment-1303608144). Pushed a new commit, please approve the workflows.
Nice, it compiled pagmo2! 🎉 Now it failed because the headers of Python 3 are not available, but this is promising I think. It needed half an hour though, which is a good indication that we should create a Docker image with the precompiled dependencies.
I'll continue with this in the coming days.
I think if you want to create a docker image with deps and upload it to the pagmo docker you could just add it to the repo linked above via a new Dockerfile (and remove the old ones)?
> Nice, it compiled pagmo2! 🎉 Now it failed because the headers of Python 3 are not available, but this is promising I think. It needed half an hour though, which is a good indication that we should create a Docker image with the precompiled dependencies.
> I'll continue with this in the coming days.

Nice progress!
Regarding the precompiled docker image, I am ok with it if someone volunteers to keep it up to date going forward (meaning essentially that the docker image should regularly be rebuilt in sync with upstream manylinux 2014).
Otherwise, it's better IMO to just use the vanilla manylinux image and accept the runtime cost of rebuilding all dependencies each time. Better this than having a custom image that bitrots over time.
After a few iterations I didn't manage to get a helpful answer for https://github.com/oneapi-src/oneTBB/issues/950, so until I have a better one I'll keep the `-DCMAKE_CXX_FLAGS=-DTBB_ALLOCATOR_TRAITS_BROKEN` trick. Will try to push this a bit more in the coming days.
The current blocker is that CMake is not finding Python 3 in the image.
I think I understood that the Docker image should only contain the dependencies, but I'm hitting a wall trying to understand how to fit the bespoke pygmo2 build process into cibuildwheel. I opened a discussion upstream seeking help: https://github.com/pypa/cibuildwheel/discussions/1368
@astrojuanlu have you looked into https://scikit-build.readthedocs.io/en/latest/skbuild.html ? This is a bridge between CMake and the Python `setup.py` machinery, and in my experience it works decently well.
Indeed, I was aware of it but was hoping that I could avoid rewriting the pygmo2 build system. Looks like it's the most sensible path forward though according to the feedback I was given, so I will give it a try.
> Indeed, I was aware of it but was hoping that I could avoid rewriting the pygmo2 build system. Looks like it's the most sensible path forward though according to the feedback I was given, so I will give it a try.

But the point of using skbuild is precisely that CMake is used as the build system and you need a handful (<20 IME) of lines in setup.py to make it work...
I managed to build manylinux wheels locally for Python 3.6, 3.7, 3.8, 3.9, and 3.10 using `cibuildwheel` and this branch. Took 11 minutes on my computer. The file size is too big to attach them here, so here's the output of `auditwheel` on the 3.10 one as well as the bundled libs for proof:
juanlu@valinor ~/P/A/pygmo2 (build-wheels-ci)> auditwheel show wheelhouse/pygmo-2.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
pygmo-2.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
is consistent with the following platform tag:
"manylinux_2_17_x86_64".
The wheel references external versioned symbols in these
system-provided shared libraries: libdl.so.2 with versions
{'GLIBC_2.2.5'}, libgcc_s.so.1 with versions {'GCC_3.0', 'GCC_4.3.0',
'GCC_3.3.1', 'GCC_3.3', 'GCC_4.2.0', 'GCC_3.4'}, libc.so.6 with
versions {'GLIBC_2.16', 'GLIBC_2.10', 'GLIBC_2.2.5', 'GLIBC_2.14',
'GLIBC_2.8', 'GLIBC_2.3', 'GLIBC_2.17', 'GLIBC_2.7', 'GLIBC_2.11',
'GLIBC_2.4', 'GLIBC_2.3.4', 'GLIBC_2.3.2', 'GLIBC_2.6'},
libstdc++.so.6 with versions {'GLIBCXX_3.4.14', 'GLIBCXX_3.4.5',
'GLIBCXX_3.4.9', 'CXXABI_1.3', 'CXXABI_1.3.5', 'GLIBCXX_3.4.19',
'GLIBCXX_3.4.15', 'GLIBCXX_3.4.11', 'CXXABI_1.3.2', 'GLIBCXX_3.4',
'CXXABI_1.3.3', 'GLIBCXX_3.4.18'}, libpthread.so.0 with versions
{'GLIBC_2.3.4', 'GLIBC_2.2.5'}, librt.so.1 with versions
{'GLIBC_2.2.5'}, libm.so.6 with versions {'GLIBC_2.2.5'},
libquadmath-96973f99.so.0.0.0 with versions {'QUADMATH_1.0'},
libgfortran-91cc3cb1.so.3.0.0 with versions {'GFORTRAN_1.4',
'GFORTRAN_1.0'}, libgomp-a34b3233.so.1.0.0 with versions {'GOMP_1.0',
'OMP_1.0'}
This constrains the platform tag to "manylinux_2_17_x86_64". In order
to achieve a more compatible tag, you would need to recompile a new
wheel from source on a system with earlier versios of these
libraries, such as a recent manylinux image.
juanlu@valinor ~/P/A/pygmo2 (build-wheels-ci)> unzip -l wheelhouse/pygmo-2.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl | grep libs
0 2022-12-07 19:12 pygmo.libs/
4278408 2022-12-07 19:12 pygmo.libs/libtbb-f9968f44.so.12.7
583880 2022-12-07 19:12 pygmo.libs/libscotch-33f856f8.so.0.2
331448 2022-12-07 19:12 pygmo.libs/libboost_serialization-d59e3d34.so.1.69.0
879912 2022-12-07 19:12 pygmo.libs/libnlopt-ab075a7a.so.0.11.1
25456 2022-12-07 19:12 pygmo.libs/libesmumps-22790425.so.0.2
168192 2022-12-07 19:12 pygmo.libs/libgomp-a34b3233.so.1.0.0
1259664 2022-12-07 19:12 pygmo.libs/libgfortran-91cc3cb1.so.3.0.0
101688 2022-12-07 19:12 pygmo.libs/libpord-5-bdd48660.3.so
2221864 2022-12-07 19:12 pygmo.libs/libdmumps-5-6cd18ef7.3.so
247608 2022-12-07 19:12 pygmo.libs/libquadmath-96973f99.so.0.0.0
51384 2022-12-07 19:12 pygmo.libs/libmpiseq-5-904cc5e3.3.so
346784 2022-12-07 19:12 pygmo.libs/libmumps_common-5-eee566a3.3.so
16840 2022-12-07 19:12 pygmo.libs/libscotcherr-1b407653.so.0.2
483104 2022-12-07 19:12 pygmo.libs/libmetis-abd0e9f4.so.0
123159144 2022-12-07 19:12 pygmo.libs/libpagmo-e85e85fa.so.8.0
70992 2022-12-07 19:12 pygmo.libs/libbz2-a273e504.so.1.0.6
3062344 2022-12-07 19:12 pygmo.libs/libipopt-b07a9989.so.3.13.0
37325000 2022-12-07 19:12 pygmo.libs/libopenblas-r0-f650aae0.3.3.so
21576 2022-12-07 19:12 pygmo.libs/libscotchmetis-47ddb1b8.so.0.2
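For readers wondering how the `manylinux_2_17` tag in the `auditwheel` report relates to the libc constraint discussed earlier: `manylinux2014` is the legacy alias of `manylinux_2_17`, meaning glibc 2.17, so the wheel qualifies as long as no referenced `GLIBC_*` symbol version is newer than that. Here is a minimal sketch of that single check, using the symbol versions copied from the `libc.so.6` entry of the report above. The helper names are made up for illustration, and `auditwheel` itself checks far more than this (every external library, not just libc).

```python
# GLIBC symbol versions copied from the libc.so.6 entry of the auditwheel report.
LIBC_SYMBOLS = {
    "GLIBC_2.16", "GLIBC_2.10", "GLIBC_2.2.5", "GLIBC_2.14", "GLIBC_2.8",
    "GLIBC_2.3", "GLIBC_2.17", "GLIBC_2.7", "GLIBC_2.11", "GLIBC_2.4",
    "GLIBC_2.3.4", "GLIBC_2.3.2", "GLIBC_2.6",
}


def parse_symbol_version(symbol):
    """Turn 'GLIBC_2.3.4' into the comparable tuple (2, 3, 4)."""
    version = symbol.split("_", 1)[1]
    return tuple(int(part) for part in version.split("."))


def newest_glibc(symbols):
    """Return the highest glibc version referenced by the given symbols."""
    return max(parse_symbol_version(s) for s in symbols)


def fits_policy(symbols, policy_glibc=(2, 17)):
    """True if no symbol needs a glibc newer than the policy (major, minor)."""
    return newest_glibc(symbols)[:2] <= policy_glibc


if __name__ == "__main__":
    print("newest glibc symbol version:", newest_glibc(LIBC_SYMBOLS))
    print("fits manylinux_2_17:", fits_policy(LIBC_SYMBOLS))
```

The newest version referenced is 2.17, which is why the report says the wheel is constrained to `manylinux_2_17_x86_64` and cannot claim an older, more compatible tag.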
This is the command I used:
CIBW_BEFORE_BUILD='python -m pip install -v pybind11' CIBW_BUILD="cp3{6,7,8,9,10}-manylinux_x86_64" CIBW_ENVIRONMENT='CMAKE_ARGS="-DBOOST_INCLUDEDIR=/usr/include/boost169 -DBOOST_LIBRARYDIR=/usr/lib64/boost169" PIP_VERBOSE=1' CIBW_MANYLINUX_X86_64_IMAGE=astrojuanlu/manylinux2014_x86_64_pygmo2_deps:dev cibuildwheel --platform linux
As you can see, I'm using a Docker image `astrojuanlu/manylinux2014_x86_64_pygmo2_deps:dev` that holds all the dependencies. I'm committing its definition as well, inside `docker/Dockerfile`.
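The one-liner above packs four `CIBW_*` settings into environment variables. Since the braces sit inside double quotes, the shell passes the `CIBW_BUILD` pattern through unchanged; to my understanding cibuildwheel expands it itself. The sketch below spells the settings out as a dictionary and shows which build identifiers the pattern is meant to select. `expand_braces` and `cibw_environment` are illustrative helpers, not part of cibuildwheel, and the expansion only handles simple, non-nested brace groups.

```python
import re


def expand_braces(pattern):
    """Expand simple, non-nested {a,b,c} groups into a list of strings."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that any further brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def cibw_environment():
    """Environment variables equivalent to the one-line command in the thread."""
    cmake_args = (
        "-DBOOST_INCLUDEDIR=/usr/include/boost169 "
        "-DBOOST_LIBRARYDIR=/usr/lib64/boost169"
    )
    return {
        "CIBW_BEFORE_BUILD": "python -m pip install -v pybind11",
        "CIBW_BUILD": "cp3{6,7,8,9,10}-manylinux_x86_64",
        "CIBW_ENVIRONMENT": f'CMAKE_ARGS="{cmake_args}" PIP_VERBOSE=1',
        "CIBW_MANYLINUX_X86_64_IMAGE": "astrojuanlu/manylinux2014_x86_64_pygmo2_deps:dev",
    }


if __name__ == "__main__":
    env = cibw_environment()
    for identifier in expand_braces(env["CIBW_BUILD"]):
        print(identifier)
```

Merging the dictionary into a copy of `os.environ` and passing it as `env=` to `subprocess.run(["cibuildwheel", "--platform", "linux"])` should reproduce the command, assuming cibuildwheel and Docker are available.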
There are still CI failures because gh-113 is incomplete. That one should be finished first, although I found a few obscure test failures that blocked me.
Honestly I'm in full "I hate CMake" mode at this point - the biggest hurdle has been getting it to find the stuff. The last barrier I was hitting was with
find_package(Python3 REQUIRED COMPONENTS Interpreter Development)
which, interestingly, was succeeding for Python 3.6, but failing for everything else with this error:
CMake Error at /opt/_internal/pipx/venvs/cmake/lib/python3.9/site-packages/cmake/data/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
Could NOT find Python3 (missing: Python3_LIBRARIES Development
Development.Embed) (found version "3.7.15")
The documentation was not very helpful (what are `Development.Module` and `Development.Embed` anyway?) and it's not clear which sub-components are needed to build Pygmo for the uninitiated. As a result, I constrained the search a bit:
diff --git a/CMakeLists.txt b/CMakeLists.txt
index e80329e..9b06181 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -137,7 +137,7 @@ if(${pagmo_VERSION} VERSION_LESS ${_PYGMO_MIN_PAGMO_VERSION})
endif()
# python.
-find_package(Python3 REQUIRED COMPONENTS Interpreter Development)
+find_package(Python3 REQUIRED COMPONENTS Interpreter Development.Module)
message(STATUS "Python3 interpreter: ${Python3_EXECUTABLE}")
message(STATUS "Python3 installation directory: ${Python3_SITEARCH}")
and now it seems to be working, but it would be nice if somebody more knowledgeable had a look at the implications of this.
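On the `Development.Module` versus `Development.Embed` question: as far as the CMake `FindPython3` documentation goes, `Development.Module` looks for what is needed to build an extension module (essentially the headers), while `Development.Embed` looks for what is needed to embed an interpreter, which includes a linkable libpython. To my knowledge the interpreters in the manylinux images are built without a shared libpython, which would be consistent with the `missing: Python3_LIBRARIES` message. Below is a small sketch for inspecting a given interpreter from Python itself. `python_dev_info` is a made-up helper; it only reports what `sysconfig` records and does not replicate CMake's search.

```python
import sysconfig


def python_dev_info():
    """Summarise what this interpreter records about building against it."""
    return {
        # Directory that should hold Python.h, needed to compile extension modules.
        "include_dir": sysconfig.get_paths()["include"],
        # True if CPython was configured with --enable-shared (libpython as a shared library).
        "shared_libpython": bool(sysconfig.get_config_var("Py_ENABLE_SHARED")),
        # Library file name recorded at build time; the file itself may not be shipped,
        # and the value may be None on some platforms.
        "ldlibrary": sysconfig.get_config_var("LDLIBRARY"),
    }


if __name__ == "__main__":
    for key, value in python_dev_info().items():
        print(f"{key}: {value}")
```

Running this with each interpreter inside the image (they normally live under `/opt/python/`) would show which of them, if any, provide a libpython to link against.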
I'm staying away from this at least for a few days to catch up with other stuff.
Closing this as it's now solved in #117.
This uses cibuildwheel to build Python wheels and source distributions, and upload them to PyPI if proper credentials are given. cibuildwheel is the standard for building complicated wheels on CI for Python packages, and the resulting wheels are manylinux compliant. I'm using the `manylinux2014` standard because it has the newest images, to minimize the amount of packages I need to compile.
I got pretty far, but now the compilation of Pagmo fails on `island.cpp.o`:
If you could lend a hand with this, I'm happy to keep pushing for the PR, at least until we find some other bottleneck.
Notice that I didn't try to make the code pretty or reusable: essentially I took the instructions from `tools/` and rewrote them for CIBW using `yum` instead of conda.
Fix gh-102.