Closed doug-walker closed 5 months ago
Thanks for the report @doug-walker, do you have an example of Linux distribution / version we could use to reproduce the issue (in Docker) or any recent version of the majors distro will do?
I went ahead and opened a PR #1933 to add generation of manylinux_2_28
Python wheels (alongside manylinux2014
), this can be tested from test PyPI with version 2.4.0dev0 https://test.pypi.org/project/opencolorio/. I'm not exactly sure how to confirm it is indeed using the new API, but it seems that this new wheel doesn't contain symbol related to the old string implementation (std::__cow_string::__cow_string*) which is probably a good sign.
I tried to reproduce the issue using some of the latest Linux distro available as Docker images but couldn't find any problem yet, I assume pip wouldn't allow the download of incompatible binary wheels anyway? I guess the main problem occurs when trying to use the dynamic library inside the wheel to link against code using a different ABI. Or when trying to import multiple packages interacting into a Python script with incompatible ABIs? To @doug-walker point about std::regex and the linked SO question, I guess some non trivial things may occur when trying to mix old and new ABI within a single process.
Tagging @JeanChristopheMorinPerso in case you have more clues, is this something that has been discussed in the ci wg?
Using manylinux_2_28
seems like the solution here. We could also force -D_GLIBCXX_USE_CXX11_ABI=1
in the CXXFLAGS
on top of using manylinux_2_28
just to be extra explicit. Though, I don't think it's necessary. Using manylinux_2_28
should be enough.
I assume pip wouldn't allow the download of incompatible binary wheels anyway?
Yep, installers (or pip) will only download wheels that is compatible the the installed glibc. Note that it's only looking at glibc's version and not if the wheel is compiled with the new CXX11 ABI or not. Pip only looks at the wheel name to determine that it should install, and the CXX ABI isn't encoded in the filename.
I'm not exactly sure how to confirm it is indeed using the new API
You can check if std::__cxx11
symbols are defined/present in the compiled libraries. See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html.
I guess the main problem occurs when trying to use the dynamic library inside the wheel to link against code using a different ABI.
This would worry me since users shouldn't link against the library in the wheels (libOpenColorIO.so
).
Or when trying to import multiple packages interacting into a Python script with incompatible ABIs?
Probably this yes. I'm guessing here though since I don't deal with this kind of deal very often. What I know is that we shouldn't mix both ABIs.
is this something that has been discussed in the ci wg?
Not that I'm aware of, at least not this GitHub issue. The vfxplatform and wg-ci are somewhat related but are different.
You can check if std::__cxx11 symbols are defined/present in the compiled libraries. See https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html.
Yes that's what I thought, but the current manylinux2014 wheel already has lots of std::cxx11 symbols beside the std::cow_string::__cow_string* ones, I'm not exactly sure why that is the case. At least the latter cow symbols disappeared now so I bet we are good!
Thanks for chiming in JC, looks like we are on the good track :)
Thank you Remi and Jean-Christophe! Once these are merged, I will try to have the person who reported the crash test again with the new binding.
We verified that Remi's new wheels fix the crash that was reported. Thank you to Shunji Yokozawa and Wayne Arnold for their help!
The current Linux wheels for PyPI are based on manylinux2014 and are built on CentOS 7, which means they use the old libstdc++ ABI. Therefore they don't conform to the VFX Reference Platform for CY2023 and CY2024.
It turns out that even if a library does not use C++ std types in its API, it still may cause crashes. In the case of OCIO, this is because of the way std::regex is implemented in the standard library. (OCIO uses std::regex in several internal functions.)
This may cause crashes when the OCIO Python bindings are used in applications built with the new libstdc++ ABI to comply with the current VFX Platform, or in Python on recent versions of Linux, where the new ABI is now the default.
We should update the wheels to include one of the more recent manylinux options.