boostorg / python

Boost.org python module
http://boostorg.github.io/python
Boost Software License 1.0

Eigen + Boost.Python + gcc-7 = crash #232

Open matpen opened 6 years ago

matpen commented 6 years ago

I am using Boost.Python to provide bindings for a scientific app which uses Eigen for linear algebra. I am experiencing a segfault with some Eigen types, and I managed to isolate the code that causes the crash.

Please consider the attached script which:

As you can see from the comments at the bottom of the script, the test builds, but leads to a crash when run. This does not happen with gcc-5.

Interestingly, I can reproduce the problem with Matrix2d and Vector2d, but it does not happen with Matrix3d or Vector3d (this is easy to verify by substituting the types in the attached script).

Now, while the problem could be in any of the 3 components (Eigen, Boost.Python, gcc-7), I decided to post an issue here first because the project we are building uses Eigen extensively, and I have never experienced any problem on the "C++ side", even when building with gcc-7.

I hope that someone with experience with the Boost.Python internals can help me shed some light on this.

Some more info about the build system, in case it might be useful:

stefanseefeld commented 6 years ago

Thanks for the report. So for avoidance of doubt: you tested with gcc-5, gcc-7, and gcc-8, but you only observe the crash with gcc-7 ? You mention different Boost versions. Does it crash with all of them ? If the crash happens with one compiler version only, perhaps it would be useful to try different optimization levels, to see whether they affect the behaviour. (The bug could be caused by undefined behaviour, which often triggers crashes only with aggressive optimization turned on.) Thanks,

matpen commented 6 years ago

Hi @stefanseefeld, thanks for the quick reaction.

So for avoidance of doubt: you tested with gcc-5, gcc-7, and gcc-8, but you only observe the crash with gcc-7 ?

The crash happens with both gcc-7 and gcc-8, whether building with -std=c++17 or -std=c++11.

You mention different Boost versions. Does it crash with all of them ?

That is correct: it crashes both with boost 1.58 and boost 1.66.

Sorry for not being completely clear in the issue description: You can see the full list of combinations I tried at the bottom of the attached script.

If the crash happens with one compiler version only, perhaps it would be useful to try different optimization levels, to see whether they affect the behaviour.

I was so busy fiddling with the various compiler versions, that I did not even think about this: good call. Here is what happens when changing the -On optimization flag:

What is your advice? Does it look more like a compiler problem, or a boost problem? I did see some warnings about the register keyword being deprecated (in the python 2.7 lib, not boost) but I cannot reproduce them right now...

Anyway, if you have further ideas about things to try, feel free to share: the test is ready to be run from my IDE, so it will be very quick to experiment.

stefanseefeld commented 6 years ago

Hi @matpen, I haven't had the time yet to look at your script (or code). But the general pattern of "works with some compilers but not with others" sounds familiar, so "undefined behaviour" was my first thought. (And the fact that the behaviour depends on optimization level further reinforces that suspicion. gcc-8 may just do more optimization even with -O1, triggering the crash even earlier than gcc-7.) I would thus suspect the problem to be in the code (either yours or Boost.Python), rather than the compiler. Note that nowadays there are a number of quite powerful free tools to validate source code against known sources of undefined behaviour. For example: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html. You may want to give that a try.

matpen commented 6 years ago

That was another good suggestion. I found the same tool (UndefinedBehaviorSanitizer) in gcc, and activated it with -fsanitize=undefined and -lubsan.

At runtime, I get the error constructor call on misaligned address, see output here. As mentioned earlier, this does not happen with Vector3d (but only with Vector2d).

As a final test, I tried the very same code in "C++ only": no such warnings are issued, and there is no crash. This would exclude problems with the compiler, with Eigen or with my code.

My guess at this point is that, due to some internal logic, the wrapping python object for Vector2d is generated in a wrong way. Appreciate any further ideas.
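One possible reading of the "misaligned address" report, offered here as a hedged illustration rather than a diagnosis: as far as Eigen's documentation on fixed-size vectorizable types describes it, Vector2d and Matrix2d typically require 16-byte alignment, while Vector3d and Matrix3d only need the alignment of a double. The sketch below does not use Eigen or Boost.Python. `Vec2Like`, `Vec3Like`, `is_aligned_for` and `eight_byte_aligned_only` are hypothetical names introduced only to show how an address can be acceptable for one type and misaligned for the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins, not Eigen types.
// Vec2Like mimics a 16-byte object that demands 16-byte alignment;
// Vec3Like mimics three plain doubles with ordinary alignment.
struct alignas(16) Vec2Like { double v[2]; };
struct Vec3Like { double v[3]; };

static_assert(alignof(Vec2Like) == 16, "stand-in must be over-aligned");

// True if the address p satisfies the alignment requirement of T.
template <typename T>
bool is_aligned_for(const void* p)
{
    assert(p != nullptr);
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Returns an address that is 8-byte aligned but deliberately
// not 16-byte aligned (offset 8 into a 16-byte-aligned buffer).
inline const void* eight_byte_aligned_only()
{
    alignas(16) static unsigned char buffer[32];
    return buffer + 8;
}
```

Under these assumptions, `is_aligned_for<Vec3Like>(eight_byte_aligned_only())` holds while the same check for `Vec2Like` fails, which would be consistent with a crash that appears for the 2d types only. Constructing a `Vec2Like` at such an address would be undefined behaviour, so the sketch only inspects the address.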

stefanseefeld commented 6 years ago

I'm glad to see we are making progress narrowing down possible causes of the issue. I'll try to look into this as soon as possible.

matpen commented 6 years ago

I am thankful for this! If you think I can be of any help, just let me know!

Speaking of which, some instructions to quickly reproduce my tests might be useful:

Bindings code:

#include <Eigen/Core>
#include <boost/python.hpp>

BOOST_PYTHON_MODULE(test)
{
    boost::python::class_<Eigen::Vector2d>("Vector2d")
        .def("__neg__", +[](const Eigen::Vector2d& self) -> Eigen::Vector2d { return -1 * self; })
        .def("Zero", +[]() -> Eigen::Vector2d { return Eigen::Vector2d::Zero(); })
        .staticmethod("Zero")
    ;
}

Python code:

import test
zero = test.Vector2d.Zero()
matrix = -zero # the call to __neg__() crashes

Equivalent C++ code (works):

#include <Eigen/Core>

int main()
{
    Eigen::Vector2d zero = Eigen::Vector2d::Zero();
    Eigen::Vector2d matrix = -zero;
}

stefanseefeld commented 3 years ago

I merged some changes to make sure by-value-stored objects use the correct alignment. It would be great if you could try the current develop branch to see whether that resolves the issues you were observing.
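For readers wondering what "correct alignment for by-value-stored objects" can look like in general, here is a minimal sketch of one common technique. It is not the change merged into Boost.Python, and `AlignedValueHolder` and `Vec2Like` are hypothetical names. The assumption is that the holder only has a raw byte buffer with no alignment guarantee, so it over-allocates by `alignof(T)` and uses `std::align` to find a suitable address before constructing the value with placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical stand-in for an over-aligned value type.
struct alignas(16) Vec2Like { double v[2]; };

// Hypothetical holder, not Boost.Python's actual class.
template <typename T>
class AlignedValueHolder
{
public:
    explicit AlignedValueHolder(const T& value)
    {
        void* p = bytes_;
        std::size_t space = sizeof(bytes_);
        // std::align returns the first address in the buffer that satisfies
        // alignof(T) and still leaves sizeof(T) bytes, or nullptr if none fits.
        void* aligned = std::align(alignof(T), sizeof(T), p, space);
        assert(aligned != nullptr);
        // Copy-construct the value at the aligned address.
        object_ = ::new (aligned) T(value);
    }

    ~AlignedValueHolder() { object_->~T(); }

    AlignedValueHolder(const AlignedValueHolder&) = delete;
    AlignedValueHolder& operator=(const AlignedValueHolder&) = delete;

    T& get() { return *object_; }

    // True if the stored object sits at an address suitable for T.
    bool is_aligned() const
    {
        return reinterpret_cast<std::uintptr_t>(object_) % alignof(T) == 0;
    }

private:
    // Extra alignof(T) bytes leave room to shift the object forward.
    unsigned char bytes_[sizeof(T) + alignof(T)];
    T* object_;
};
```

A holder built this way, for example `AlignedValueHolder<Vec2Like> holder(Vec2Like{{1.0, 2.0}});`, should report `holder.is_aligned()` as true wherever the holder itself happens to be placed. Declaring the buffer with `alignas(T)` is a simpler alternative when the enclosing object is itself allocated with sufficient alignment.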