CMA-ES / libcmaes

libcmaes is a multithreaded C++11 library with Python bindings for high performance blackbox stochastic optimization using the CMA-ES algorithm for Covariance Matrix Adaptation Evolution Strategy

Segmentation fault (core dumped) #183

Open Boyuanyu01 opened 6 years ago

Boyuanyu01 commented 6 years ago

I have successfully compiled libcmaes and passed all the tests by running './test_functions -all'. However, when I compiled sample-code.cc from the examples folder with the following command:

g++ -fopenmp -std=gnu++11 -I /usr/local/include/eigen3 -I /home/boyuan/include/libcmaes -L/home/boyuan/lib -o sample_code sample-code.cc -lcmaes

and ran it with ./sample_code, I got 'Segmentation fault (core dumped)'.

Running it under gdb gives the following details:

Reading symbols from ./sample_code...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/boyuan/libcmaes/examples/sample_code
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
Eigen::internal::dense_assignment_loop<Eigen::internal::generic_dense_assignment_kernel<Eigen::internal::evaluator<Eigen::Matrix<double, -1, 1, 0, -1, 1> >, Eigen::internal::evaluator<Eigen::CwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::Matrix<double, -1, 1, 0, -1, 1> > >, Eigen::internal::assign_op<double, double>, 0>, 3, 0>::run ( kernel=) at /usr/local/include/eigen3/Eigen/src/Core/AssignEvaluator.h:416
416 kernel.template assignPacket<dstAlignment, srcAlignment, PacketType>(index);
(gdb) Quit

I wonder what causes this issue. Thanks.

beniz commented 6 years ago

Let me see if I understand this correctly: when you build the binaries with the libcmaes makefiles, they compile and run fine, but when you use your own compilation command line, your program crashes. Is this correct?

Boyuanyu01 commented 6 years ago

Yes, this is exactly my issue.

beniz commented 6 years ago

Try adding -mfma to your compile line:

g++ -fopenmp -std=gnu++11 -mfma -I/usr/local/include/eigen3 -I /home/boyuan/include/libcmaes -L/home/boyuan/lib -o sample_code sample-code.cc -lcmaes

My guess is that, since we compile the library with FMA (https://en.wikipedia.org/wiki/FMA_instruction_set) by default, you need to compile your program with the same flags.

Note that the flags used for building libcmaes can be retrieved from the build output.

It'd be better if we could pass these flags automatically via pkg-config for instance. Not sure how to do this.

Please let me know whether this solves your issue.
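
One way to check whether two builds agree on the SIMD instruction sets discussed above is to inspect the macros the compiler predefines. The following is a minimal sketch, assuming GCC or Clang, which define `__SSE2__`, `__AVX__`, `__AVX2__` and `__FMA__` when the corresponding options are active. The function name `simd_flags_summary` is hypothetical and is not part of libcmaes or Eigen.

```cpp
#include <string>

// Hypothetical helper, not part of libcmaes or Eigen.
// Lists the SIMD instruction sets enabled for this translation unit, based on
// macros that GCC and Clang predefine when -msse2, -mavx, -mavx2 or -mfma
// (or a -march=... option that implies them) is in effect.
std::string simd_flags_summary() {
    std::string summary;
#ifdef __SSE2__
    summary += "sse2 ";
#endif
#ifdef __AVX__
    summary += "avx ";
#endif
#ifdef __AVX2__
    summary += "avx2 ";
#endif
#ifdef __FMA__
    summary += "fma ";
#endif
    if (summary.empty()) {
        return "none";
    }
    summary.pop_back();  // drop the trailing space
    return summary;
}
```

Printing the result from a small program built once with the flags from the libcmaes build output and once with your own command line shows whether the two builds disagree, for example on `avx`. Eigen is header-only and selects its vectorized code paths from these same macros, so a disagreement here is a plausible, though not the only possible, source of the crash.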

Boyuanyu01 commented 6 years ago

No. After adding -mfma, I still encountered the same issue.

beniz commented 6 years ago

Try the other flags then; you can find them in the build trace.

colddie commented 4 years ago

Same issue here. Not sure how to solve this.

beniz commented 4 years ago

Hi, this was due to optimization flags used to build the library by default. Share your full build trace maybe.

colddie commented 4 years ago

Here is what I tried:

g++ -DHAVE_CONFIG_H -I/home/tsun/bin/libcmaes-master/install/include/libcmaes \
    -I/home/tsun/bin/eigen-eigen-323c052e1731/install/include/eigen3 \
    -Wall -Wextra -g -O3 -mavx -mfma -fopenmp -g -O2 -o testOptim testOptim.cxx \
    -L/home/tsun/bin/libcmaes-master/install/lib -lcmaes

I copied several flags from the ones used when compiling libcmaes.so. But when I run the compiled "testOptim", I still get the following:


cmaes...


INFO - CMA-ES / dim=10 / lambda=10 / sigma0=0.1 / mu=5 / mueff=3.41477 / c1=0.015255 / cmu=0.0231675 / tpa=0 / threads=80
INFO - iter=0 / evals=0 / f-value=nan / sigma=0.1 / last_iter=0
Segmentation fault (core dumped)

Valgrind also appears to fail because it did not receive a valid instruction, so I cannot tell exactly what happened.

Tao

beniz commented 4 years ago

Hi, try removing the -mfma and -mavx flags.

colddie commented 4 years ago

That did not help. This time I got more information when building an example by myself. Not sure if this was the actual reason that caused my previous trouble.

INFO - stopping criteria tolHistFun => frange=9.07430966327435e-13
best solution: best solution => f-value=2.12494800481354e-15 / fevals=2780 / sigma=1.33002959755518e-07 / iter=278 / elaps=19ms / x=-2.28450344450647e-08 7.24833162905444e-09 -4.02401057129716e-10 -1.19664793487404e-08 4.74314295740099e-09 9.03401322328255e-09 1.89276599964569e-08 2.94440526013685e-08 -8.19134245001643e-09 3.27690437732518e-09
optimization took 0.019 seconds
Error in `./testOptim': double free or corruption (out): 0x000000000066b040
======= Backtrace: =========
/lib64/libc.so.6(+0x81499)[0x2aaaaf2f4499]
/home/tsun/bin/libcmaes-master/install/lib/libcmaes.so.0(_ZN8libcmaes12CMASolutionsD2Ev+0x19)[0x2aaaab337e29]
./testOptim[0x410aaa]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x2aaaaf295445]
./testOptim[0x410d31]
======= Memory map: ========

This now looks more like this closed issue (https://github.com/beniz/libcmaes/issues/190), except that removing -mfma does not help.

beniz commented 4 years ago

Could you share your full libcmaes build log, along with any custom code you are running and its build log? It's not clear to me whether the error you are showing comes from your custom code or from one of the libcmaes examples.

colddie commented 4 years ago

Hi again,

I think everything works fine now. The problem was that I had built Eigen3 without -std=c++11 support. I am not sure why using only the downloaded header files did not work. Thank you for your response.

yasamoka commented 4 years ago

How did you build Eigen3? Eigen3 is a pure template library. http://eigen.tuxfamily.org/index.php?title=Main_Page

I am facing segmentation faults and I'm trying to understand what your solution to the problem was.

Thank you in advance!

EDIT:

I was compiling against the system Eigen 3.3.7. After compiling against 3.3.90 cloned from the repository, Debug mode worked, but Release still caused a segmentation fault.

I have OpenMP code in my program, and I am not sure whether that is what makes -mavx necessary. Adding the flag made no difference to the executable size, so it must have been included from the start.

Valgrind's output when compiling with -fopenmp -g -mavx:

==17719== Process terminating with default action of signal 11 (SIGSEGV)
==17719==  General Protection Fault
==17719==    at 0x121655: _mm256_load_pd (avxintrin.h:862)

This was a memory alignment issue. I had to recompile libcmaes in Release mode with -mavx, and it finally worked!
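
The fault in `_mm256_load_pd` reported by Valgrind is consistent with an aligned AVX load receiving a pointer that is not 32-byte aligned. Below is a minimal sketch of how such a pointer can be checked. Both helper names are hypothetical and not part of libcmaes or Eigen, and the 16-byte fallback assumes an SSE-only build.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not part of libcmaes or Eigen.

// True when `ptr` is a multiple of `alignment` bytes.
// `alignment` must be a power of two (for example 16 or 32).
bool is_aligned(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Alignment that aligned packet loads need under the flags of this
// translation unit: 32 bytes with AVX (_mm256_load_pd), and an assumed
// 16 bytes otherwise (SSE).
constexpr std::size_t simd_alignment_bytes() {
#ifdef __AVX__
    return 32;
#else
    return 16;
#endif
}
```

If the library and the program are compiled with different flags, one side may allocate buffers with 16-byte alignment while the other issues 32-byte aligned loads on them. Calling `is_aligned(data, simd_alignment_bytes())` on a buffer that crosses that boundary is one way to confirm the mismatch.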