heal-research / operon

C++ Large Scale Genetic Programming
https://operongp.readthedocs.io
MIT License

double free or corruption (out) #38

Closed: foolnotion closed this issue 1 month ago

foolnotion commented 1 month ago

This still applies for me even when using the suggested flags. Things are fine with just one thread, though: Executor executor(1); The errors keep changing, however; sometimes I get "corrupted double-linked list" or "Signal: SIGSEGV (Segmentation fault)". I think it has to do with taskflow and ForwardPass on JacRev.

Originally posted by @ealione in https://github.com/heal-research/operon/issues/25#issuecomment-2387223596

foolnotion commented 1 month ago

Hi, could you post more information to help debug this issue:

operon revision
output of e.g. operon_gp --version or operon_nsgp --version
CPU model

Any additional details or a small script/code snippet to help reproduce this issue is appreciated.

Originally posted by @foolnotion in https://github.com/heal-research/operon/issues/25#issuecomment-2387711808

ealione commented 1 month ago

VERSION:
operon rev. 28ac41b Linux-6.8.0-45-generic x86_64, timestamp 1979-12-31T16:00:00Z
single-precision build using eigen 3.4.0, ceres n/a, taskflow 3.7.0
compiler: Clang 18.1.6, flags: -g -O3 -DNDEBUG

CPU: AMD Ryzen 7 7800X3D

I was testing a script similar to https://github.com/heal-research/operon/blob/main/test/source/implementation/poisson_regression.cpp

ealione commented 1 month ago

Apologies, this was my mistake: I had placed std::cout << trace in ForwardPass and forgot about it. I re-installed a clean copy and it works fine.

ealione commented 1 month ago

I'm not sure why me trying to print that value caused the issue but Operon works great!

foolnotion commented 1 month ago

Operon uses a number of dependencies which have to be compiled with the exact same flags in order to avoid errors like the ones you encountered. Could you try to build with optimizations turned on:

cmake -S . -B build-linux --preset build-linux -DCMAKE_BUILD_TYPE=Release
cmake --build build-linux -j
./build-linux/cli/operon_gp --version
operon rev. 9a15fba Release Linux-6.11.1 x86_64, timestamp 1980-01-01T01:00:00Z
single-precision build using eigen 3.4.0, ceres n/a, taskflow 3.8.0
compiler: Clang 18.1.6, flags: -g -O3 -DNDEBUG -Wall -Wextra -Werror -pedantic -fsized-deallocation -fno-math-errno -march=x86-64-v3

gkronber commented 1 month ago

> I'm not sure why me trying to print that value caused the issue but Operon works great!

Operon uses multiple threads. ForwardPass runs concurrently on multiple threads.

As you noted, the problem does not occur in a single-threaded configuration. Without checking it, I'd say writing to cout is not thread-safe and produced a race condition.
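If debug output is needed from code that runs on several threads at once, one option is to route it through a single lock. The following is a minimal sketch, not Operon code: LineLogger and RunWorkers are hypothetical names, and plain std::thread stands in for the taskflow executor. Each thread formats its message into a local buffer and hands one finished line to the logger.

```cpp
#include <cstddef>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper (not part of Operon): writes whole lines under a mutex
// so that output from different threads is never interleaved.
class LineLogger {
public:
    explicit LineLogger(std::ostream& out) : out_(out) {}

    void Write(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);  // one writer at a time
        out_ << line << '\n';
    }

private:
    std::ostream& out_;
    std::mutex mutex_;
};

// Stand-in for a function that is executed concurrently, the way ForwardPass
// is described in this thread. The message text is made up for illustration.
void RunWorkers(LineLogger& logger, int numThreads, int linesPerThread) {
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(numThreads));
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&logger, t, linesPerThread] {
            for (int i = 0; i < linesPerThread; ++i) {
                std::ostringstream msg;  // buffer local to this thread
                msg << "thread " << t << " trace " << i;
                logger.Write(msg.str());
            }
        });
    }
    for (auto& w : workers) { w.join(); }
}
```

Constructing the logger with std::cout (for example LineLogger logger(std::cout); RunWorkers(logger, 4, 10);) prints complete lines in an arbitrary order. The lock only protects the stream itself; the value being printed must also not be modified by another thread while it is being formatted.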

foolnotion commented 1 month ago

The interpreter is not thread-safe and should never be reused between threads. It is a very light object which should be initialized anew for every tree.
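The pattern described here (a cheap object with mutable internal state, created anew for each unit of work instead of being shared) can be sketched as follows. This is an illustration under stated assumptions only: ToyInterpreter and EvaluateAll are hypothetical names, a "tree" is reduced to a vector of numbers, and std::thread replaces the taskflow executor. It does not reproduce Operon's actual interpreter API.

```cpp
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Toy stand-in for an interpreter that keeps mutable scratch state.
// Hypothetical type; sharing one instance across threads would be a data race.
struct ToyInterpreter {
    std::vector<double> scratch;  // working buffer overwritten on every call

    double Evaluate(const std::vector<double>& tree) {
        scratch.assign(tree.begin(), tree.end());
        for (auto& v : scratch) { v *= 2.0; }  // placeholder "computation"
        return std::accumulate(scratch.begin(), scratch.end(), 0.0);
    }
};

// Evaluates all trees on numThreads threads. Every tree gets its own
// interpreter instance, so no interpreter state crosses a thread boundary.
std::vector<double> EvaluateAll(const std::vector<std::vector<double>>& trees,
                                std::size_t numThreads) {
    const std::size_t n = numThreads == 0 ? 1 : numThreads;  // guard against 0
    std::vector<double> results(trees.size(), 0.0);
    std::vector<std::thread> workers;
    workers.reserve(n);
    for (std::size_t t = 0; t < n; ++t) {
        workers.emplace_back([&trees, &results, t, n] {
            // Strided partition: each index is written by exactly one thread.
            for (std::size_t i = t; i < trees.size(); i += n) {
                ToyInterpreter interpreter;  // fresh instance for this tree
                results[i] = interpreter.Evaluate(trees[i]);
            }
        });
    }
    for (auto& w : workers) { w.join(); }
    return results;
}
```

Because the object is light, constructing it inside the loop costs little compared with the evaluation itself, and it removes the need for any locking around the interpreter.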