ceres-solver / ceres-solver

A large scale non-linear optimization library
http://ceres-solver.org/

The program crashes when executing solve() #1041

Open CChyyyyyy opened 4 months ago

CChyyyyyy commented 4 months ago

When I call Solve(), the program aborts with a double free error and crashes. The backtrace is shown below. What could be causing this?

Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139762692249152) at ./nptl/pthread_kill.c:44
44        ./nptl/pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7f1d099fd640 (LWP 388526))]
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139762692249152) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=139762692249152) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139762692249152, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f1d2ce7f476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f1d2ce657f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007f1d2cec6676 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f1d2d018b77 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6  0x00007f1d2ceddcfc in malloc_printerr (str=str@entry=0x7f1d2d01b790 "double free or corruption (out)") at ./malloc/malloc.c:5664
#7  0x00007f1d2cedfe70 in _int_free (av=0x7f1d2d056c80 <main_arena>, p=0x3d50ef0, have_lock=<optimized out>) at ./malloc/malloc.c:4588
#8  0x00007f1d2cee2453 in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3391
#9  0x00000000012102de in ceres::internal::EigenSparseCholeskyTemplate<Eigen::SimplicialLDLT<Eigen::SparseMatrix<double, 0, int>, 2, Eigen::NaturalOrdering<int> > >::~EigenSparseCholeskyTemplate() ()
#10 0x000000000110a633 in ceres::internal::SparseNormalCholeskySolver::~SparseNormalCholeskySolver() ()
#11 0x000000000110a6c9 in ceres::internal::SparseNormalCholeskySolver::~SparseNormalCholeskySolver() ()
#12 0x00000000010179a6 in ceres::Solver::Solve(ceres::Solver::Options const&, ceres::Problem*, ceres::Solver::Summary*) ()
#13 0x0000000001018da9 in ceres::Solve(ceres::Solver::Options const&, ceres::Problem*, ceres::Solver::Summary*) ()

sandwichmaker commented 4 months ago

This is a crash inside Eigen's sparse Cholesky factorization routine.

Beyond that, it is hard to say why this crash is happening without knowing more about the structure of the problem you are solving.

Do you have a piece of code that replicates this crash?

CChyyyyyy commented 4 months ago

Thank you for your reply. This bug occurs very rarely, so I cannot currently provide code that reproduces it reliably. I am reviewing my own code but have not found any problems so far. Do you have any troubleshooting suggestions? Also, why does ~SparseNormalCholeskySolver() appear to call ~SparseNormalCholeskySolver() in the stack?

#10 0x000000000110a633 in ceres::internal::SparseNormalCholeskySolver::~SparseNormalCholeskySolver() ()
#11 0x000000000110a6c9 in ceres::internal::SparseNormalCholeskySolver::~SparseNormalCholeskySolver() ()

sandwichmaker commented 4 months ago

The destructor calling the destructor again does not seem right. This should not happen, and it looks like some kind of linking problem. Are you sure that you are not mixing multiple versions of Ceres Solver on your system?
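
One way to start checking for mixed versions is to list the shared objects actually loaded into the crashing process. The following is a minimal sketch, not part of Ceres: it assumes Linux with glibc (it relies on `dl_iterate_phdr` from `<link.h>`), and the names `MatchState`, `CollectMatch`, and `LoadedLibrariesMatching` are hypothetical helpers made up for this illustration. It only sees dynamically loaded libraries; a copy of Ceres linked statically into the executable will not show up here.

```cpp
// Sketch: list shared objects loaded into the current process whose path
// contains a given substring. Linux/glibc only (uses dl_iterate_phdr).
#include <link.h>

#include <cstddef>
#include <string>
#include <vector>

namespace {

// Hypothetical helper state passed through dl_iterate_phdr's void* argument.
struct MatchState {
  std::string needle;
  std::vector<std::string> matches;
};

// Called once per loaded object; returning 0 continues the iteration.
// The main executable is typically reported with an empty name.
int CollectMatch(struct dl_phdr_info* info, std::size_t /*size*/, void* data) {
  auto* state = static_cast<MatchState*>(data);
  const std::string name = info->dlpi_name ? info->dlpi_name : "";
  if (name.find(state->needle) != std::string::npos) {
    state->matches.push_back(name);
  }
  return 0;
}

}  // namespace

// Returns the paths of all loaded objects whose name contains `needle`.
// More than one "libceres" entry would suggest two copies in one process.
std::vector<std::string> LoadedLibrariesMatching(const std::string& needle) {
  MatchState state{needle, {}};
  dl_iterate_phdr(&CollectMatch, &state);
  return state.matches;
}
```

Calling `LoadedLibrariesMatching("libceres")` early in `main()` and logging the result shows which Ceres shared libraries, if any, the process picked up. Two or more distinct paths, or a path that differs from the installation you built against, would support the mixed-version theory.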