google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0
11.14k stars 2.11k forks source link

Incompatibility with Easylogging++ - segmentation fault #3298

Closed ceandrade closed 1 year ago

ceandrade commented 2 years ago

Version: 9.3 both pre-compiled packages and compiled source code Language: C++, GCC 10, Clang 12.0.5 OS: MacOS X 11.6.5 , Mint Linux 20.4, CentOS 7. Using: CP-SAT

I have a very large project/codebase that uses Easylogging++ as the main logging tool. WHen using Google OR-Tools and compiling with optimization flags (-O2 or -O3), the execution results in a segmentation fault.

Suppose we have this very simple code:

#include "third_part/easylogging++/easylogging++.h"
#include "ortools/sat/cp_model.h"
#include "ortools/sat/cp_model_solver.h"

#include <iostream>

using namespace operations_research;
using namespace operations_research::sat;

INITIALIZE_EASYLOGGINGPP
int main(int argc, char* argv[]) {
    START_EASYLOGGINGPP(argc, argv);

    CpModelBuilder cp_model;

    const Domain domain = Domain::FromValues({0, 1, 2, 3, 4, 5, 6, 7});

    auto var1 = cp_model.NewIntVar(domain);
    auto var2 = cp_model.NewIntVar(domain);
    auto var3 = cp_model.NewIntVar(domain);

    cp_model.AddNotEqual(var1, var2);
    cp_model.AddNotEqual(var1, var3);
    cp_model.AddNotEqual(var2, var3);

    SatParameters parameters;
    parameters.set_random_seed(1234);
    Model model;
    model.Add(NewSatParameters(parameters));

    const CpSolverResponse response = SolveCpModel(cp_model.Build(), &model);

    std::cout << CpSolverResponseStats(response) << std::endl;

    return 0;
}

We can compile using something like:

g++ -std=c++17 -O2 -ggdb3 -fno-omit-frame-pointer -pthread -Wall -Wextra -Wcast-align -Wcast-qual \
    -Wdisabled-optimization -Wformat=2 -Winit-self -Wmissing-format-attribute -Wshadow -Wpointer-arith \
    -Wredundant-decls -Wstrict-aliasing=2 -Wfloat-equal -Weffc++ -m64 -fPIC -pthread -fexceptions \
    -I. -I/home/users/tools/or-tools/v9.3/include \
    -fPIC -DARCH_K8 -Wno-deprecated -DUSE_BOP -DUSE_GLOP -DUSE_PDLP -DUSE_CBC -DUSE_CLP -DUSE_SCIP \
    -c test/or_bug.cpp \
    -o test/or_bug.o

The compilation of this code results in several issues, as following, but it compiles:

In file included from test/or_bug.cpp:3:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/cp_model.h:53:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/cp_model_solver.h:23:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/model.h:29:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/base/logging.h:37:
/Users/users/tools/or-tools/v9.3/include/ortools/base/vlog_is_on.h:29:9: warning: 'VLOG_IS_ON' macro redefined [-Wmacro-redefined]
#define VLOG_IS_ON(verboselevel)                                        \
        ^
./third_part/easylogging++/easylogging++.h:3909:9: note: previous definition is here
#define VLOG_IS_ON(verboseLevel) (ELPP->vRegistry()->allowed(verboseLevel, __FILE__))
        ^
In file included from test/or_bug.cpp:3:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/cp_model.h:53:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/cp_model_solver.h:23:
In file included from /Users/users/tools/or-tools/v9.3/include/ortools/sat/model.h:29:
/Users/users/tools/or-tools/v9.3/include/ortools/base/logging.h:420:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
#define LOG(severity) COMPACT_GOOGLE_LOG_##severity.stream()
        ^
...
[several other warnings]
...

You can see that this is very problematic and it is almost sure the source of the problem. Obviously, the warning depends on the headers' order, i.e., if you add the Easylogging header after Google OR-tools headers, Easylogging will redefine the macros of the latter.

This is a trace using GDB:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5f76613 in std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) backtrace 
#0  0x00007ffff5f76613 in std::_Rb_tree_increment(std::_Rb_tree_node_base const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007ffff6fd61ff in google::protobuf::EncodedDescriptorDatabase::DescriptorIndex::EnsureFlat() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#2  0x00007ffff6fd70fc in google::protobuf::EncodedDescriptorDatabase::DescriptorIndex::FindFile(google::protobuf::stringpiece_internal::StringPiece) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#3  0x00007ffff6fd7280 in google::protobuf::EncodedDescriptorDatabase::FindFileByName(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, google::protobuf::FileDescriptorProto*) ()
   from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#4  0x00007ffff6f9e836 in google::protobuf::DescriptorPool::TryFindFileInFallbackDatabase(google::protobuf::stringpiece_internal::StringPiece) const () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#5  0x00007ffff6f9eb7c in google::protobuf::DescriptorPool::FindFileByName(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#6  0x00007ffff6ff4a7a in google::protobuf::(anonymous namespace)::AssignDescriptorsImpl(google::protobuf::internal::DescriptorTable const*, bool) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#7  0x00007ffff5e844df in __pthread_once_slow (once_control=0x7ffff7fb0dd4 <descriptor_table_ortools_2fsat_2fsat_5fparameters_2eproto_once>, init_routine=0x7ffff5f8a4c0 <__once_proxy>) at pthread_once.c:116
#8  0x00007ffff6fe9b8b in google::protobuf::internal::AssignDescriptors(google::protobuf::internal::DescriptorTable const*, bool) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#9  0x00007ffff66bb626 in operations_research::sat::SatParameters_RestartAlgorithm_descriptor() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#10 0x00007ffff6e814e2 in operations_research::sat::RestartPolicy::Reset() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#11 0x00007ffff6e9fe47 in operations_research::sat::SatSolver::SatSolver(operations_research::sat::Model*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#12 0x00007ffff6e638d0 in operations_research::sat::LoadModelForProbing(operations_research::sat::PresolveContext*, operations_research::sat::Model*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#13 0x00007ffff6c881b0 in operations_research::sat::CpModelPresolver::Probe() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#14 0x00007ffff6cbe5e8 in operations_research::sat::CpModelPresolver::Presolve() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#15 0x00007ffff6cbeeb3 in operations_research::sat::PresolveCpModel(operations_research::sat::PresolveContext*, std::vector<int, std::allocator<int> >*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#16 0x00007ffff6cfda0b in operations_research::sat::SolveCpModel(operations_research::sat::CpModelProto const&, operations_research::sat::Model*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#17 0x0000555555562af5 in main (argc=<optimized out>, argv=<optimized out>) at /home/users/tools/or-tools/v9.3/include/ortools/sat/cp_model.h:1075

Valgrind catches for Easylogging++ side:

==2290== Process terminating with default action of signal 11 (SIGSEGV)
==2290==  Access not within mapped region at address 0x20
==2290==    at 0x122B7A: safeDelete<el::base::RegisteredHitCounters> (easylogging++.h:865)
==2290==    by 0x122B7A: el::base::Storage::~Storage() (easylogging++.cc:2127)
==2290==    by 0x122E15: el::base::Storage::~Storage() (easylogging++.cc:2132)
==2290==    by 0x1652F2: _M_release (shared_ptr_base.h:158)
==2290==    by 0x1652F2: ~__shared_count (shared_ptr_base.h:733)
==2290==    by 0x1652F2: ~__shared_ptr (shared_ptr_base.h:1183)
==2290==    by 0x1652F2: std::shared_ptr<el::base::Storage>::~shared_ptr() (shared_ptr.h:121)
==2290==    by 0x69E88A6: __run_exit_handlers (exit.c:108)
==2290==    by 0x69E8A5F: exit (exit.c:139)
==2290==    by 0x69C6089: (below main) (libc-start.c:342)

The last traces were for the simple example. In my application, the issues propagate to other parts:

Program received signal SIGSEGV, Segmentation fault.
operations_research::sat::Model::Delete<operations_research::TimeLimit>::~Delete (this=0x5555556d6d40, __in_chrg=<optimized out>) at /home/users/tools/or-tools/v9.3/include/ortools/sat/model.h:209
209     ~Delete() override = default;
(gdb) backtrace 
#0  operations_research::sat::Model::Delete<operations_research::TimeLimit>::~Delete (this=0x5555556d6d40, __in_chrg=<optimized out>) at /home/users/tools/or-tools/v9.3/include/ortools/sat/model.h:209
#1  0x00005555555eb5a4 in std::default_delete<operations_research::sat::Model::DeleteInterface>::operator() (__ptr=<optimized out>, this=0x5555556ec968) at /usr/include/c++/10/bits/unique_ptr.h:79
#2  std::__uniq_ptr_impl<operations_research::sat::Model::DeleteInterface, std::default_delete<operations_research::sat::Model::DeleteInterface> >::reset (__p=0x0, this=0x5555556ec968) at /usr/include/c++/10/bits/unique_ptr.h:182
#3  std::unique_ptr<operations_research::sat::Model::DeleteInterface, std::default_delete<operations_research::sat::Model::DeleteInterface> >::reset (__p=0x0, this=0x5555556ec968) at /usr/include/c++/10/bits/unique_ptr.h:456
#4  operations_research::sat::Model::~Model (this=0x7fffffff93a0, __in_chrg=<optimized out>) at /home/users/tools/or-tools/v9.3/include/ortools/sat/model.h:50
#5  0x00007ffff6c89182 in operations_research::sat::CpModelPresolver::Probe() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#6  0x00007ffff6cbe5e8 in operations_research::sat::CpModelPresolver::Presolve() () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#7  0x00007ffff6cbeeb3 in operations_research::sat::PresolveCpModel(operations_research::sat::PresolveContext*, std::vector<int, std::allocator<int> >*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#8  0x00007ffff6cfda0b in operations_research::sat::SolveCpModel(operations_research::sat::CpModelProto const&, operations_research::sat::Model*) () from /home/users/tools/or-tools/v9.3/lib/libortools.so.9
#9  0x00005555555eac2f in pci::feasibility::ortools::find_feasible_solutions_using_cp (instance=..., max_time=max_time@entry=12, max_num_solutions=<optimized out>, max_num_solutions@entry=10, seed=seed@entry=3429608289, 
    num_threads=num_threads@entry=1) at /home/users/tools/or-tools/v9.3/include/ortools/sat/cp_model.h:1075
#10 0x00005555555f3ac8 in pci::PCIOptimizer::run (this=0x7fffffffdbc8, maximum_time=<optimized out>) at pci/assignment_optimizer.cpp:183
#11 0x000055555556a490 in main (argc=<optimized out>, argv=<optimized out>) at pci/main.cpp:343

One way to solve this problem is to drop Easylogging++. But this is really a hard sell because it is the de-facto logging this project uses. Many people will be upset to change to something else.

So, is there a way to disable or wrap the OR-Tools logging facilities? This would be the best option.

Now, to upset my fellow devs, I could argue to move to [Google Logging Library]((https://github.com/google/glog)? Is Google OR-Tools compatible in this case?

Thanks!

lperron commented 2 years ago

glog is deprecated. OR-Tools contains a re-implementation of glog, but on top of the abseil-cpp package. It seems that easylogging is also a reimplementation of easylogging++.

The path forward for us is to wait for glog to be fully re-implemented in abseil-cpp (it is only done partially), then we will drop on glog reimplementation.

You could try moving from easylogging to the ortools/base/logging.h. The main diff I see is the parsing of the parameters, which is done using the abseil flags module. See https://github.com/google/or-tools/blob/stable/examples/cpp/jobshop_sat.cc#L876

Or you can try encapsulating or-tools to never be included alongside easylogging.

Now, the crash may not be related to easylogging interaction.

Mizux commented 2 years ago

0) Unfortunately Easylogging++ and base logging use the same macro name => conflict/undefined behaviour.

1) Glog is no more maintained/use by Google and is currently maintained by the community. so If I were you...

2) you may try to hack ortools base logging to reuse easylogging++ if semantic is identical aka logging.h become a proxy to include easylogging header etc...

3) base/logging is just an ugly hack of Glog BUT using absl::flags instead of gflags (which is also dead and already replaced by absl::flags)

4) We are waiting for absl::logging literally for years (2019 IIRC) but maybe 2022 will be the year ! So base/logging may be wipeout in favor of abs::logging in a near future...

Some ref:

Mizux commented 2 years ago

I fear abseil-cpp:log will also contains conflicting macro with easylogging++ so the migration won't fix conflicts currently... https://github.com/abseil/abseil-cpp/search?q=VLOG_IS_ON

sschnug commented 1 year ago

I fear abseil-cpp:log will also contains conflicting macro with easylogging++ so the migration won't fix conflicts currently... https://github.com/abseil/abseil-cpp/search?q=VLOG_IS_ON

Did something change here? The search returns zero results which is probably not what you saw?

I guess "commit: Remove internal VLOG_xxx macros" might change something?

lperron commented 1 year ago

Some news. I pushed the code on main that uses the new abseil, as well as implement the VLOG macros. (see ortools/base/vlog.h).

At this point, I will close the bug as the way forward is just to wait for vlog to be implemented in abseil to remove our code.