google / or-tools

Google's Operations Research tools:
https://developers.google.com/optimization/
Apache License 2.0

std:bad_alloc in error inside c++ libraries #840

Closed billywhizz closed 5 years ago

billywhizz commented 6 years ago

Hi,

We have a reproducible issue where we get a std::bad_alloc error when using the or-tools Python modules to optimize a schedule for an internal system. This is the error we are seeing; it happens inside the C++ libraries:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Fatal Python error: Aborted

Current thread 0x00007f0108ef1700 (most recent call first):
  File "/usr/local/lib/python3.6/site-packages/ortools/sat/python/cp_model.py", line 1133 in SolveWithSolutionObserver
  File "sched.py", line 369 in plan_events
  File "sched.py", line 378 in <module>
Fatal Python error: Segmentation fault

Current thread 0x00007f0108ef1700 (most recent call first):
  File "/usr/local/lib/python3.6/site-packages/ortools/sat/python/cp_model.py", line 1133 in SolveWithSolutionObserver
  File "sched.py", line 369 in plan_events
  File "sched.py", line 378 in <module>

There is a Dockerfile with all necessary files to reproduce the issue here: https://gist.github.com/billywhizz/f3763adcae353aae18846c1e6d3b31cd

Just run this to reproduce:

docker build -t ortools-test .
docker run -it --rm ortools-test

I have also tried an Ubuntu 18.04 base image and Python 3.7, but I see exactly the same issue. I am not sure how best to debug further.
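Because the abort happens in native code, a Python `try`/`except` around the solve call cannot catch it. One way to keep a calling process alive and see how the solve ended is to run it in a child process. The following is a minimal sketch under that assumption; `run_isolated` is a hypothetical helper name, not part of or-tools.

```python
import signal
import subprocess
import sys


def run_isolated(argv, timeout=None):
    """Run a command in a child process and report how it ended.

    Returns "ok", "exit <code>" or "signal <NAME>".
    """
    proc = subprocess.run(argv, timeout=timeout)
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return "signal " + signal.Signals(-proc.returncode).name
    return "exit %d" % proc.returncode
```

For the script in the traceback above, the call would look like `run_isolated([sys.executable, "sched.py"])`. An uncaught C++ exception ends in `abort()`, so the expected result for the failing run is a signal rather than an exit code.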

billywhizz commented 6 years ago

More info from a debug backtrace of the dump file:

/usr/local/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
  return f(*args, **kwds)
2018-08-29 21:11:36.267432+00:00
Events = 5456
Horizon = 33049
solving
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Fatal Python error: Aborted

Current thread 0x00007fd525c65700 (most recent call first):
  File "/usr/local/lib/python3.6/site-packages/ortools/sat/python/cp_model.py", line 1133 in SolveWithSolutionObserver
  File "sched.py", line 369 in plan_events
  File "sched.py", line 378 in <module>
Aborted (core dumped)

#0  0x00007fd52584575b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  <signal handler called>
#2  0x00007fd524db8067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#3  0x00007fd524db9448 in __GI_abort () at abort.c:89
#4  0x00007fd50f820b3d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007fd50f81ebb6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fd50f81ec01 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fd50f81ee19 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007fd50f81f339 in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fd509c8aa2e in void std::vector<operations_research::sat::IntegerLiteral, std::allocator<operations_research::sat::IntegerLiteral> >::_M_range_insert<operations_research::sat::IntegerLiteral const*>(__gnu_cxx::__normal_iterator<operations_research::sat::IntegerLiteral*, std::vector<operations_research::sat::IntegerLiteral, std::allocator<operations_research::sat::IntegerLiteral> > >, operations_research::sat::IntegerLiteral const*, operations_research::sat::IntegerLiteral const*, std::forward_iterator_tag) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#10 0x00007fd509c872ff in operations_research::sat::IntegerTrail::Enqueue(operations_research::sat::IntegerLiteral, absl::Span<operations_research::sat::Literal>, absl::Span<operations_research::sat::IntegerLiteral>, int) ()
   from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#11 0x00007fd509c874bb in operations_research::sat::IntegerTrail::Enqueue(operations_research::sat::IntegerLiteral, absl::Span<operations_research::sat::Literal>, absl::Span<operations_research::sat::IntegerLiteral>) ()
   from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#12 0x00007fd509c9902f in operations_research::sat::SchedulingConstraintHelper::PushIntervalBound(int, operations_research::sat::IntegerLiteral) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#13 0x00007fd509c9913c in operations_research::sat::SchedulingConstraintHelper::IncreaseStartMin(int, IntType<operations_research::sat::IntegerValue_tag_, long long>) ()
   from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#14 0x00007fd509c69296 in operations_research::sat::DisjunctiveDetectablePrecedences::Propagate() () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#15 0x00007fd509c82cf3 in operations_research::sat::GenericLiteralWatcher::Propagate(operations_research::sat::Trail*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#16 0x00007fd509cdcfa1 in operations_research::sat::SatSolver::Propagate() () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#17 0x00007fd509ce4ab9 in operations_research::sat::SatSolver::PropagateAndStopAfterOneConflictResolution() () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#18 0x00007fd509ce56f1 in operations_research::sat::SatSolver::EnqueueDecisionAndBackjumpOnConflict(operations_research::sat::Literal) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#19 0x00007fd509c94946 in operations_research::sat::SolveProblemWithPortfolioSearch(std::vector<std::function<IntType<operations_research::sat::LiteralIndex_tag_, int> ()>, std::allocator<std::function<IntType<operations_research::sat::LiteralIndex_tag_, int> ()> > >, std::vector<std::function<bool ()>, std::allocator<std::function<bool ()> > >, operations_research::sat::Model*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#20 0x00007fd509c95ddd in operations_research::sat::SolveIntegerProblemWithLazyEncoding(std::vector<operations_research::sat::Literal, std::allocator<operations_research::sat::Literal> > const&, std::function<IntType<operations_research::sat::LiteralIndex_tag_, int> ()> const&, operations_research::sat::Model*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#21 0x00007fd509cb60aa in operations_research::sat::MinimizeIntegerVariableWithLinearScanAndLazyEncoding(bool, IntType<operations_research::sat::IntegerVariable_tag_, int>, std::function<IntType<operations_research::sat::LiteralIndex_tag_, int> ()> const&, std::function<void (operations_research::sat::Model const&)> const&, operations_research::sat::Model*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#22 0x00007fd509c4320f in operations_research::sat::(anonymous namespace)::SolveCpModelInternal(operations_research::sat::CpModelProto const&, bool, std::function<void (operations_research::sat::CpSolverResponse const&)> const&, operations_research::sat::Model*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#23 0x00007fd509c4922b in operations_research::sat::SolveCpModel(operations_research::sat::CpModelProto const&, operations_research::sat::Model*) () from /usr/local/lib/python3.6/site-packages/ortools/sat/../.libs/libortools.so
#24 0x00007fd50cdad3ed in _wrap_SatHelper_SolveWithParametersAndSolutionObserver () from /usr/local/lib/python3.6/site-packages/ortools/sat/_pywrapsat.so
#25 0x00000000004ac1f9 in _PyCFunction_FastCallDict (kwargs=0x0, nargs=<optimized out>, args=0x7fd5059933b0, func_obj=0x7fd50cfe7cf0) at Objects/methodobject.c:234
#26 _PyCFunction_FastCallKeywords (func=func@entry=0x7fd50cfe7cf0, stack=stack@entry=0x7fd5059933b0, nargs=<optimized out>, kwnames=kwnames@entry=0x0) at Objects/methodobject.c:294
#27 0x0000000000541fcb in call_function (pp_stack=pp_stack@entry=0x7fff4fd776f0, oparg=<optimized out>, kwnames=kwnames@entry=0x0) at Python/ceval.c:4830
#28 0x0000000000547f07 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3328
#29 0x0000000000540fd1 in PyEval_EvalFrameEx (throwflag=0, f=0x7fd505993218) at Python/ceval.c:754
#30 _PyFunction_FastCall (co=<optimized out>, args=<optimized out>, nargs=3, globals=globals@entry=0x7fd50dd29a20) at Python/ceval.c:4912
#31 0x0000000000542135 in fast_function (kwnames=0x0, nargs=<optimized out>, stack=<optimized out>, func=0x7fd50cff29d8) at Python/ceval.c:4947
#32 call_function (pp_stack=pp_stack@entry=0x7fff4fd778a0, oparg=<optimized out>, kwnames=kwnames@entry=0x0) at Python/ceval.c:4851
#33 0x0000000000547f07 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3328
#34 0x0000000000541c50 in PyEval_EvalFrameEx (throwflag=<optimized out>, f=<optimized out>) at Python/ceval.c:754
#35 _PyEval_EvalCodeWithName (_co=0x93, globals=0x93, globals@entry=0x7fd524d3a120, locals=0x6, locals@entry=0x0, args=0x1834828, argcount=140553438511120, kwnames=0x7fd525c65700, kwargs=0x1834838, kwcount=0, kwstep=1, defs=0x0, 
    defcount=0, kwdefs=0x0, closure=0x0, name=0x7fd52485b870, qualname=0x7fd52485b870) at Python/ceval.c:4159
#36 0x0000000000541ee3 in fast_function (kwnames=0x0, nargs=<optimized out>, stack=<optimized out>, func=0x7fd509434378) at Python/ceval.c:4971
#37 call_function (pp_stack=pp_stack@entry=0x7fff4fd77b00, oparg=<optimized out>, kwnames=kwnames@entry=0x0) at Python/ceval.c:4851
#38 0x0000000000547f07 in _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3328
#39 0x0000000000541c50 in PyEval_EvalFrameEx (throwflag=<optimized out>, f=<optimized out>) at Python/ceval.c:754
#40 _PyEval_EvalCodeWithName (_co=0x93, _co@entry=0x7fd524975300, globals=0x93, globals@entry=0x7fd524975300, locals=0x6, locals@entry=0x7fd524d54150, args=0x0, argcount=140553438511120, argcount@entry=0, kwnames=0x7fd525c65700, 
    kwnames@entry=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4159
#41 0x0000000000542af3 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, locals=locals@entry=0x7fd524d54150, globals=globals@entry=0x7fd524975300, _co=_co@entry=0x7fd524975300)
    at Python/ceval.c:4180
#42 PyEval_EvalCode (co=co@entry=0x7fd524975300, globals=globals@entry=0x7fd524d3a120, locals=locals@entry=0x7fd524d3a120) at Python/ceval.c:731
#43 0x0000000000426ece in run_mod (arena=0x7fd524d54150, flags=0x7fff4fd77dd0, locals=0x7fd524d3a120, globals=0x7fd524d3a120, filename=0x7fd524965170, mod=0x18ac678) at Python/pythonrun.c:1025
#44 PyRun_FileExFlags (fp=0x186b5b0, filename_str=<optimized out>, start=<optimized out>, globals=0x7fd524d3a120, locals=0x7fd524d3a120, closeit=1, flags=0x7fff4fd77dd0) at Python/pythonrun.c:978
#45 0x00000000004270cc in PyRun_SimpleFileExFlags (fp=0x186b5b0, filename=0x7fd5249aa500 "sched.py", closeit=1, flags=0x7fd524db8067 <__GI_raise+55>) at Python/pythonrun.c:420
#46 0x000000000043c73c in run_file (p_cf=0x7fff4fd77dd0, filename=0x17db9b0 L"sched.py", fp=0x186b5b0) at Modules/main.c:340
#47 Py_Main (argc=argc@entry=6, argv=argv@entry=0x17d9010) at Modules/main.c:810
#48 0x000000000041e10f in main (argc=6, argv=<optimized out>) at ./Programs/python.c:69
lperron commented 6 years ago

How big is your model?

joseandrespg commented 6 years ago

Hi @lperron, the data set we were testing (attached to the issue) consists of 5456 events, and the horizon is up to 33049 minutes.

In our case, these numbers can potentially be much bigger than that.

billywhizz commented 6 years ago

Yes. When testing, I tried narrowing the events list and only seemed able to trigger the error when we went above 5000 events. It started happening consistently at approximately 5010 or 5020 events.
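The narrowing described here can be automated with a bisection over the number of events. This is only a sketch: it assumes the failure is monotone in the event count (every size above the threshold fails), which the thread suggests but does not prove, and `fails` is a hypothetical callback that builds and solves the model with the first `n` events, ideally in a child process.

```python
def smallest_failing(lo, hi, fails):
    """Return the smallest n in [lo, hi] for which fails(n) is True.

    Assumes fails(hi) is True and that fails is monotone:
    False up to some size, True from there on.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid      # mid already fails, so the threshold is at or below mid
        else:
            lo = mid + 1  # mid succeeds, so the threshold is above mid
    return lo
```

With 5456 events this needs about 13 solver runs instead of a manual search. If the failure is not strictly monotone, the result is only one failing size, not necessarily the smallest.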

lperron commented 6 years ago

I guess you are hitting the memory limit of your computer. Is your machine slowing down just before the crash?

billywhizz commented 6 years ago

Hi Laurent. I am pretty sure it's not hitting a memory limit. I am running inside Docker on a host with 16GB available. When I monitor it, I can see RSS growing to approximately 5GB of RAM, and then it dies. If there is anything I can do to capture more useful info, please let me know. You should be able to clone the gist and run it using Docker on any machine that has it available.

billywhizz commented 6 years ago

Here is a full backtrace from a run I just did: https://gist.github.com/billywhizz/e1f8b622a5c123e833532ff9fac8f782

I am running on Ubuntu 18.04 with 16GB RAM and Docker version 17.12.1-ce.

billywhizz commented 6 years ago

It is mentioned here: https://stackoverflow.com/questions/2481933/debugging-strategy-to-find-the-cause-of-bad-alloc that bad_alloc can be thrown "by a limiting memory pool designed for use with STL containers. When the size limit was hit, it threw bad_alloc and the software just had to handle it."

This seems to me a more likely cause of the issue, as it's happening when trying to insert into a std::vector: https://gist.github.com/billywhizz/e1f8b622a5c123e833532ff9fac8f782#file-or-tools-trace-L96

My guess at the moment is that it's happening somewhere around here: https://github.com/google/or-tools/blob/f974064e467cf8a875fcc404164aa56bf4c9463c/ortools/sat/intervals.cc#L157 but I am not sure how to debug further without doing a debug build etc.

billywhizz commented 6 years ago

Running with valgrind does not seem to shed any light either:

**211** new/new[] failed and should throw an exception, but Valgrind
**211**    cannot throw exceptions and so is aborting instead.  Sorry.
==211==    at 0x4C289FC: VALGRIND_PRINTF_BACKTRACE (valgrind.h:6280)
==211==    by 0x4C291F5: operator new(unsigned long) (vg_replace_malloc.c:324)
==211==    by 0x22AB0A2D: void std::vector<operations_research::sat::IntegerLiteral, std::allocator<operations_research::sat::IntegerLiteral> >::_M_range_insert<operations_research::sat::IntegerLiteral const*>(__gnu_cxx::__normal_iterator<operations_research::sat::IntegerLiteral*, std::vector<operations_research::sat::IntegerLiteral, std::allocator<operations_research::sat::IntegerLiteral> > >, operations_research::sat::IntegerLiteral const*, operations_research::sat::IntegerLiteral const*, std::forward_iterator_tag) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22AAD2FE: operations_research::sat::IntegerTrail::Enqueue(operations_research::sat::IntegerLiteral, absl::Span<operations_research::sat::Literal>, absl::Span<operations_research::sat::IntegerLiteral>, int) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22AAD4BA: operations_research::sat::IntegerTrail::Enqueue(operations_research::sat::IntegerLiteral, absl::Span<operations_research::sat::Literal>, absl::Span<operations_research::sat::IntegerLiteral>) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22ABF02E: operations_research::sat::SchedulingConstraintHelper::PushIntervalBound(int, operations_research::sat::IntegerLiteral) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22ABF13B: operations_research::sat::SchedulingConstraintHelper::IncreaseStartMin(int, IntType<operations_research::sat::IntegerValue_tag_, long long>) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22A8F295: operations_research::sat::DisjunctiveDetectablePrecedences::Propagate() (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22AA8CF2: operations_research::sat::GenericLiteralWatcher::Propagate(operations_research::sat::Trail*) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22B02FA0: operations_research::sat::SatSolver::Propagate() (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22B0AAB8: operations_research::sat::SatSolver::PropagateAndStopAfterOneConflictResolution() (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211==    by 0x22B0B6F0: operations_research::sat::SatSolver::EnqueueDecisionAndBackjumpOnConflict(operations_research::sat::Literal) (in /usr/local/lib/python3.6/site-packages/ortools/.libs/libortools.so)
==211== 
==211== HEAP SUMMARY:
==211==     in use at exit: 5,015,382,355 bytes in 390,921 blocks
==211==   total heap usage: 4,305,448 allocs, 3,914,527 frees, 10,479,007,039 bytes allocated
==211== 
==211== LEAK SUMMARY:
==211==    definitely lost: 64 bytes in 2 blocks
==211==    indirectly lost: 66 bytes in 4 blocks
==211==      possibly lost: 4,614,685 bytes in 76,484 blocks
==211==    still reachable: 5,010,767,540 bytes in 314,431 blocks
==211==         suppressed: 0 bytes in 0 blocks
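
When reading the heap summary above, note that Valgrind's "total heap usage" counts every byte allocated over the whole run, including memory that was later freed, while "in use at exit" is what was still live when the process aborted. A quick conversion of the two figures, as a sketch:

```python
GIB = 1024 ** 3

in_use_at_exit = 5_015_382_355     # "in use at exit" from the heap summary
total_allocated = 10_479_007_039   # "total heap usage ... bytes allocated"

live_gib = in_use_at_exit / GIB
cumulative_gib = total_allocated / GIB

print(f"live at abort: {live_gib:.2f} GiB")
print(f"allocated over the whole run: {cumulative_gib:.2f} GiB")
```

The live figure is in line with the roughly 5GB RSS reported earlier in the thread; the larger figure is cumulative and does not by itself mean that much memory was held at once.
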
lperron commented 6 years ago

Yes, I think it is a memory limit in the STL. I do not know yet if we can change this.

lperron commented 6 years ago

But the program has already allocated 10 GB. With other things running in memory, maybe you are hitting the memory limit.

tjvananne commented 4 years ago

Was there any resolution to this issue? I am running a very large ortools model right now and am consistently running into this same issue.

lperron commented 4 years ago

There are no solutions. You are using all your memory. You need to decompose your model into a smaller one.
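
As a rough illustration of what decomposing could look like on the data side, the sketch below splits a list of events into consecutive batches that can each be modelled and solved separately. The event representation (a dict with a `"start"` key) and the helper name are assumptions for the example, not taken from the reporter's script, and solving batches independently ignores any constraint that spans two batches, so whether this is acceptable depends on the model.

```python
def split_into_batches(events, max_events):
    """Sort events by start time and cut them into batches of at most max_events."""
    ordered = sorted(events, key=lambda event: event["start"])
    return [ordered[i:i + max_events] for i in range(0, len(ordered), max_events)]
```

For the 5456 events mentioned in the thread, `split_into_batches(events, 2000)` yields three batches, each well below the size at which the crash was reported.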
