NeuralNetworkVerification / Marabou

Test suite has started SEGFAULTING locally #784

Open MatthewDaggitt opened 3 months ago

MatthewDaggitt commented 3 months ago

Since I was last hacking on Marabou two weeks ago, the test suite has started to fail for me whenever I try to build Marabou. I haven't changed anything in my environment apart from pulling the latest version of master. Is anyone else experiencing this problem, or does anyone have any idea why it might be happening?

20/76 Test #36: Test_PermutationMatrix .............***Exception: SegFault  0.29 sec
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer:DEADLYSIGNAL

Total Test time (real) =   5.05 sec

The following tests FAILED:
      8 - Test_DegradationChecker (SEGFAULT)
      9 - Test_DisjunctionConstraint (SEGFAULT)
     10 - Test_DnCWorker (SEGFAULT)
     11 - Test_Engine (SEGFAULT)
     12 - Test_Equation (SEGFAULT)
     17 - Test_MILPEncoder (SEGFAULT)
     34 - Test_LUFactorization (SEGFAULT)
     36 - Test_PermutationMatrix (SEGFAULT)
     43 - Test_SparseUnsortedList (SEGFAULT)
     49 - Test_GurobiWrapper (SEGFAULT)
     53 - Test_LinearExpression (SEGFAULT)
     55 - Test_MString (SEGFAULT)
     56 - Test_MStringf (SEGFAULT)
     57 - Test_Map (SEGFAULT)
     62 - Test_Vector (SEGFAULT)
     63 - Test_MatrixMultiplication (SEGFAULT)
     73 - Test_OnnxParser (SEGFAULT)
     75 - Test_QueryLoader (SEGFAULT)
     77 - Test_NetworkLevelReasoner (SEGFAULT)
     78 - Test_WsLayerElimination (SEGFAULT)
     81 - Test_Checker (SEGFAULT)
     83 - Test_UnsatCertificateNode (SEGFAULT)
     84 - Test_UnsatCertificateUtils (SEGFAULT)

wu-haoze commented 3 months ago

@MatthewDaggitt, to understand the issue better:

  1. After pulling from master and rebuilding, do you recall it downloading and recompiling external packages?
  2. What if you remove all downloaded packages from the tools/ directory and recompile? I wonder whether it's because the packages in the tools directory were built for C++11 instead of C++17.

MatthewDaggitt commented 3 months ago

  1. After pulling from master and rebuilding, do you recall it downloading and recompiling external packages?

No, I didn't.

  2. What if you remove all downloaded packages from the tools/ directory and recompile?

No, I've recloned the entire Marabou repo and built from scratch again, downloading everything anew. Still segfaults in exactly the same way. I'll try to pinpoint the commit where this problem starts.

wu-haoze commented 3 months ago

@MatthewDaggitt This fix seems to resolve the issue you encountered: https://github.com/NeuralNetworkVerification/Marabou/pull/779/commits/f93fb3ebddf0e320828c728797148af6a08bc539

The issue does seem to be with the external dependency. Could you please try this fix and see whether it resolves the issue locally?

UPDATE:

Please try this instead: https://github.com/NeuralNetworkVerification/Marabou/pull/779/commits/47e920b781b00cea884ee1cab4efe06a9baea61d

MatthewDaggitt commented 2 months ago

Hi @wu-haoze, unfortunately that doesn't fix the error for me. Even when I'm using Boost 1.84 I still get the same error...

Trying 1.74 now... Yup, same problem with 1.74. So it doesn't seem to be connected to Boost for me.

MatthewDaggitt commented 2 months ago

Okay, so I've now tried building commit f9c12cab9c87d20bc1d57ea5b6f0a8d072a859f7 which I know was good, but that now fails...

So, as you say @wu-haoze, it must be something non-deterministic in our environments that has changed. You mentioned that the CI was failing? Do you have a link to the failed run?

MatthewDaggitt commented 2 months ago

I guess the next step would be actually to go in and find where the segfault is and why...
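
As a starting point for that, here is a minimal sketch that is not from the thread. It assumes the ctest summary shown earlier has been saved to a file (the name `ctest.log` is hypothetical) and pulls out the names of the failed tests so that each can be rerun on its own with its full output shown. The function names `failed_tests` and `rerun_commands` are made up for illustration.

```shell
#!/bin/sh
# Read ctest output on stdin and print the name of every test listed in the
# "The following tests FAILED:" summary. Summary lines look like:
#       8 - Test_DegradationChecker (SEGFAULT)
failed_tests() {
    awk '$2 == "-" && $NF ~ /^\(.*\)$/ { print $3 }'
}

# Read ctest output on stdin and print one ctest command per failed test.
# -R selects tests by regular expression; --output-on-failure prints the
# output of a test when it fails.
rerun_commands() {
    failed_tests | while read -r name; do
        printf 'ctest -R "^%s$" --output-on-failure\n' "$name"
    done
}

# Example usage (ctest.log is a hypothetical file name):
#   rerun_commands < ctest.log
```

Running one of the printed commands from the build directory should show the complete output of that single test, which is usually easier to read than the interleaved output of the whole suite. If splitting the tests up is not needed, `ctest --rerun-failed --output-on-failure` reruns everything that failed in the previous run.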

wu-haoze commented 2 months ago

@MatthewDaggitt This is the failed run: https://github.com/NeuralNetworkVerification/Marabou/actions/runs/8322539902/job/22770512318

wu-haoze commented 2 months ago

@MatthewDaggitt could this line be the culprit?

https://github.com/NeuralNetworkVerification/Marabou/blob/31eee10f610faeec7ffacb1600be15b940cb094d/tools/download_boost.sh#L25

The flag is still C++11. Could you please check whether changing it to C++17 and recompiling Boost fixes the problem?

MatthewDaggitt commented 2 months ago

No, unfortunately it doesn't....

GirardR1006 commented 2 days ago

Confirmed that the segfault in the tests is still present (commit 3c8e1054ff10d4371525ff1e7e15f18430a2b6d2) when building locally on Ubuntu 22.04 with g++ 11.4.0 and Boost 1.84 installed.

Note that rebuilding on a clean Ubuntu Docker image results in success for the whole test suite. To reproduce with Docker:

docker run -it ubuntu:latest
apt update
apt install g++ wget git cmake python3 python3-dev
git clone https://github.com/NeuralNetworkVerification/Marabou.git
cd Marabou
mkdir build
cd build
cmake ../
cmake --build .

Inside of this docker image, g++ is version 13.2.0.
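
Since the failing and passing environments differ at least in compiler version, it may help to record the toolchain in each one and diff the results. A small sketch under that assumption: the function name `env_report` is made up for illustration, and which fields actually matter here is a guess.

```shell
#!/bin/sh
# Print a short, diffable summary of the local toolchain.
# A tool that is not installed simply yields an empty value.
env_report() {
    echo "g++: $(g++ --version 2>/dev/null | head -n 1)"
    echo "cmake: $(cmake --version 2>/dev/null | head -n 1)"
    echo "kernel: $(uname -r)"
}

# Example usage (file names are hypothetical):
#   env_report > host.txt      # on the machine where the tests segfault
#   env_report > docker.txt    # inside the container where they pass
#   diff host.txt docker.txt
```

Note that a container shares the host's kernel, so the kernel line will only differ when the two reports come from different machines.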