Z3Prover / z3

The Z3 Theorem Prover
Other
10.36k stars 1.48k forks source link

iZ3 - Segfaults #243

Closed stieglma closed 8 years ago

stieglma commented 9 years ago

Hi,

This is the successor #64. We have still segfaults that occur when using Z3 for interpolation. Here are some logs (replaying them results in errors caused by invalid pointer).

http://students.fim.uni-passau.de/~stieglma/z3/z3_log_elevator_spec13_product23_true-unreach-call.log.gz http://students.fim.uni-passau.de/~stieglma/z3/z3_log_problem14_label50.log.gz http://students.fim.uni-passau.de/~stieglma/z3/z3_log_s3_srvr_8_true-unreach-call.log.gz

These logs were produced with CPAchecker in Revision 18008 and Z3, unstable-branch, revision 7b95d6894abcc09c9f3ccb70deae13bace2864f0

We are quite sure that these segfaults are not caused by wrong/too much dec_refs. To prove this I temporarily commented out all dec_refs and did an inc_ref on each returned value of a Z3 function call where possible, the results stayed the same.

If you need further information, do not hesitate to contact me.

NikolajBjorner commented 8 years ago

Would you be able to repro the logs against the latest version of Z3? The API has changed, so the logs will not work at this point (unless syncing with earlier versions).

stieglma commented 8 years ago

I will do the tests with the newest version of Z3 as soon as possible. Beforehand we have to fix our Java-Wrapper as it broke with the API changes.

stieglma commented 8 years ago

The segfaults still occur, and I think the interpolation support got even worse (we have more segfaults than before) after the changes that were done on Z3 lately. The newly generated logfiles can be found at http://students.fim.uni-passau.de/~stieglma/z3segfaults.zip

They were generated with CPAchecker in Revision 18388 and Z3, master-branch, revision b6b50655

NikolajBjorner commented 8 years ago

Somehow I can't replay the logs with my local version. Is there a different way to repro (1) do you get debug stacks you can attach, (2) do you have a Windows version of CPA checker I can replace a DLL?

stieglma commented 8 years ago

I have an Ubuntu 14.04 64bit, z3 was built by at first using the configure script and then running make. No changes were applied. If there is something I can enable, such as additional debug output I will do that, but besides the log I do not know other features. When I replay these logs, I get a segfault.

NikolajBjorner commented 8 years ago

For example, building Z3 with the debug flag enabled (python scripts/mk_make.py --debug) will add debug symbols, then under gdb or whatever your debugger is called you should be able to see a stack.

stieglma commented 8 years ago

I just tried to do that, but I was not able to compile z3 due to some errors. In the revision I used for creating the logs the last time (b6b5065) the error was:

z3/build/../src/shell/datalog_frontend.cpp:164: undefined reference to tout z3/build/../src/shell/datalog_frontend.cpp:164: undefined reference to tout z3/build/../src/shell/datalog_frontend.cpp:164: undefined reference to tout z3/build/../src/shell/datalog_frontend.cpp:164: undefined reference to tout z3/build/../src/shell/datalog_frontend.cpp:193: undefined reference to tout z3/build/../src/shell/datalog_frontend.cpp:193: more undefined references to tout follow collect2: error: ld returned 1 exit status

And in the current revision with a newly cloned repository (0e701138) the error is:

error CS2008: No files to compile were specified Compilation failed: 1 error(s), 0 warnings

Note that compiling z3 without --debug still works.

NikolajBjorner commented 8 years ago

The _TRACE define seems not included in your config.mk (see the CXXFLAGS define in your build directory. It should contain a definition for _TRACE. The python scripts are supposed to add this, so either there is a bug in them or you have a custom build environment.

stieglma commented 8 years ago

I have no custom environment (it was a completely new z3 checkout, the only thing I did was running the python script, and then make), the CXXFLAGS define looks as following:

CXXFLAGS= -D_MP_INTERNAL -DAMD64 -D_USE_THREAD_LOCAL -DZ3DEBUG -D_TRACE -fvisibility=hidden -c -mfpmath=sse -msse -msse2 -fopenmp -g -Wall -fno-str ict-aliasing -DLINUX -fPIC

markjin1990 commented 8 years ago

Similar errors when compiling Z3 with --debug. No "_TRACE" definition shown in config.mk. Any solution?

NikolajBjorner commented 8 years ago

I am at a loss to explain what you are observing: we have various build machines on different architectures and they do not have issues with "tout" errors. It should be possible from your build environments to verify the if-defs at the point of where the errors are reported. The if-def for _TRACE should be set whenever you refer to "tout". In particular, trace.h defines:

#ifdef _TRACE
extern std::ofstream tout; 
#define TRACE_CODE(CODE) { CODE } ((void) 0 )
#else
#define TRACE_CODE(CODE) ((void) 0)
#endif

and trace.cpp defines

ifdef _TRACE

std::ofstream tout(".z3-trace");

endif

The error you report indicates that "TRACE" is defined from "trace.h", but trace.cpp isn't compiled with the same flag.

This could of course be the case if you don't have a clean build. E.g., you use the same build directory without removing stale files from a different configuration.

wintersteiger commented 8 years ago

Has this issue been resolved and can we close it? It seems there are two different problems, first the debug compilation and then the interpolation segfaults. Did you manage to get the debug build working so that we can get some new log files?

stieglma commented 8 years ago

Sorry for the delay, I was finally able to compile z3 with the debug option enabled. The newly generated logfiles can be found at http://students.fim.uni-passau.de/~stieglma/z3segfaults_21.01.16.zip

markjin1990 commented 8 years ago

Same here. Problem solved. Thanks, everyone!

NikolajBjorner commented 8 years ago

@stieglma: what checking hash did you use for the latest log files. I will have to sync with the appropriate checkin to get the proper repro, sorry.

stieglma commented 8 years ago

@NikolajBjorner it was 993a0434b442c92e4e84103490722c5428be5ecd (the last commit before I wrote the update on 21st of January)

NikolajBjorner commented 8 years ago

Z3 crashed replaying the logs when it receives parameters that are null. After updating the interpolation interface to deal with null parameter pointers it does not crash.

stieglma commented 8 years ago

We still have the same problem, according to a colleague the amount of segmentation faults even increased.

I compiled the newest z3 version and tested one of the files I had used before, again. And also this time a segfault occured. The output when running the z3 log is:

Error at line 22205736: invalid pointer
max. heap size:     23.7902 Mbytes
time:               7.3152

The logfile can be downloaded at http://students.fim.uni-passau.de/~stieglma/z3_log_elevator_spec13_product23_true-unreach-call.log The git hash is 84cf208d5fa429b47c60b54401e5d51c7dfd46de

NikolajBjorner commented 8 years ago

The replayer implies that somehow you are passing in a pointer that wasn't created by Z3.

kenmcmil commented 8 years ago

Nikolaj, is it correct that you should never see a pointer in the log that wasn't returned by Z3? Can you tell me where is the header that gives the map from numbers to API functions?

kenmcmil commented 8 years ago

Oh, I think I see the problem. Z3_compute_interpolant has a parameter out_interp which is of type Z3_ast_vector *. In the log, this parameter is null. But I'm guessing that in fact that parameter has a value and the logging mechanism just doesn't handle reference parameters.

kenmcmil commented 8 years ago

It looks like the problem is that Z3_compute_interpolant should return using the macro RETURN_Z3_compute_interpolant. I'll fix that, but then unfortunately, we'll need to have a new log recorded.

stieglma commented 8 years ago

We realized today that perhaps get_interpolant is not the correct function for us (we are using the solver incrementally) and changed to compute_interpolant instead. This function however lacks documentation (http://z3prover.github.io/api/html/group__capi.html). So we are not sure if we are using it correctly. If you are interested in how we use it, you can look at our ticket and the referenced commit in there: https://github.com/sosy-lab/java-smt/issues/40

kenmcmil commented 8 years ago

Well, there was definitely a bug in the logging of Z3_compute_interpolant that was preventing the log from replaying. I have submitted a pull request, but meanwhile you can get the fix by pulling branch issue243 from fork kenmcmil. Would you mind re-running one of the examples that crashes on this version and posting the log?

stieglma commented 8 years ago

Sorry for the delay, the new log can be found at http://students.fim.uni-passau.de/%7Estieglma/z3_log_elevator_spec13_product23_true-unreach-call.log I used revision 8fc58e1ace19cac79d570cdbe92312f236ef644c to produce it. Replaying doesn't give an error anymore, but while running CPAchecker still a segfault occurs.

NikolajBjorner commented 8 years ago

I am not sure how we can solve this: yes, the log you shared goes through like a charm. There is of course the chance that it crashes while it tries to write the next log line. A debug stack could shed some additional light. It is typically a lot of trouble to set up large systems across platforms, so I tend to be reluctant to offer compiling other systems to debug, but let us know if you can think other way to identify the root cause(s).

stieglma commented 8 years ago

If it helps I could try to use CPAchecker under Windows with a Z3-Windows binary. (CPAchecker itself is platform independant, but we do only provide precompiled binaries for x86_64-linux). So if you have a Windows 10 Z3 binary which I could use for generating the information you need, this could be perhaps helpful.

NikolajBjorner commented 8 years ago

Before going through platform hoops, if you can create a stack trace (from a Z3 debug build) and share it. I guess you are using dlls, so it is a matter of switching the dlls (as long as the API doesn't change) for debugging the full system integration.

stieglma commented 8 years ago

I tried to get a stacktrace out the coredump and hope this is what you wanted:

Thread 1 (Thread 0x7f9ac4f79700 (LWP 21282)):
#0  _int_malloc (av=0x7f9abc000020, bytes=40) at malloc.c:3302
#1  0x00007f9ac48307b0 in __GI___libc_malloc (bytes=40) at malloc.c:2891
#2  0x00007f9ac30bddad in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f9a6b3afffe in hash_space::hashtable<std::pair<ast_r, ast_r>, ast_r, hash_space::hash<ast_r>, hash_space::proj1<ast_r, ast_r>, hash_space::equal<ast_r> >::lookup (this=0x7f9abe522540, val=..., ins=true) at ../src/interp/iz3hash.h:323
#4  0x00007f9a6b3afaba in hash_space::hashtable<std::pair<ast_r, ast_r>, ast_r, hash_space::hash<ast_r>, hash_space::proj1<ast_r, ast_r>, hash_space::equal<ast_r> >::insert (this=0x7f9abe522540, val=...) at ../src/interp/iz3hash.h:352
#5  0x00007f9a6b3c2e94 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=true)
    at ../src/interp/iz3translate.cpp:1686
#6  0x00007f9a6b3b6698 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:871
#7  0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#8  0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884
#9  0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#10 0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884
#11 0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#12 0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884
#13 0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#14 0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884
#15 0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#16 0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884

the above part is the same for over 500 times of the stack, then the following part comes:

#553 0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=false)
    at ../src/interp/iz3translate.cpp:1713
#554 0x00007f9a6b3b69c7 in iz3translation_full::translate_ur (this=0x7f9abe522390, proof=...)
    at ../src/interp/iz3translate.cpp:884
#555 0x00007f9a6b3c3242 in iz3translation_full::translate_main (this=0x7f9abe522390, proof=..., expect_clause=true)
    at ../src/interp/iz3translate.cpp:1713
#556 0x00007f9a6b3c8c7f in iz3translation_full::translate (this=0x7f9abe522390, proof=..., dst=...)
    at ../src/interp/iz3translate.cpp:2052
#557 0x00007f9a6b3974bd in iz3interp::proof_to_interpolant (this=0x7f9ac4f77a40, proof=..., 
    cnsts=std::vector of length 3, capacity 3 = {...}, parents=std::vector of length 3, capacity 4 = {...}, 
    interps=std::vector of length 0, capacity 0, theory=std::vector of length 0, capacity 0, options=0x0)
    at ../src/interp/iz3interp.cpp:323
#558 0x00007f9a6b397b0a in iz3interp::proof_to_interpolant (this=0x7f9ac4f77a40, proof=..., 
    cnsts=std::vector of length 3, capacity 4 = {...}, parents=std::vector of length 3, capacity 4 = {...}, 
    interps=std::vector of length 0, capacity 0, theory=std::vector of length 0, capacity 0, options=0x0)
    at ../src/interp/iz3interp.cpp:361
#559 0x00007f9a6b397c9b in iz3interp::proof_to_interpolant (this=0x7f9ac4f77a40, proof=..., 
    _cnsts=std::vector of length 3, capacity 4 = {...}, tree=..., interps=std::vector of length 0, capacity 0, options=0x0)
    at ../src/interp/iz3interp.cpp:379
#560 0x00007f9a6b394119 in iz3interpolate (_m_manager=..., proof=0x7f9abe474278, cnsts=..., tree=0x7f9abe473ca8, interps=..., 
    options=0x0) at ../src/interp/iz3interp.cpp:523
#561 0x00007f9a6aa12e44 in Z3_get_interpolant (c=0x7f9abc742b68, pf=0x7f9abe474278, pat=0x7f9abe473ca8, p=0x7f9abd902348)
    at ../src/api/api_interp.cpp:232
#562 0x00007f9a90fc8197 in Java_org_sosy_1lab_solver_z3_Z3NativeApi_get_1interpolant ()
   from /localhome/stieglma/workspace/CPAcheckerTrunk_Z3/lib/native/x86_64-linux/libz3j.so
#563 0x00007f9ab9012d98 in ?? ()
#564 0x00000007799ca610 in ?? ()
#565 0x00007f9abc742b68 in ?? ()
#566 0x00007f9abd902348 in ?? ()
#567 0x00007f9a90fb1803 in Java_org_sosy_1lab_solver_z3_Z3NativeApi_inc_1ref ()
   from /localhome/stieglma/workspace/CPAcheckerTrunk_Z3/lib/native/x86_64-linux/libz3j.so
NikolajBjorner commented 8 years ago

Seems like a stack-overflow, simply due to running out of stack space (I currently don't see other reasons). Other than rewriting the code, if you set ulimit, perhaps it may mitigate some cases.

stieglma commented 8 years ago

I tried with ulimit -s unlimited but it didn't help. When additionally raising the -Xss parameter given to the java commandline it works (we usually have 1024k, I just added a 0), and the program can be verified. As a temporary solution this will work but we would appreciate it if you could rewrite the recursive could so that the stack-overflow doesn't occur anymore.

kenmcmil commented 8 years ago

I can see that it's a pretty serious problem if you're running iZ3 under java. Unfortunately, the method iz3translate_main that's causing the problem has a pretty complicated recursion pattern with about a dozen recursive calls. If you tried to refactor this into an iterative routine, you'd almost certainly introduce subtle bugs (subtle because some of the cases occur rarely and are very difficult to stimulate). I'm going to mark this issue closed, as I think it's impractical to fix.

stieglma commented 8 years ago

I do not really often use C++ but from what I have in mind, wouldn't it be possible to inline translate_ur? This should shrink the stack_size at least by half. And from what I see this method is only called once (in translate_main) so there shouldn't also be any other issues with it.

cheshire commented 8 years ago

Hi,

I can see that it's a pretty serious problem if you're running iZ3 under java

Java does want more memory, and does provide a rather small limit for stack by default. However, we have segmentation faults for finding interpolants even when the application gets 100M of stack space. Thus I don't think it's Java-specific.