Z3: dispatch solving process into multiple threads

computereasy commented 7 years ago

Hello,

I am struggling in solving multiple large size constraints along a symbolic execution, and my tentative test shows that each solving time takes quite long time.

An intuitive thinking is to dispatch the solving step into a thread pool, and do not block my main thread of symbolic execution. I read posts here and here. And tentatively implement the following Scala code:

val pool = java.util.concurrent.Executors.newFixedThreadPool(24)

def solve_multhread(s0: Solver, addr: Long): Unit = {
 val r = new Runnable {
   def run {
    val cfg_thread = new JH[String, String]()
    // cfg_thread.put("proof", "true")
    val ctx_thread: Context = new Context()
    // logger.debug(addr)              <---------  if just use this line of code, it works fine.
   }
   pool.execute(r)
  }
}

Surprisingly, every time when it create the new context object (val ctx_thread: Context = ...), I will get an exception like:

java(36936,0x70000021a000) malloc: *** error for object 0x7fdbee324ef0: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

I suppose the above procedure should be legal, because when I only use a logger utility to do the logging within multiple threads, it works fine (logger.debug(addr)).

Am I doing anything wrong here? By looking at the examples on the stack overflow, it seems that I should create a new context within each thread and use it to init a new solver..

wintersteiger commented 7 years ago

You're correct Context objects are not thread-safe, so you need a separate one for each thread. This includes the creation of Context objects as well, i.e., you have to create them before forking off any threads. For the higher-level APIs, like the Java API, we had a lock ('synchronized' block) in the constructors so this would be done automatically, but that lock has been removed in 27aa37946e56a4d9722de416791a2e40dc5d2054. I have now added it back.

@cheshire: what was your reason for removing the Context creation lock?

computereasy commented 7 years ago

Hi @wintersteiger , thank you so much for your reply. I tried to create context object at the main thread before symbolic execution, and each satisfiability checking will (in the main thread) acquire one created context, use it to init one solver (val s: Solver = s0.translate(m_ctx)), and then leverage one thread in the thread pool to do the checking.

However, I still get the error:

 [thread 24079 also had an error]#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV
(0xb) at pc=0x0000000126729117, pid=42016, tid=0x0000000000001703
#
# JRE version: Java(TM) SE Runtime Environment (8.0_101-b13) (build 1.8.0_101-b13)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.101-b13 mixed mode bsd-amd64  compressed oops)
# Problematic frame:
# C  [libz3.dylib+0xd48117]  _ZN22small_object_allocator8allocateEm+0x57
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/hs_err_pid42016.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
 # The crash happened outside the Java Virtual Machine in native code.
 # See problematic frame for where to report the bug.
#

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

Note that it is not immediately failed. Actually when debugging, I do observe it successfully solved some satisfiable/unsatisfiable solutions... But after a while, it just crash.

wintersteiger commented 7 years ago

Could you give us a self-contained example so we can reproduce the bug? The Java failure message doesn't tell us much, but Java usually creates a longer report in a file, usually called something like hs_err_pid[number].log.

However, I think I can spot the problem: calling Solver.translate from the child threads is not thread-safe either, if I'm not mistaken. Try putting a synchronized {} section around it.

cheshire commented 7 years ago

Hi @wintersteiger, I think you've rushed with f1a704484b56555311b52dad2ab4146a5577f257 a bit: this was discussed at length with @NikolajBjorner, and from what I recall we have reached the conclusion that the lock is redundant (cf comments to https://github.com/Z3Prover/z3/pull/648). I don't actually recall why we have decided it was redundant: it was either due to the fact that C++ code already locks, or because different contexts were totally independent.

wintersteiger commented 7 years ago

At the time we only said that this should have been fixed, but I'm not sure it ever was fixed. The .NET API still contains the lock and the Java API still contained the object but not the synchronized {...} blocks. The two APIs are really very similar in that respect, so we should really fix both or be conservative in both. Contexts are definitely not totally independent (e.g., symbol creation), but some parts are indeed locked internally (e.g. memory manager), but I still don't have a good test program for these issues (still working on compiling your old one).

wintersteiger commented 7 years ago

Ah! Right, I don't have your Fuzzer class so I can't run your example. Could you perhaps set up a fully self-contained example that we can use as a test case for this type of issue in the future?

computereasy commented 7 years ago

@wintersteiger Thank you for your reply. The translate function should always be in the main thread, if I was correct. Here is the code I wrote for this. It is not very easy but I will try to build a self-contained code to re-produce the bug. As for the java log file, you can find it here.

val pool = java.util.concurrent.Executors.newFixedThreadPool(N,
  Executors.defaultThreadFactory())

val cfg_thread = new JH[String, String]()
cfg_thread.put("proof", "true")

class Worker(addr: Addr, index: Int, s0: Solver, fid: Int, pcl: Int) extends Runnable {
  val m_ctx = new Context(cfg_thread)
  val m_s = s0.translate(m_ctx)
  val m_addr = addr.toHexString
  val m_fid = fid
  val m_pcl = pcl
  val m_index = index

  def run () {
    logger.debug(m_index.toString)

     m_s.check() match {
       case Status.SATISFIABLE   =>
       logger.info("sat   : " + m_addr + " : " + m_pcl)
      case Status.UNSATISFIABLE =>
        logger.info("unsat :" + m_addr + " : " + m_pcl)
      case _                    =>
        throw new Exception("undefined check results")
    }
  }
}

def solve_multhread(addr: Addr, index: Int, s0: Solver, fid: Int, pcl: Int): Unit = {
  val w = new Worker(addr, index, s0, fid, pcl)
  pool.execute(w)
}

NikolajBjorner commented 7 years ago

You are calling "s0.translate(m_ctx)" in the worker thread. This means that you are referencing a solver object s0 from potentially different threads at the same time. This is not thread safe. Make sure to only touch derivatives from a context in a thread-safe way. A derivative of a context is any object created from it, except from the translate objects. For example, a Solver object is created with respect to a context, expressions are created with respect to a context. They share state with the context object and this state cannot be accessed from different threads.

Z3Prover / z3

Z3: dispatch solving process into multiple threads #819