puniverse / pulsar

Fibers, Channels and Actors for Clojure
http://docs.paralleluniverse.co/pulsar/
Other
911 stars 53 forks source link

NullPointerException on ReentrantLock in fiber #33

Closed dehubbed closed 9 years ago

dehubbed commented 9 years ago

When the second fiber in this example tries to lock the lock, there is a NullPointerException.

(ns lock-test
  (:import co.paralleluniverse.strands.concurrent.ReentrantLock)
  (:use clojure.tools.logging
        [co.paralleluniverse.pulsar.core :only [defsfn promise spawn-fiber]]))

(defsfn blocking [p lock]
  (.lockInterruptibly lock)
  (try
    (debug "locked")
    (deref p)
    (finally
      (.unlock lock)
      (debug "unlocked"))))

(def p (promise))

(def l (ReentrantLock.))

(def f1 (spawn-fiber (fn [] (debug (blocking p l)))))

(Thread/sleep 1000)

(def f2 (spawn-fiber (fn [] (debug (blocking p l)))))

yields:

Exception in Fiber "fiber-10000002" If this exception looks strange, perhaps you've forgotten to instrument a blocking method. Run your program with -Dco.paralleluniverse.fibers.verifyInstrumentation to catch the culprit!
java.lang.NullPointerException
    at co.paralleluniverse.strands.Strand.park(Strand.java:485)
    at co.paralleluniverse.strands.concurrent.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
    at co.paralleluniverse.strands.concurrent.AbstractQueuedSynchronizer.doAcquireInterruptibly(AbstractQueuedSynchronizer.java:899)
    at co.paralleluniverse.strands.concurrent.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1231)
    at co.paralleluniverse.strands.concurrent.ReentrantLock.lockInterruptibly(ReentrantLock.java:339)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
    at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313)
    at lock_test$blocking.invoke(form-init7720836874273720665.clj:2)
    at co.paralleluniverse.pulsar.InstrumentedIFn.invoke(InstrumentedIFn.java:40)
    at lock_test$fn__2075$fn__2076.invoke(form-init7720836874273720665.clj:1)
    at co.paralleluniverse.pulsar.InstrumentedIFn.invoke(InstrumentedIFn.java:32)
    at co.paralleluniverse.pulsar.ClojureHelper.suspendableInvoke(ClojureHelper.java:185)
    at co.paralleluniverse.pulsar.ClojureHelper$3.run(ClojureHelper.java:172)
    at co.paralleluniverse.fibers.Fiber.run(Fiber.java:1003)

Turning on verifiyInstrumentation doesn't show anything and anyways I don't think it's an instrumentation problem. The line of code that NPE's curiously can't throw an NPE in theory:

            Fiber.park(blocker);

I traced it in the debugger and blocker is not null, not that it would matter. Must be something at the byte code level that's screwy.

circlespainter commented 9 years ago

You have a undelivered promise deref inside the mutex block; if I comment it out your snippet will work for me, anyway looking deeper into it in order to better understand the exception you get.

dehubbed commented 9 years ago

The deref is there to create an artificial blocking situation. Of course it works if you take it out, then the second lock attempt will happen after the first has been released only.

circlespainter commented 9 years ago

Yes, the exception happens even with a limited Fiber/sleep 2000 ms.

circlespainter commented 9 years ago

Can you try with 0.6.3-SNAPSHOT? You'll need to clone and lein install manually.

dehubbed commented 9 years ago

Give me 2hrs.

dehubbed commented 9 years ago

Works well, thanks for the quick fix. Looks like it was an instrumentation issue after all, albeit a clojure internal one of sorts. May I ask how you found out? The -Dco.paralleluniverse.fibers.verifyInstrumentation=true switch didn't produce any output for me.

circlespainter commented 9 years ago

Good, glad it works for you too. I could spot it in the report about uninstrumented frames indeed, but only after I included the Fiber/sleep call. Instrumentation verification is performed by fibers. Any (potentially) fiber-blocking call should do.

dehubbed commented 9 years ago

Thanks I'm not getting this report in spite of the system property being set and a warning at startup that verification may harm performance. I assume the 'report' is some dump on the console? I'll keep an eye out, it seems to be essential for quasar/pulsar work.

circlespainter commented 9 years ago

Yes it's a stack dump and it usually comes before strange exceptions like this one. It marks with an "*" the stack frames that miss instrumentation and it's quite visible. But if there's no fiber-blocking call on the stack the verification has no chances to be performed, so in case of missing instrumentation of "internal" runtime methods that are not blocking per-se (but can call blocking methods), like this one, the verification will be performed only when another suspendable method is being called. It shouldn't affect normal user code though, as "user" methods will be declared suspendable because they can suspend in their body (or because they call other suspendable methods which can, in turn, suspend).

dehubbed commented 9 years ago

Ok I had the Fiber/sleep which is a blocking call and didn't get this stack trace. Also the deref on the promise is normally a fiber blocking call too, and should do the same IMHO. Anyways probably not worth pursuing this now that the problem is fixed. Thanks again.

circlespainter commented 9 years ago

True about the deref, it should end up calling a fiber-blocking function on a Quasar promise. I'll think a bit more about it and I'll consider opening a Quasar issue to investigate this further if I don't come to any further understanding.