leonoel / missionary

A functional effect and streaming system for Clojure/Script
Eclipse Public License 2.0

NullPointerException when running a task "outside a process block" inside another task #35

Open mjmeintjes opened 3 years ago

mjmeintjes commented 3 years ago
 (m/?
   (m/sp
    (let [f (fn [] (m/? (m/sleep 1000 1)))]
      (f))))

gives:

1. Unhandled java.lang.NullPointerException
   (No message)

                 impl.cljc:   60  cloroutine.impl$coroutine$fn__66068/invoke
           Sequential.java:   49  missionary.impl.Sequential/step
           Sequential.java:   32  missionary.impl.Sequential$1/invoke
                Sleep.java:   61  missionary.impl.Sleep$Scheduler/trigger
                Sleep.java:   75  missionary.impl.Sleep$Scheduler/run

but

  (m/?
   (m/sp
    (let [f (fn [] (future (m/? (m/sleep 1000 1))))]
      @(f))))

gives 1.

I'm not sure if this is expected behaviour or if it is a bug, but I generally see all NullPointerExceptions as bugs, so thought I'd submit it.

leonoel commented 3 years ago

The first snippet is unsupported due to host limitations. It is not currently possible to capture the continuation from an arbitrary point of the call stack at runtime, so we have to emulate it with compile-time syntactic transformation techniques (cloroutine, in this case). As a consequence, a call to ? from an sp block is only supported when it appears directly in the block's body. The code rewriting stops at lambdas, so the ? inside the fn is considered foreign to the sp block here.

Project Loom is a promising attempt to provide runtime continuations on the JVM, so this may eventually work, but for now there is still far too much uncertainty about it to be a viable option.

The second snippet works as expected because ? is not called from an sp block, so the future's thread sees it as a blocking call.
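
A minimal sketch of the supported form of the first snippet, assuming missionary.core is aliased as m as in the snippets above: f returns the task instead of running it, and ? is applied directly in the body of the sp block, where the code rewriting can see it.

```clojure
(require '[missionary.core :as m])

(m/?
  (m/sp
    ;; f only builds the sleep task; it does not run it
    (let [f (fn [] (m/sleep 1000 1))]
      ;; ? appears lexically in the sp body, so it can park the process
      (m/? (f)))))
```

On the JVM this should evaluate to 1 after about one second, like the future-based version, but without tying up an extra thread.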

> I'm not sure if this is expected behaviour or if it is a bug, but I generally see all NullPointerExceptions as bugs, so thought I'd submit it.

You did well! Even though an error is expected here, the error is confusing, and confusion means poor developer experience. I think the right path forward is to add a check and raise an error with a more explicit message. I also think the documentation could be clearer about this, so feel free to suggest improvements.

mjmeintjes commented 2 years ago

One problem that I have run into a few times is related to this issue: if I use missionary in some internal code (i.e. a separate library or namespace) and then turn it into a blocking call (to "hide" the use of missionary behind the API methods), I cannot then use that API within another missionary block.

  (defn get-value []
    (m/? (m/sp :internal-use-of-missionary)))
  (m/?  (m/sp (get-value)))

This throws a NullPointerException, and also the return value of get-value is a java.lang.ThreadLocal$SuppliedThreadLocal.

The fix for this seems to be:

  (defn get-value []
    @(future  (m/? (m/sp :internal-use-of-missionary))))
  (m/?  (m/sp (get-value)))

But this can be difficult to diagnose, especially if you don't know that the library you are using uses missionary underneath.

Please let me know if this does not make sense and I'll try to explain better.

A proposed fix would probably just be to throw a better exception in this case, or maybe to wrap the m/? in a future by default.

leonoel commented 2 years ago

That makes sense. get-value is a blocking call, so it's wrong to call it directly from m/sp, and the error message should explain that. The right fix is to turn blocking calls into tasks using m/via.

(defn get-value []
  (m/? (m/sp :internal-use-of-missionary)))
(m/? (m/sp (m/? (m/via m/blk (get-value)))))