hangs in clojure - Githubissues

mobileink commented 3 years ago

Hi, I'm trying to get re-graph working in a clojure command-line app. Following the examples, nothing happens, and the program will not exit, even if I add (shutdown-agents) at the end. E.g.

(defn -main
  [& args]

  (println "entering main")
  (re-graph/init {:ws-url   "ws://localhost:3086/graphql-ws"
                  :http-url "http://localhost:3085/graphql"})
  (println "exiting main")
  (shutdown-agents))

Running it:

$ clojure -M -m mina.repl
entering main
exiting main
...hangs...

Adding the example code to send a query has no effect. What am I doing wrong?

oliyh commented 3 years ago

Hi,

You should call re-graph/destroy when you have finished with it. I wouldn't expect it to hang forever without this though. Which http client are you using, hato (default) or gniazdo? (See readme for details).

Cheers

mobileink commented 3 years ago

There's nothing in the readme about re-graph/destroy, and I don't see any other documentation. I'm using hato.

oliyh commented 3 years ago

Yes I realised when I replied that destroy is undocumented, I have added that.

I will try to replicate your issue, in the meantime are you able to take a thread dump while it is hanging? That would help greatly.

Thanks Oliy

oliyh commented 2 years ago

Closing this as no update, please let me know if you still face issues.

Cheers

oliyh commented 2 years ago

Was able to reproduce and found this non-daemon thread preventing shutdown:

"pool-2-thread-1" #16 prio=5 os_prio=0 cpu=235.02ms elapsed=41.72s tid=0x00007f11524f3000 nid=0x67fb4 waiting on condition  [0x00007f11251fc000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@11.0.15/Native Method)
    - parking to wait for  <0x000000070c400170> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(java.base@11.0.15/LockSupport.java:194)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.15/AbstractQueuedSynchronizer.java:2081)
    at java.util.concurrent.LinkedBlockingQueue.take(java.base@11.0.15/LinkedBlockingQueue.java:433)
    at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@11.0.15/ThreadPoolExecutor.java:1054)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.15/ThreadPoolExecutor.java:1114)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.15/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(java.base@11.0.15/Thread.java:829)

Turns out it was the executor from here: https://github.com/day8/re-frame/blob/master/src/re_frame/interop.clj#L26

Going to talk to re-frame devs to find out more. Short term workaround is to kill that executor.

lowecg commented 2 weeks ago

@oliyh Hi Oli - did you ever get a resolution to this?

Running clj -X:test from the CLI hangs at the end of a test. A thread dump from VisualVM reveals a stack similar to the one above.

"pool-1-thread-1" #31 [40707] prio=5 os_prio=31 cpu=16.75ms elapsed=487.53s tid=0x0000000126bd4200 nid=40707 waiting on condition  [0x000000016f2e2000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@21.0.5/Native Method)
        - parking to wait for  <0x000000040105d1c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(java.base@21.0.5/LockSupport.java:371)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(java.base@21.0.5/AbstractQueuedSynchronizer.java:519)
        at java.util.concurrent.ForkJoinPool.unmanagedBlock(java.base@21.0.5/ForkJoinPool.java:3780)
        at java.util.concurrent.ForkJoinPool.managedBlock(java.base@21.0.5/ForkJoinPool.java:3725)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@21.0.5/AbstractQueuedSynchronizer.java:1712)
        at java.util.concurrent.LinkedBlockingQueue.take(java.base@21.0.5/LinkedBlockingQueue.java:435)
        at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@21.0.5/ThreadPoolExecutor.java:1070)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@21.0.5/ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@21.0.5/ThreadPoolExecutor.java:642)
        at java.lang.Thread.runWith(java.base@21.0.5/Thread.java:1596)
        at java.lang.Thread.run(java.base@21.0.5/Thread.java:1583)

   Locked ownable synchronizers:
        - None

lowecg commented 2 weeks ago

For now, I have a what can be charitably called a "workaround":

(:require 
    [clojure.test :refer [is, testing, deftest, use-fixtures]
    [re-frame.interop])

(def get-executor
  (let [executor-var (resolve 're-frame.interop/executor)]
    #(var-get executor-var)))

(defn shutdown-executor []
  (.shutdown (get-executor)))

(use-fixtures :once
              (fn [f]
                (f)  ; runs all tests
                (shopify-api/shutdown-executor)))

oliyh commented 2 weeks ago

Hi @lowecg,

Well done! This would probably be a great contribution to re-frame-test - if they accept it then could update martian to use it. If not then I'm also happy to accept this into martian as a test fixture.

Cheers

oliyh commented 1 week ago

Hi @lowecg

Did you get a chance to discuss this with the re-frame-test maintainers? I could open an issue and point them to this thread if not.

Cheers

lowecg commented 1 week ago

I haven't yet (been limited on time). If you could open the thread, that would be a big help. Thank you

On Thu, 14 Nov 2024 at 22:03, Oliver Hine @.***> wrote:

Hi @lowecg https://github.com/lowecg

Did you get a chance to discuss this with the re-frame-test maintainers? I could open an issue and point them to this thread if not.

Cheers

— Reply to this email directly, view it on GitHub https://github.com/oliyh/re-graph/issues/75#issuecomment-2477496904, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHCVNQLC3MPSVDYZAZKESL2AUM4FAVCNFSM6AAAAABRAWZ2MCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDINZXGQ4TMOJQGQ . You are receiving this because you were mentioned.Message ID: @.***>

lowecg commented 1 week ago

There is another angle to this functionality where I had to resurrect the executor.

Problem: When using re-graph from a Lambda function, I needed to shutdown the executor to prevent process hang; but Lambdas are often reused and, once the executor has been shutdown, reuse is not possible.

Solution: Reset the executor. Call reset at the start of the lambda and shutdown at the end, and that seems to allow this use case.

(defn reset-executor!
  "Resets the executor used by re-frame (used by re-graph for GraphQL execution).

   This allows repeat executions of Lambda"
  []
  (let [current-executor (get-executor)]
    (when (and current-executor (.isShutdown ^ExecutorService current-executor))
      (alter-var-root #'re-frame.interop/executor
                      (fn [_] (Executors/newSingleThreadExecutor))))))

(defn shutdown-executor
  "Shuts down the executor used by re-frame (used by re-graph for GraphQL execution).

   Warning: once called, further GraphQL interactions will not be possible unless you call reset-executor!.

   This is necessary as the non-daemon executor is not shut down by default and will cause short-lived processes (e.g. Lambda or tests run from the CLI) to hang."
  []
  (.shutdown ^ExecutorService (get-executor)))

lowecg commented 5 days ago

I've raised a PR for this with re-frame, so 🤞

oliyh / re-graph

hangs in clojure #75