rogerallen / hello_lwjgl

simple lwjgl clojure example

XstartOnFirstThread causes problems with cider-repl integration [Mac OS X] #6

Closed antoinevg closed 8 years ago

antoinevg commented 8 years ago

New to Clojure and it's been decades since I last worked with Java so apologies in advance for a slightly garbled issue! :)

Let me start with what I would like:

All very well and fine but basically:

I can run the sample code from lein by passing -XstartOnFirstThread in :jvm-opts

BUT

If I try to M-x cider-jack-in with -XstartOnFirstThread present, Emacs never manages to boot a REPL and hangs instead.

Conversely, if I remove -XstartOnFirstThread, I can boot the cider repl but then end up with:

java.lang.IllegalStateException: GLFW windows may only be created 
on the main thread and that thread must be the first thread in the process. 
Please run the JVM with -XstartOnFirstThread.

…when I try to evaluate the hello-lwjgl buffer.

rogerallen commented 8 years ago

Thanks for the report. I agree that -XstartOnFirstThread causes a problem with the REPL, and I don't know how to solve it.

Googling finds this: http://stackoverflow.com/questions/14046952/using-swt-with-lein-repl-on-mac-os-x but I wasn't able to get this working.

Maybe someone out there would like to help?

For once, it "just works" on a PC, but not on a Mac...

rogerallen commented 8 years ago

I can at least get a repl up via telnet (but not emacs) via:

  1. adjusting JVM-OPTS in project.clj
   :macosx   ["-XstartOnFirstThread"
              "-Dclojure.server.repl={:port 5555 :accept clojure.core.server/repl}"]
  2. lein run
  3. [in another terminal] telnet 127.0.0.1 5555

Here's a way to reset the angle of the rotating triangle, to show the running code being changed live:

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
user=> (load "hello_lwjgl/alpha")
nil
user=> (in-ns 'hello-lwjgl.alpha)
#object[clojure.lang.Namespace 0x20bc8da2 "hello-lwjgl.alpha"]
hello-lwjgl.alpha=> (swap! globals assoc :angle 0.0)
{:errorCallback #object[org.lwjgl.glfw.GLFWErrorCallback$3 0x4073da5b "org.lwjgl.glfw.GLFWErrorCallback$3@4073da5b"], :keyCallback #object[hello_lwjgl.alpha.proxy$org.lwjgl.glfw.GLFWKeyCallback$ff19274a 0x6ab596ee "hello_lwjgl.alpha.proxy$org.lwjgl.glfw.GLFWKeyCallback$ff19274a@6ab596ee"], :window 140412435571360, :width 800, :height 600, :title "alpha", :angle 0.0, :last-time 1459227374462}

antoinevg commented 8 years ago

Thank you Roger, that's definitely a step forward and it helps to know this is a real issue!

I'll dig a bit more on my side and see what I can come up with :-)

antoinevg commented 8 years ago

Okay… I've traced it all the way down to glfw.

The root cause of the issues with OSX is that the Cocoa event processing loop can only be accessed from the primary thread: https://github.com/glfw/glfw/issues/136

It doesn't look like this fact of life can be changed without Apple's input so the next question, I guess, is:

How to persuade lein to start cider's REPL interface (nrepl? still wrapping my head around how the pieces fit together) on a secondary thread rather than trying to start it on the main thread (which is, of course, blocked by lwjgl)? :-)

The clojure.server.repl you set up is executing on a separate thread so it looks like - in theory - this should be possible. (With the caveat, of course, that you'll probably crash everything if you try to execute GL code from the REPL as it's not on the main thread but I could probably live with this)
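
One common way around that caveat is for the REPL thread to hand work to the main thread instead of calling GL directly. A minimal sketch, assuming the render loop can call a drain function once per frame; `main-thread-queue`, `run-on-main!` and `drain-queue!` are hypothetical names, not part of hello_lwjgl or LWJGL:

```clojure
(import '(java.util.concurrent ConcurrentLinkedQueue))

;; Hypothetical helper: a thread-safe queue of zero-argument functions
;; waiting to be run on the main (GLFW) thread.
(def ^ConcurrentLinkedQueue main-thread-queue (ConcurrentLinkedQueue.))

(defn run-on-main!
  "Call from the REPL thread. Queues f for the main thread and returns a
  promise that will hold f's return value (or the Throwable it threw)."
  [f]
  (let [result (promise)]
    (.add main-thread-queue
          (fn [] (deliver result (try (f) (catch Throwable t t)))))
    result))

(defn drain-queue!
  "Call once per frame from the main thread. Runs every queued function."
  []
  (loop []
    (when-let [task (.poll main-thread-queue)]
      (task)
      (recur))))
```

From the REPL, something like `@(run-on-main! #(swap! globals assoc :angle 0.0))` would then block until the main loop has run the function. Dereferencing the promise hangs if nothing ever calls `drain-queue!`.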

I'll continue digging…

rogerallen commented 8 years ago

If you're motivated, there is some code at the bottom of http://cider.readthedocs.org/en/latest/installation/#ciders-nrepl-middleware that shows code for "Using embedded nREPL server".

Perhaps you could try adding code to launch that via "lein run nrepl"?

rogerallen commented 8 years ago

I gave nrepl embedding a try and I must be doing something wrong...

Here's what core.clj looks like:

(ns hello-lwjgl.core
  (:require [hello-lwjgl.alpha :as alpha]
            [hello-lwjgl.beta  :as beta]
            [hello-lwjgl.gamma :as gamma]
            ;;[hello-lwjgl.omega :as omega]
            [clojure.tools.nrepl.server :as nrepl-server]
            [cider.nrepl :refer (cider-nrepl-handler)]
            )
  (:import (org.lwjgl Version))
  (:gen-class))

;; ======================================================================
(defn start-nrepl
  []
  (println "Starting Cider Nrepl Server Port 7888")
  (nrepl-server/start-server :port 7888 :handler cider-nrepl-handler))

;; ======================================================================
(defn -main
  [& args]
  (println "Hello, Lightweight Java Game Library! V" (Version/getVersion))
  (cond
   (= "alpha" (first args)) (alpha/main)
   (= "beta"  (first args)) (beta/main)
   (= "gamma" (first args)) (gamma/main)
   ;;(= "omega" (first args)) (omega/main)
   (= "nrepl" (first args)) (start-nrepl)
   :else (alpha/main)))

(also added [cider/cider-nrepl "0.11.0"] as a dependency to project.clj)

starting via:

> lein run nrepl
Hello, Lightweight Java Game Library! V 3.0.0b SNAPSHOT
Starting Cider Nrepl Server Port 7888

and M-x cider-connect 127.0.0.1 7888 just results in

nREPL: Establishing direct connection to 127.0.0.1:7888 ...
nREPL: Direct connection failed

:cry:

antoinevg commented 8 years ago

I'm seeing some interesting behaviours with this code.

If, instead of lein run nrepl you boot a demo with lein run alpha:

Looking closely, you'll see the demo's window is not receiving events - when you select it the kill/minimize/maximize buttons stay greyed out and it does not shutdown when hitting the ESC key.

This tells me that something else is either grabbing the OSX event loop or blocking the lwjgl thread.

A bit more investigation shows that this only happens when you require:

[cider.nrepl :refer (cider-nrepl-handler)]

If you comment this out the window event handling returns to normal.

So obviously when it is loaded, cider.nrepl calls a function which blocks the thread it was loaded in… and because the JVM has been started with -XstartOnFirstThread this is now blocking the primary JVM thread that LWJGL (and the rest of the Clojure runtime for that matter) is trying to use.

This also explains why we're not having problems with a clojure.server.repl={:port 5555 :accept clojure.core.server/repl} REPL: it doesn't block the calling thread.

Next step… dig into https://github.com/clojure-emacs/cider-nrepl and figure out what it's doing!

Also, check this out:

(ns hello-lwjgl.core
  (:require [hello-lwjgl.alpha :as alpha]
            [hello-lwjgl.beta  :as beta]
            [hello-lwjgl.gamma :as gamma]
            ;;[hello-lwjgl.omega :as omega]
            [clojure.tools.nrepl.server :as nrepl-server]
            ;;[cider.nrepl :refer (cider-nrepl-handler)]
            )
  (:import (org.lwjgl Version))
  (:gen-class))

;; ======================================================================
(defn start-nrepl
  []
  (.start (Thread. (fn []
                     (println "Starting Cider Nrepl Server Port 7888")
                     (require '[cider.nrepl :refer (cider-nrepl-handler)])
                     (println "loaded")
                     (Thread/sleep 2000)
                     (println "slept")
                     ;;(nrepl-server/start-server :port 7888 :handler cider-nrepl-handler)))))
                     ))))

I'm running the require on a separate thread and it still manages to lock up lein - basically, you'd expect the thread to exit after "slept" but something in cider.nrepl is still keeping it running!

antoinevg commented 8 years ago

Bonus fun:

(defn start-nrepl
  []
  (.start (Thread. (fn []
                     (println "Starting Cider Nrepl Server Port 7888")
                     (load-string (str "(require '[clojure.tools.nrepl.server :as nrepl-server])"
                                       "(require '[cider.nrepl :as cider])"
                                       "(nrepl-server/start-server :port 7888 :handler cider/cider-nrepl-handler)"))
                     (println "started")
                     ;;(nrepl-server/start-server :port 7888 :handler cider-nrepl-handler)))))           
                     ))))

rogerallen commented 8 years ago

Any luck with this? According to the cider-nrepl site, we might have luck asking questions on https://groups.google.com/forum/#!forum/cider-emacs

Would you like to do that or shall I?

antoinevg commented 8 years ago

I took a bit of a left-turn last night and have been seeing what happens when running the code under Boot - although that leads to its own set of interesting issues with Java-Interop that I'm still working my way through %-}

I reckon it's probably better if you ask because you've probably got a clearer picture of all the moving parts than me!

rogerallen commented 8 years ago

Okay, I posted a question to https://groups.google.com/forum/#!topic/cider-emacs/tKqe7vGCda8

We'll see where this goes...

antoinevg commented 8 years ago

Okay, that took a fair amount of time spent in Boot digging through cider-nrepl & then tools.nrepl but I finally tracked down the root of our troubles to: http://bugs.java.com/view_bug.do?bug_id=8019496

Expanding on Petr's correct, but somewhat terse, evaluation. AppContext is a part of AWT, so anything that touches AppContext triggers loading of the AWT libraries. On OS X, specifying -XstartOnFirstThread has the effect that the AWT loader waits (on a condition variable) until someone (supposedly SWT) starts the run loop. This is the hang the bug is concerned about. So, in order to fix this issue, jmx and IIORegistry should stop using AppContext. As a workaround, if one has to use -XstartOnFirstThread on OS X for purposes other than SWT embedding, specify the -Djava.awt.headless=true option. This will prevent the AWT loader from waiting for the run loop to start.

Some things never change with the JVM… ;-)

See #7 for diffs :-)
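
The workaround quoted above comes down to one extra JVM flag. A minimal sketch of how that might look in project.clj, assuming a flat :jvm-opts vector; the project name and version are placeholders, and the actual change merged in #7 may be laid out differently:

```clojure
;; Sketch only: `example-app`, its version, and the flat :jvm-opts vector
;; are assumptions for illustration, not the contents of #7.
(defproject example-app "0.1.0-SNAPSHOT"
  :jvm-opts ["-XstartOnFirstThread"        ; GLFW needs the process's first thread on OS X
             "-Djava.awt.headless=true"])  ; stops the AWT loader waiting for a run loop
```

Both flags are only needed on OS X, so a per-platform opts map like the :macosx entry shown earlier in the thread is probably the better home for them.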

rogerallen commented 8 years ago

Awesome! Thanks for your hard work and perseverance! I will take your pull request when I get back to my Mac later in the week.

rogerallen commented 8 years ago

I've merged your fix and the bug is basically resolved. One nit is that I see an odd message a few seconds after startup. Any thoughts on the last line below?

> lein run alpha cider
Hello, Lightweight Java Game Library! V 3.0.0b SNAPSHOT
Run example Alpha
Starting Cider Nrepl Server Port 7888
OpenGL version: 2.1 NVIDIA-10.4.2 310.41.35f01
2016-04-09 09:35:34.036 java[6422:826195] [JRSAppKitAWT markAppIsDaemon]: Process manager already initialized: can't fully enable headless mode.

antoinevg commented 8 years ago

From what I can see on the interwebs it's quite a common but - in our case - harmless (if annoying) side-effect. It would only be an issue if we were, in fact, trying to run in headless mode rather than using it as a cheap hack to bypass normal AWT initialisation.

i.e. java.awt.headless=true stops the AWT loader from waiting, but - later on - the JDK also checks to see if the AWT loader has been initialised (because it shouldn't be active at all if the JDK is truly running in headless mode) and then it pops up the warning.

Tx Roger!

rogerallen commented 8 years ago

Okay, looks like we are good to close. Thank you again for your help.

santiagocabrera96 commented 11 months ago

Damn, THANK YOU for checking this out. I've been looking all day for a way to solve this problem while trying to use Imgui with Clojure, and the frustration has been high. I just stumbled upon this while searching everywhere for other people living with this issue.