Closed SiAdcock closed 7 years ago
I have an idea and I will see if I can reproduce this. What is the nature of the script code that is running when you try to connect? Specifically, do you have a debugger
statement that is hit often (like in a loop)?
When you moved the debugger
statement, did you restart the target application or do you hot-reload the scripts?
I think the location of the debugger
statement may have been a red herring. I added it to the top of a function that generates a HTML string. It should only be hit once. But even removing the statement causes the application to time out.
We have a watch task that recompiles the JavaScript bundle if any JS files are changed, so previously I was not restarting the application every time I made a change.
I will continue investigating, perhaps I'll try restarting the application every time I make a change.
I haven't succeeded in reproducing the issue, but I will push a change soon that improves logging a bit, which hopefully will give more clues.
Ok, pushed now. Could you please try again?
Thanks for that. I'm able to connect Chrome dev tools to the JS running in Nashorn, which is a step forward! I am now trying to formulate a workflow that allows me to do this consistently.
Part of the problem was that I was not always waiting long enough for the debugger to connect to the JavaScript. I guess I need to be more patient, but now I realise that I have to wait for the Greetings from ncdbg
message in the Chrome console. This occasionally takes a long time, at least on my machine 😅
The second part of the problem involves our watch task that recompiles the JavaScript. It seems that when the bundle is recompiled, something breaks in the debugger, and subsequent page refreshes time out with the message:
[RenderingException: Ask timed out on [Actor[akka://application/user/$a#143668542]] after [30000 ms]. Sender[null] sent message of type "rendering.core.Rendering".]
I will continue experimenting to find a workflow that works for me, but it would be useful to prove using a minimal example that recompiling JavaScript is causing an issue somehow.
Hm, that console message won't appear until after you have connected devtools to ncdbg, but the error happens when ncdbg is wiring up the devtools connection. I think I'm missing something here - could you please describe more in detail what you did when you saw the error?
When you write "something breaks in the debugger", do you mean ncdbg? The rendering error isn't something I recognize.
Sorry for the confusion here. The problems I have been experiencing are as a result of a number of different moving parts.
I have found a workflow that works, so I think I know where things are going wrong. Your instinct about the watch task that recompiles JavaScript appears to be correct.
My workflow:
debugger
statement to JavaScriptncdbg
ncdbg
connects, open the Chrome Devtools linkGreetings from ncdbg!
message appears, hit application page I want to debugdebugger
statement)RenderingException
error. I'm not sure why the app would time out here. If I terminate ncdbg
and reload the page, the application runs correctly, so I suspect ncdbg
is somehow causing the application to slow down while it is connected.
If I then reconnect ncdbg
and reload the page, it hits the debugger
statement at the new location.
I guess the substantive issue here is that changes to JavaScript in our application require a restart of ncdbg
. Is this known or expected behaviour?
Apologies for the slow progress on this issue!
How and when is the recompiled JavaScript loaded into Nashorn?
Also, have you seen the exception again?
Yes, sometimes I see the ActorInitializationException
when attempting to open the Chrome devtools link. It seems to be happening intermittently, so if I restart the application and reconnect ncdbg, it does not appear the next time.
The JavaScript is recompiled immediately when a JavaScript file is updated. It is loaded into Nashorn whenever a page is rendered (e.g. here which calls an abstraction around Nashorn's ScriptEngine.eval
here).
I recognise that the ActorInitializationException
issue and the timeout-on-recompilation issue are separate and probably unrelated, so I'd be happy to raise the latter as a separate issue.
Ok, the core problem is that the NashornDebuggerHost
class (NDH) is single-threaded, and there are cases where it may deadlock (such as loading a script that contains a debugger
statement when paused on a breakpoint - which is an issue I'll consider separately). If NDH is deadlocked then all operations will time out, so one possible explanation for the render timeout is that the script thread is paused on a breakpoint but NDH isn't available to handle it.
I assumed you loaded the scripts using load
which potentially could cause URL problems (different script contents, same URL), but I see that you eval
the scripts... so the URL thing shouldn't be a problem.
Before I refactor NDH to be multi-threaded, I'd like to know the cause of the deadlock. One idea I have is to create a separate branch with a bunch of println
statements that could help us narrow down the problem. Would that work for you?
Thanks very much, I'll be happy to help. Let me know what branch to use and I'll give it a test.
Could you please try the track-deadlock branch? There should be output like this:
Timed out waiting for 'public abstract ...
Thanks!
Here is the output I received while attempting to connect to Chrome devtools:
[12:24:00] simon_adcock:ncdbg git:(track-deadlock) $ ./gradlew run -Pargs="--connect 1044"
Starting a Gradle Daemon (subsequent builds will be faster)
> Task :run
Local Java version: 1.8.0_91
Connecting to localhost:1044...
Connected!
Remote VM information:
- Java Debug Interface (Reference Implementation) version 1.8
- Java Debug Wire Protocol (Reference Implementation) version 1.8
- JVM Debug Interface version 1.2
- JVM version 1.8.0_91 (Java HotSpot(TM) 64-Bit Server VM, mixed mode, sharing)
Server is listening on localhost:7778
Open this URL in Chrome: chrome-devtools://devtools/bundled/inspector.html?experiments=true&v8only=true&ws=localhost:7778/dbg
Receiving WebSocket clients at localhost/127.0.0.1:7778
Enabling breaking on exceptions in script classes.
Enabling automatic breaking at JavaScript 'debugger' statements
Host initialization is complete.
Adding script with URI 'eval:///Script1474'
Adding script with URI 'eval:///Script600'
Adding script with URI 'eval:///Script821'
Adding script with URI 'eval:///ScriptRecompilation1124A'
New WebSocket client from /127.0.0.1:58472 at endpoint '/dbg'.
A Developer Tools client connected!
Failed to handle message: Cannot create domain method Debugger.setAsyncCallStackDepth because the domain and/or method are unknown
Failed to handle message: Cannot create domain method Debugger.setBlackboxPatterns because the domain and/or method are unknown
Timed out waiting for 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]' to complete. Calls at entry: ''
Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]'
[ERROR] [09/12/2017 12:27:18.583] [default-akka.actor.default-dispatcher-2] [akka://default/user/Profiler-domact1] null
akka.actor.ActorInitializationException: akka://default/user/Profiler-domact1: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:175)
at akka.actor.ActorCell.create(ActorCell.scala:607)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:461)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)
at akka.dispatch.Mailbox.run(Mailbox.scala:223)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy3.events(Unknown Source)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.initHost(DomainActor.scala:44)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.preStart(DomainActor.scala:35)
at akka.actor.Actor.aroundPreStart(Actor.scala:505)
at akka.actor.Actor.aroundPreStart$(Actor.scala:505)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.aroundPreStart(DomainActor.scala:22)
at akka.actor.ActorCell.create(ActorCell.scala:590)
... 9 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]'
at com.programmaticallyspeaking.ncd.infra.ExecutorProxy$Handler.invoke(ExecutorProxy.scala:65)
... 16 more
[INFO] [09/12/2017 12:27:18.588] [default-akka.actor.default-dispatcher-3] [akka://default/user/Profiler-domact1] Message [com.programmaticallyspeaking.ncd.chrome.domains.Messages$Request] from Actor[akka://default/user/$a#-2097252900] to Actor[akka://default/user/Profiler-domact1#711657428] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[6]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]'
[ERROR] [09/12/2017 12:27:18.603] [default-akka.actor.default-dispatcher-2] [akka://default/user/Runtime-domact2] null
akka.actor.ActorInitializationException: akka://default/user/Runtime-domact2: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:175)
at akka.actor.ActorCell.create(ActorCell.scala:607)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:461)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)
at akka.dispatch.Mailbox.run(Mailbox.scala:223)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy3.events(Unknown Source)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.initHost(DomainActor.scala:44)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.preStart(DomainActor.scala:35)
at akka.actor.Actor.aroundPreStart(Actor.scala:505)
at akka.actor.Actor.aroundPreStart$(Actor.scala:505)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.aroundPreStart(DomainActor.scala:22)
at akka.actor.ActorCell.create(ActorCell.scala:590)
... 9 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[6]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]'
at com.programmaticallyspeaking.ncd.infra.ExecutorProxy$Handler.invoke(ExecutorProxy.scala:65)
... 16 more
[INFO] [09/12/2017 12:27:18.603] [default-akka.actor.default-dispatcher-2] [akka://default/user/Runtime-domact2] Message [com.programmaticallyspeaking.ncd.chrome.domains.Messages$Request] from Actor[akka://default/user/$a#-2097252900] to Actor[akka://default/user/Runtime-domact2#361313966] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[INFO] [09/12/2017 12:27:18.603] [default-akka.actor.default-dispatcher-2] [akka://default/user/Runtime-domact2] Message [com.programmaticallyspeaking.ncd.chrome.domains.Messages$Request] from Actor[akka://default/user/$a#-2097252900] to Actor[akka://default/user/Runtime-domact2#361313966] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[7]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[6]'
[ERROR] [09/12/2017 12:27:18.605] [default-akka.actor.default-dispatcher-6] [akka://default/user/Debugger-domact3] null
akka.actor.ActorInitializationException: akka://default/user/Debugger-domact3: exception during creation
at akka.actor.ActorInitializationException$.apply(Actor.scala:175)
at akka.actor.ActorCell.create(ActorCell.scala:607)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:461)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:483)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:282)
at akka.dispatch.Mailbox.run(Mailbox.scala:223)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy3.events(Unknown Source)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.initHost(DomainActor.scala:44)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.preStart(DomainActor.scala:35)
at akka.actor.Actor.aroundPreStart(Actor.scala:505)
at akka.actor.Actor.aroundPreStart$(Actor.scala:505)
at com.programmaticallyspeaking.ncd.chrome.domains.DomainActor.aroundPreStart(DomainActor.scala:22)
at akka.actor.ActorCell.create(ActorCell.scala:590)
... 9 more
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[7]' to complete. Calls at entry: 'public abstract void com.programmaticallyspeaking.ncd.nashorn.NashornScriptHost.handleOperation(com.programmaticallyspeaking.ncd.nashorn.NashornScriptOperation)(PostponeInitialize)[4]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[5]', 'public abstract com.programmaticallyspeaking.ncd.messaging.Observable com.programmaticallyspeaking.ncd.host.ScriptHost.events()()[6]'
at com.programmaticallyspeaking.ncd.infra.ExecutorProxy$Handler.invoke(ExecutorProxy.scala:65)
... 16 more
Observed termination of the last domain actor, disconnecting client.
[INFO] [09/12/2017 12:27:18.608] [default-akka.actor.default-dispatcher-6] [akka://default/user/Debugger-domact3] Message [com.programmaticallyspeaking.ncd.chrome.domains.Messages$Request] from Actor[akka://default/user/$a#-2097252900] to Actor[akka://default/user/Debugger-domact3#-687609193] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
[INFO] [09/12/2017 12:27:18.609] [default-akka.actor.default-dispatcher-6] [akka://default/user/Debugger-domact3] Message [com.programmaticallyspeaking.ncd.chrome.domains.Messages$Request] from Actor[akka://default/user/$a#-2097252900] to Actor[akka://default/user/Debugger-domact3#-687609193] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
<==========---> 80% EXECUTING [1m 44s]
> :run
Thanks! That lead to a suspicion, but more info is needed... I pushed a new change that dumps threads as well. Could you please test it? I changed the printlns to debug logging, so could you please post/attach the complete logs/ncdbg.log file (it may be a good idea to remove the current one before you test)?
Btw, I appreciate you patience in helping me find the cause of this!
No problem, thanks for sparing us so much time!
Let me know if you need any more information
Huh, at a first glance it looks like it's caused by the sheer amount of scripts (i.e., generated script classes) present in the target VM. Let me think about it!
Okay 😅 I could perhaps try not generating source maps, if you think that might help?
No, source maps should be fine - they're external files not loaded into Nashorn.
I've pushed a potential fix to the track-deadlock branch. Basically the fix chops up class scanning into chunks of 2 seconds. Could you please give it a try?
Until you see "Class scanning complete!", you may not see all scripts in DevTools yet.
Note that I chose 2 seconds pretty arbitrarily. Perhaps 1 second is better. Perhaps it should be configurable. :)
Have you had a chance to test?
Hi @provegard, apologies for the delay. I have tested and I am now able to consistently connect to Chrome devtools without any problem. I no longer see the ActorInitializationException
at all, which is great!
Thanks for looking into it, let me know if I can help any further
Excellent! I'll merge the test branch to master and close this issue.
Finding and reporting issues is great help! :-)
Fix included in release 0.5.6
Btw, I'd be interested in hearing your experience with debugging using source maps. I've found that quite often source maps do a bad job of mapping between source script and JavaScript.
Thanks again for fixing
I don't want to say source maps do a bad job as it would do a disservice to the people who are working to tackle this complex problem. I imagine it's difficult for browser vendors, transpilers and bundlers to continue supporting more and more source languages and features, and more source mapping strategies that yield improved performance or accuracy.
That said, here are some common problems I have experienced:
debugger
statements is usually accurate.Source maps are definitely not a solved problem, but debugging with source maps is in a much better place today than it was a year ago, with Chrome devtools leading the way, closely followed by Firefox. Chrome Canary and Firefox Developer Edition provide much better accuracy than their consumer counterparts. I honestly could not imagine developing without them, despite their imperfections. For instance, error reports sent to Sentry are usually accurate to the line.
Please let me know if you need any more specific examples. Is it on your road map to support source maps?
Perhaps I was unfair - but my impression is that source map generators can be a bit quirky. I've often observed the first problem you describe. I have studied a few generated source maps (for CoffeeScript and Python/Transcrypt, using https://sokra.github.io/source-map-visualization/), and sometimes the mappings don't make sense to me. I assume that generating "perfect" source map isn't highest on the priority list of transpiler developers (which makes sense to me). :-)
(Admittedly, it could also be that DevTools is a bit quirky in its source map parsing.)
Yes, ncdbg supports source maps. There are some limitations, the primary one being that ncdbg has to have access to reading the map file and the original source file (for non-inline maps). Source map paths must be absolute - in fact this is why scripts with relative paths are ignored, but I'm thinking about lifting that restriction and accepting that in those cases source maps won't work.
Furthermore, generated Java bytecode doesn't contain column numbers, so when DevTools talks column numbers, ncdbg simply has to be as flexible as it can. There may certainly be bugs in this area.
Thanks for drawing my attention to that visualization, it's given me a clearer perspective on the state of source mapping. I'll certainly be spending time better understanding how source maps work, as it's not something I've thought much about.
Part of my work remit right now is improving developer experience. I will give source maps a try in ncdbg
next week and let you know how I get on 😄
Hi
I have tried connecting
ncdbg
to a large Play application. We are currently in the process of extracting the rendering layer into JavaScript, bundled by Webpack and running on Nashorn.The first time I ran the debugger, it actually ran okay, and when I opened Chrome dev tools, it paused on a
debugger
statement in the JavaScript. However, I subsequently moved thedebugger
statement, and now attempts to open dev tools connected to the debug port result in an error (copy-pasted at the bottom of this description) after about 30 seconds.The application itself also fails with the error:
I have tried removing
ncdbg
from my computer and re-cloning it, removing thedebugger
statement, moving it back to its first location when it worked successfully, but I now always get this error.Do you know what may be causing it?