Open QuantumRich opened 7 years ago
It seems my thread does not hang if I go off Wi-Fi. Perhaps the ports I am opening are being used? (e.g. Connecting to: tcp://127.0.0.1:50627) Is there a way of switching these?
I have been seeing this as well lately. It's intermittent. I started to notice it yet after upgraded the notebook server to 4.3.x. In our javascript client I am going to try and update the jupyterlab/services module. We are at 0.34.1, it looks like the latest is 0.35.1.
On Tue, Jan 17, 2017 at 3:25 PM, Richard Quan notifications@github.com wrote:
It seems my thread does not hang if I go off Wi-Fi. Perhaps the ports I am opening are being used? (e.g. Connecting to: tcp://127.0.0.1:50627) Is there a way of switching these?
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/EclairJS/eclairjs/issues/20#issuecomment-273288993, or mute the thread https://github.com/notifications/unsubscribe-auth/AAiWKP5xz34NcFYEpaiAQ0OYkLVaoGzhks5rTSOzgaJpZM4LmCb6 .
Oh I'm glad I'm not alone in it. I get it more often when I am running eclair behind a webapp than if I were running eclair by itself as a script.
We can remove the code that is causing this as we never use the kernel info, but its somewhat disconcerting that its happening.
@bpburns we can take the commit that removes the code as well as the new jupyter security stuff fix and maybe backport em to the 0.10 branch and release a 0.10.1 to npm, unless there is anything else we want to wait for
Thank you very much for the updates. It was extra strange to me because all of my components were local and I couldn't use the Internet while running this.
UPDATE: It seems that I still have the same (more often than intermittent) problems even with the dev version of the repo. It may be my company's proxy then, which doesn't make sense given this project should not be connecting to an external source. Perhaps my websockets are blocked, but it doesn't make sense that it is intermittent.
The timeout error seems to be on the Jupyter side, I see it running regular Scala notebooks as well. Other than the message, are things working fine?
@doronrosenberg Yes it does seem like it's on the Jupyter side. It's not the message itself so much as the fact that the thread hangs after that. I get the message even if it works. But if it doesn't, all operations freeze and the Spark session is never initializes. Here is what I mean, shown below is the output from debug mode when it stops on me:
[I 19:18:30.700 NotebookApp] 0 active kernels
[I 19:18:30.700 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 19:18:30.700 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 19:18:34.573 NotebookApp] Sessions API changed, see updated swagger docs
[D 19:18:34.589 NotebookApp] Found kernel eclair in /Users/quantumrich/Library/Jupyter/kernels
[D 19:18:34.590 NotebookApp] Found kernel apache_toree_scala in /usr/local/share/jupyter/kernels
[D 19:18:34.591 NotebookApp] Native kernel (python2) available from /usr/local/lib/python2.7/site-packages/ipykernel/resources
[D 19:18:34.593 NotebookApp] Starting kernel: [u'/Users/quantumrich/Desktop/incubator-toree/dist/toree/bin/run.sh', u'--interpreter-plugin', u'eclair:org.eclairjs.nashorn.JavascriptInterpreter', u'--default-interpreter', u'eclair', u'--nosparkcontext', u'--profile', u'/Users/quantumrich/Library/Jupyter/runtime/kernel-d53113fa-f6c1-41f1-81cd-27264ed42655.json']
[D 19:18:34.598 NotebookApp] Connecting to: tcp://127.0.0.1:49606
[I 19:18:34.598 NotebookApp] Kernel started: d53113fa-f6c1-41f1-81cd-27264ed42655
[D 19:18:34.598 NotebookApp] Kernel args: {'kernel_name': u'eclair', 'cwd': u'/Users/quantumrich'}
[D 19:18:34.599 NotebookApp] 201 POST /api/sessions?1484785114561 (127.0.0.1) 27.14ms
Starting Spark Kernel with SPARK_HOME=/usr/local/spark
[D 19:18:34.610 NotebookApp] 200 GET /api/kernels/d53113fa-f6c1-41f1-81cd-27264ed42655?1484785114605 (127.0.0.1) 2.13ms
[D 19:18:34.619 NotebookApp] Initializing websocket connection /api/kernels/d53113fa-f6c1-41f1-81cd-27264ed42655/channels
[D 19:18:34.620 NotebookApp] Requesting kernel info from d53113fa-f6c1-41f1-81cd-27264ed42655
[D 19:18:34.621 NotebookApp] Connecting to: tcp://127.0.0.1:49603
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
17/01/18 19:18:35 [INFO] o.a.t.Main$$anon$1 - Kernel version: 0.2.0.dev1-incubating-SNAPSHOT
17/01/18 19:18:35 [INFO] o.a.t.Main$$anon$1 - Scala version: Some(2.11.8)
17/01/18 19:18:35 [INFO] o.a.t.Main$$anon$1 - ZeroMQ (JeroMQ) version: 3.2.5
17/01/18 19:18:35 [INFO] o.a.t.Main$$anon$1 - Initializing internal actor system
17/01/18 19:18:36 [INFO] o.a.t.Main$$anon$1 - Connection Profile: {
"stdin_port" : 49605,
"control_port" : 49606,
"hb_port" : 49607,
"shell_port" : 49603,
"iopub_port" : 49604,
"ip" : "127.0.0.1",
"transport" : "tcp",
"signature_scheme" : "hmac-sha256",
"key" : "dcb58b14-1f04-41a4-a925-bd5fd1e7382c"
}
(SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@3b705be7)
(Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@3a790e40)
(PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@43acd79e)
(eclair,org.eclairjs.nashorn.JavascriptInterpreter@1e1b061)
(SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@38f77cd9)
17/01/18 19:18:37 [WARN] o.a.t.Main$$anon$1 - No external magics provided to PluginManager!
17/01/18 19:18:39 [INFO] o.a.t.Main$$anon$1 - 13 internal plugins loaded
17/01/18 19:18:39 [INFO] o.a.t.Main$$anon$1 - 0 external plugins loaded
17/01/18 19:18:42 [WARN] o.a.t.k.i.s.ScalaInterpreter - kernel variable: org.apache.toree.boot.layer.StandardComponentInitialization$$anon$1@5777e443
17/01/18 19:18:42 [WARN] o.a.t.k.i.s.ScalaInterpreter - Binding List(@transient implicit) kernel org.apache.toree.kernel.api.Kernel org.apache.toree.boot.layer.StandardComponentInitialization$$anon$1@5777e443
17/01/18 19:18:43 [INFO] o.a.t.k.i.s.ScalaInterpreter - Binding SparkContext into interpreter as sc
Start kernel init
END kernel init
17/01/18 19:18:43 [INFO] o.a.t.Main$$anon$1 - Marking relay as ready for receiving messages
17/01/18 19:18:43 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Unstashing all messages received!
17/01/18 19:18:43 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Relay is now fully ready to receive messages!
[W 19:18:44.626 NotebookApp] Timeout waiting for kernel_info reply from d53113fa-f6c1-41f1-81cd-27264ed42655
[D 19:18:44.626 NotebookApp] Opening websocket /api/kernels/d53113fa-f6c1-41f1-81cd-27264ed42655/channels
[D 19:18:44.627 NotebookApp] Connecting to: tcp://127.0.0.1:49603
[D 19:18:44.627 NotebookApp] Connecting to: tcp://127.0.0.1:49604
[D 19:18:44.627 NotebookApp] Connecting to: tcp://127.0.0.1:49605
Meanwhile shown below is what it looks like when it does connect on rare occasions. It seems that if I get the 'Relay is ready' message BEFORE I connect to the 3 tcp ports it freezes.
[I 19:21:30.520 NotebookApp] 0 active kernels
[I 19:21:30.520 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 19:21:30.520 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 19:21:37.158 NotebookApp] Sessions API changed, see updated swagger docs
[D 19:21:37.178 NotebookApp] Found kernel eclair in /Users/quantumrich/Library/Jupyter/kernels
[D 19:21:37.179 NotebookApp] Found kernel apache_toree_scala in /usr/local/share/jupyter/kernels
[D 19:21:37.179 NotebookApp] Native kernel (python2) available from /usr/local/lib/python2.7/site-packages/ipykernel/resources
[D 19:21:37.180 NotebookApp] Starting kernel: [u'/Users/quantumrich/Desktop/incubator-toree/dist/toree/bin/run.sh', u'--interpreter-plugin', u'eclair:org.eclairjs.nashorn.JavascriptInterpreter', u'--default-interpreter', u'eclair', u'--nosparkcontext', u'--profile', u'/Users/quantumrich/Library/Jupyter/runtime/kernel-be95f4d2-3cd4-467c-858a-bf2c41c5add1.json']
[D 19:21:37.185 NotebookApp] Connecting to: tcp://127.0.0.1:49804
[I 19:21:37.186 NotebookApp] Kernel started: be95f4d2-3cd4-467c-858a-bf2c41c5add1
[D 19:21:37.186 NotebookApp] Kernel args: {'kernel_name': u'eclair', 'cwd': u'/Users/quantumrich'}
[D 19:21:37.187 NotebookApp] 201 POST /api/sessions?1484785297144 (127.0.0.1) 30.34ms
Starting Spark Kernel with SPARK_HOME=/usr/local/spark
[D 19:21:37.200 NotebookApp] 200 GET /api/kernels/be95f4d2-3cd4-467c-858a-bf2c41c5add1?1484785297192 (127.0.0.1) 1.41ms
[D 19:21:37.209 NotebookApp] Initializing websocket connection /api/kernels/be95f4d2-3cd4-467c-858a-bf2c41c5add1/channels
[D 19:21:37.212 NotebookApp] Requesting kernel info from be95f4d2-3cd4-467c-858a-bf2c41c5add1
[D 19:21:37.212 NotebookApp] Connecting to: tcp://127.0.0.1:49801
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
17/01/18 19:21:38 [INFO] o.a.t.Main$$anon$1 - Kernel version: 0.2.0.dev1-incubating-SNAPSHOT
17/01/18 19:21:38 [INFO] o.a.t.Main$$anon$1 - Scala version: Some(2.11.8)
17/01/18 19:21:38 [INFO] o.a.t.Main$$anon$1 - ZeroMQ (JeroMQ) version: 3.2.5
17/01/18 19:21:38 [INFO] o.a.t.Main$$anon$1 - Initializing internal actor system
17/01/18 19:21:39 [INFO] o.a.t.Main$$anon$1 - Connection Profile: {
"stdin_port" : 49803,
"control_port" : 49804,
"hb_port" : 49805,
"shell_port" : 49801,
"iopub_port" : 49802,
"ip" : "127.0.0.1",
"transport" : "tcp",
"signature_scheme" : "hmac-sha256",
"key" : "a8a1a4a6-dc53-4680-8cf0-eed9b66e23cd"
}
17/01/18 19:21:39 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Not ready for messages! Stashing until ready!
(SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@63a28987)
(Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@120d3fd)
(PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@38c9e0d6)
(eclair,org.eclairjs.nashorn.JavascriptInterpreter@4693a9ef)
(SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@310aee0b)
17/01/18 19:21:41 [WARN] o.a.t.Main$$anon$1 - No external magics provided to PluginManager!
17/01/18 19:21:42 [INFO] o.a.t.Main$$anon$1 - 13 internal plugins loaded
17/01/18 19:21:42 [INFO] o.a.t.Main$$anon$1 - 0 external plugins loaded
17/01/18 19:21:46 [WARN] o.a.t.k.i.s.ScalaInterpreter - kernel variable: org.apache.toree.boot.layer.StandardComponentInitialization$$anon$1@188467fe
17/01/18 19:21:46 [WARN] o.a.t.k.i.s.ScalaInterpreter - Binding List(@transient implicit) kernel org.apache.toree.kernel.api.Kernel org.apache.toree.boot.layer.StandardComponentInitialization$$anon$1@188467fe
17/01/18 19:21:47 [INFO] o.a.t.k.i.s.ScalaInterpreter - Binding SparkContext into interpreter as sc
[W 19:21:47.217 NotebookApp] Timeout waiting for kernel_info reply from be95f4d2-3cd4-467c-858a-bf2c41c5add1
[D 19:21:47.218 NotebookApp] Opening websocket /api/kernels/be95f4d2-3cd4-467c-858a-bf2c41c5add1/channels
[D 19:21:47.218 NotebookApp] Connecting to: tcp://127.0.0.1:49801
[D 19:21:47.219 NotebookApp] Connecting to: tcp://127.0.0.1:49802
[D 19:21:47.219 NotebookApp] Connecting to: tcp://127.0.0.1:49803
17/01/18 19:21:47 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Not ready for messages! Stashing until ready!
Start kernel init
END kernel init
17/01/18 19:21:47 [INFO] o.a.t.Main$$anon$1 - Marking relay as ready for receiving messages
17/01/18 19:21:47 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Unstashing all messages received!
17/01/18 19:21:47 [INFO] o.a.t.k.p.v.r.KernelMessageRelay - Relay is now fully ready to receive messages!
EXEC=var SparkSession = require(EclairJS_Globals.NAMESPACE + '/sql/SparkSession');
var builder1 = SparkSession.builder();
RESULT=null
17/01/18 19:21:47 [WARN] o.a.t.k.p.v.s.KernelOutputStream - Suppressing empty output: ''
EXEC=var builder2 = builder1.appName("Spark SQL Example");
RESULT=null
17/01/18 19:21:47 [WARN] o.a.t.k.p.v.s.KernelOutputStream - Suppressing empty output: ''
EXEC=var sparkSession1 = builder2.getOrCreate();
17/01/18 19:21:47 [WARN] o.a.h.u.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/01/18 19:21:49 [WARN] o.a.s.SparkContext - Use an existing SparkContext, some configuration may not take effect.
RESULT=null
17/01/18 19:21:49 [WARN] o.a.t.k.p.v.s.KernelOutputStream - Suppressing empty output: ''
It is very annoying for me to develop in an enterprise setting since I've relegated to turning off my computer's WiFi before I connect the websocket, and then doing all my spark operations after I get onto the network and the websocket has already connected. That is my workaround.
@bpburns knows jupyter and toree the best, any idea if oyu can tell it what ports to use? Might be the reason it sometimes works.
@doronrosenberg @QuantumRich you can pass those into Toree's run.sh script. Take a look at run.sh.
Having read through this https://github.com/jupyter/notebook/issues/297 . It could be certain libraries conflicting. My guess would be zero mq. For my notebook environment, I'm running Anaconda https://www.continuum.io/anaconda-overview. When I upgraded my notebook environment it seemed to help. It upgraded several libraries. I ran conda upgrade notebook
. I've since successfully started Toree 6 times in a row so far.
UPDATE:
It turns out this is the way to set the ports manually:
export TOREE_OPTS="--shell-port=61642 --iopub-port=61643"
Also, the connection hanging problem seems to be based on these two ports specifically while I am on the proxy. I am still trying to figure out how to change these two ports to allow the socket connection past the corporate firewall.
I seem to be getting this issue from Jupyter when connecting now. It was not a problem on Friday, but now I consistently I get this error. Perhaps it may have to do with 'Sessions API changed, see updated swagger docs'?