bernhard-42 opened 6 years ago
Merging #40 into master will decrease coverage by 25.82%. The diff coverage is 35.13%.
@@            Coverage Diff            @@
##           master      #40     +/-   ##
===========================================
- Coverage   96.61%   70.78%   -25.83%
===========================================
  Files           3        4        +1
  Lines          59       89       +30
  Branches        5       10        +5
===========================================
+ Hits           57       63        +6
- Misses          2       26       +24
Impacted Files | Coverage Δ | |
---|---|---|
src/jupyter_spark/magic.py | 0% <0%> (ø) | |
src/jupyter_spark/spark.py | 100% <100%> (ø) | :arrow_up: |
src/jupyter_spark/handlers.py | 100% <100%> (ø) | :arrow_up: |
src/jupyter_spark/__init__.py | 44.44% <25%> (-15.56%) | :arrow_down: |
Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34ab4bf...38ed34a. Read the comment docs.
Thanks for the contribution! I hope to have a deeper look early next week.
I have changed the code accordingly. I personally don't really like using internal APIs, but I understand your rationale; I marked it with a TODO.
A side note: if you work on a Hadoop cluster (as I do, hence the YARN stuff last time), polling uiWebUrl means hitting the Resource Manager twice a second. If many users do this at the same time, this can create quite some traffic. A less chatty approach might be to use sc.statusTracker in a background thread in the notebook, triggered by Jupyter cell hooks, and to communicate the status to the notebook JavaScript via the Jupyter comm layer - just an idea ...
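A rough sketch of that idea, assuming an ipykernel-based kernel; the comm target name `spark_progress` and the message format are placeholders, and the frontend would need to register a matching comm target:

```python
# Sketch: poll sc.statusTracker() in a daemon thread and push the
# status of active jobs to the frontend over a Jupyter comm, instead of
# polling the Spark UI (and hence the Resource Manager) over HTTP.
import threading
import time

from ipykernel.comm import Comm


def start_progress_reporter(sc, interval=0.5):
    # The frontend must have registered a comm target with this name.
    comm = Comm(target_name="spark_progress")

    def poll():
        tracker = sc.statusTracker()
        while True:
            jobs = []
            for job_id in tracker.getActiveJobsIds():
                info = tracker.getJobInfo(job_id)
                if info is not None:
                    jobs.append({"jobId": info.jobId, "status": info.status})
            comm.send({"activeJobs": jobs})
            time.sleep(interval)

    thread = threading.Thread(target=poll, daemon=True)
    thread.start()
    return thread
```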
Thanks. I'm sorry -- I think I wasn't clear earlier. If you grab the Spark context from the singleton, then the magic is completely optional in the common case. You would only need to use the magic if you explicitly want to set the URL. Would you mind updating this so the magic is optional (and users can continue working as they have been, unless this additional complexity is needed for them)?
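A sketch of the fallback being described, assuming the singleton in question is pyspark's internal `SparkContext._active_spark_context` (presumably the internal API flagged with a TODO above); the helper name is hypothetical:

```python
# Only use an explicitly configured URL (e.g. set via the magic) if
# present; otherwise fall back to the active SparkContext singleton.
# Note: _active_spark_context is an internal pyspark API and may change.
from pyspark import SparkContext


def resolve_spark_ui_url(explicit_url=None):
    """Hypothetical helper: prefer an explicit URL over the singleton."""
    if explicit_url is not None:
        return explicit_url
    sc = SparkContext._active_spark_context
    if sc is not None:
        return sc.uiWebUrl  # Spark UI URL, available since Spark 2.1
    return None
```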
@bernhard-42 : Hope I didn't scare you off by creating confusion. Your contribution is very much appreciated.
No worries - first I didn't have time and then I forgot about it ... I hope it now meets your expectations. If not, please feel free to accept it and adapt it as you need - that might actually be the faster process. I am happy either way.
@mdboom Any news regarding this? Or any other alternative solution for working with this extension on multiple tabs (each with a different Spark context and kernel)?
Is there any update on these changes getting pulled into the main project, or updates otherwise? This functionality would be very, very useful and the lack of it is a major block to using this extension.
Proposal for Issue 22:
In the Jupyter notebook, a Jupyter comm target is opened to listen for messages from the Python kernel. A new Jupyter magic uses this comm target to forward the Spark API URL to the notebook:

`%spark_progress spark`

where `spark` is the variable holding the Spark session, so the magic can use `globals()["spark"].sparkContext.uiWebUrl` to get the actual Spark API URL. Each call from the notebook JavaScript then forwards the Spark API URL as the query parameter `spark_url` to the backend handler, which uses it to build the backend URL. This allows for multiple SparkContexts in different tabs, and even works with the `spark.ui.port=0` setting.
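For illustration, a minimal sketch of the kernel side of this proposal, assuming IPython magics and an ipykernel comm; the comm target name, message keys, and class names are placeholders rather than the extension's actual API:

```python
# Sketch of the proposed %spark_progress magic: look up the named Spark
# session in the notebook namespace and forward its UI URL to the
# frontend over a Jupyter comm.
from ipykernel.comm import Comm
from IPython.core.magic import Magics, line_magic, magics_class


@magics_class
class SparkProgressMagics(Magics):
    @line_magic
    def spark_progress(self, line):
        """Usage: %spark_progress spark"""
        name = line.strip()
        session = self.shell.user_ns.get(name)
        if session is None:
            print("No variable named %r in the notebook namespace" % name)
            return
        url = session.sparkContext.uiWebUrl
        # The frontend side would register this comm target and pass the
        # URL on as the spark_url query parameter to the backend handler.
        comm = Comm(target_name="spark_progress")
        comm.send({"spark_url": url})


def load_ipython_extension(ipython):
    ipython.register_magics(SparkProgressMagics)
```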