jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Session Name with Session ID Suffix added dynamically? #625

Open smic-datalabs-von opened 4 years ago

smic-datalabs-von commented 4 years ago

My team connects to a Dockerized JupyterHub that spawns JupyterLab containers with sparkmagic installed. The problem is that we cannot tell which user owns which session when viewing the Spark Master UI.

We thought that placing a modified config.json file, with the name key set under the session_configs key, would take care of this. For example:

{
  "kernel_python_credentials" : {
    "username": "",
    "password": "",
    "url": "http://xxxx:8998",
    "auth": "None"
  },
  "session_configs": {
    "name": "jerry"
  }
}

However, we ran into another problem: if the same user opens a second PySpark kernel, it raises an error saying that a session with the same name (jerry in this case) already exists.

What we want is for each session opened by a user with the jerry config to get a name like this: jerry-<session-id>.

So for example, each session opened by jerry should be jerry-1, jerry-2, and so on.
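To make the requested numbering concrete, here is a minimal sketch of picking the next free suffix for a prefix. It is not a sparkmagic feature: `next_session_name` is a hypothetical helper, and the list of existing session names is assumed to come from somewhere else (for example, from the Livy server).

```python
import re


def next_session_name(prefix, existing_names):
    """Return '<prefix>-<n>' using the smallest positive n not already taken.

    Hypothetical helper for illustration only; not part of sparkmagic.
    """
    # Match only names of the exact form '<prefix>-<digits>'.
    pattern = re.compile(re.escape(prefix) + r"-(\d+)")
    used = set()
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            used.add(int(match.group(1)))
    n = 1
    while n in used:
        n += 1
    return f"{prefix}-{n}"


print(next_session_name("jerry", ["jerry-1", "tom-1", "jerry-2"]))  # jerry-3
```

Note that two kernels starting at the same moment could still pick the same number, so a real implementation would probably need a retry when the name is rejected.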

We use Spark Master UI as our main monitoring view for sessions opened in our Spark cluster.

Is my request possible?

itamarst commented 4 years ago

Can you use the mechanism described in https://github.com/jupyter-incubator/sparkmagic/#conf-overrides-in-code to override the config in your code?

smic-datalabs-von commented 4 years ago

Yes, but that would defeat the purpose of assigning session names automatically. Most of our users would forget to do this. Moreover, we are not keen on requiring sparkmagic-specific preamble code in every notebook.

juhoautio commented 4 years ago

Is this still relevant?

To keep backward compatibility, maybe it could be a new property instead, e.g.:

  "session_configs": {
    "name_prefix": "jerry"
  }
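
A minimal sketch of how such a property might be resolved in a backward-compatible way. Both the `name_prefix` key and the `resolve_session_name` helper are hypothetical (neither exists in sparkmagic), and the random suffix stands in for the session ID, which is assumed not to be known before the session is created.

```python
import copy
import uuid


def resolve_session_name(config):
    """Return a copy of config with a hypothetical 'name_prefix' turned into 'name'.

    An explicit 'name' is left untouched, so existing configs behave as before.
    """
    resolved = copy.deepcopy(config)
    session_configs = resolved.get("session_configs")
    if not isinstance(session_configs, dict):
        return resolved
    # Remove the hypothetical key so only standard keys remain in the result.
    prefix = session_configs.pop("name_prefix", None)
    if prefix is not None and "name" not in session_configs:
        # Short random suffix so concurrent kernels of one user do not collide.
        session_configs["name"] = f"{prefix}-{uuid.uuid4().hex[:8]}"
    return resolved


config = {"session_configs": {"name_prefix": "jerry"}}
print(resolve_session_name(config)["session_configs"]["name"])
```

A random suffix avoids having to ask the server which names are taken, at the cost of less readable names than jerry-1, jerry-2.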

nsaintarnaud commented 3 years ago

Still very relevant: we have the exact same issue. We want to identify the user, yet need to allow multiple concurrent kernels for the same user.

I like juhoautio's parameter suggestion

gthomas-slack commented 3 years ago

Wondering if anyone has found a solution to this issue?

smic-datalabs-von commented 3 years ago

Not sure if this project is still active or went private. I am looking for alternatives at this point. Any suggestions?