EclairJS / eclairjs-node

Node.js API for Apache Spark with Remote Client
Apache License 2.0

Session Management #14

Open Brian-Burns-Bose opened 8 years ago

Brian-Burns-Bose commented 8 years ago

EclairJS node instances need the ability to control how many kernels are created and which kernel they connect to, and to determine whether API calls have already been executed in a kernel.

Scenarios

  1. If multiple instances of a node application are created, they may all want to connect to the same kernel or to separate kernels. Instances configured to connect to the same kernel may need to know whether any startup code has already been executed by that kernel. For example, if a node app starts a Spark streaming context, that streaming context can only be started once.
  2. If a kernel dies, the node app needs to be able to launch a new kernel and re-execute its Spark startup code (a rough recovery sketch follows below).
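
A minimal sketch of scenario 2, assuming a hypothetical `launchKernel()` helper that creates a kernel connection, a `runStartupCode()` helper that replays the app's Spark setup, and an assumed `close` event for detecting a dead kernel:

```javascript
// Sketch only: launchKernel() and runStartupCode() are hypothetical helpers,
// and the 'close' event is an assumed way to detect a dead kernel.
async function withKernelRecovery(launchKernel, runStartupCode) {
  let kernel;

  async function start() {
    kernel = await launchKernel();
    await runStartupCode(kernel);
    // If this kernel dies, launch a replacement and replay the startup code.
    kernel.on('close', start);
  }

  await start();
  return () => kernel; // callers always read the current kernel through this
}
```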

Questions

How do we determine whether API calls have already been executed in a kernel? One option would be to check whether a SparkContext has already been created.
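
A rough sketch of that check, assuming a hypothetical `executeOnKernel()` helper that runs a snippet in the remote JavaScript kernel and resolves with its printed result; the variable name `sc` is also just an assumption:

```javascript
// Probe the kernel for an existing SparkContext before running startup code.
// executeOnKernel() and the variable name `sc` are assumptions for this sketch.
async function sparkContextExists(executeOnKernel) {
  const result = await executeOnKernel('typeof sc !== "undefined"');
  return String(result).trim() === 'true';
}
```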

How do we determine that a kernel id is associated with a particular node app? Kernel ids are generated by the kernel gateway, so we have no control over them.

doronrosenberg commented 8 years ago

There are some other questions as well: should a require("eclairjs") object be bound to only one kernel? Currently it is, and we create the kernel connection when the user loads the eclairjs module.

I would say no. There should be a formal API (require("eclairjs").create(), for example) that creates a kernel connection, and the user should be able to retrieve the kernel id and pass it to a later create() call to connect to an existing kernel. Note that this would break things as they stand (for example, our variable name generation code has no knowledge of existing variables and would start from 0 again, causing conflicts).
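
A usage sketch of that proposal, not the current module behaviour; the `kernelId` property and the create(kernelId) overload are assumptions about how the proposal might look:

```javascript
const eclairjs = require('eclairjs');

async function main() {
  // create() would start a new kernel and resolve with a Spark API bound to it
  const spark = await eclairjs.create();
  const kernelId = spark.kernelId;          // assumed accessor for the kernel id

  // a second app instance could pass that id to attach to the existing kernel
  const sameSpark = await eclairjs.create(kernelId);
}
```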

doronrosenberg commented 8 years ago

Elaborating on the above discussion:

What a node application that uses EclairJS needs:

In regard to session management, the best solution seems to be:

This would require eclairjs-node to change: require("eclairjs") would now return an object with several methods on it (create() and connect/stop/restart/status(uuid)), and only during create/connect would it return an instance of the Spark API bound to a specific kernel.
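
A rough sketch of what that reworked module surface could look like; only the method names come from the comment above, everything else is assumed:

```javascript
// Sketch of the proposed module shape; bodies are placeholders.
module.exports = {
  create()      { /* start a new kernel, resolve with a Spark API bound to it */ },
  connect(uuid) { /* attach to an existing kernel by id                       */ },
  stop(uuid)    { /* shut the kernel down                                     */ },
  restart(uuid) { /* restart the kernel; startup code must be re-executed     */ },
  status(uuid)  { /* report whether the kernel is alive                       */ },
};
```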

Brian-Burns-Bose commented 8 years ago

A couple of thoughts on the API. We are really talking about two things: kernel management and SparkContext management. We need to define the SparkContext-to-kernel relationship, and then we can make decisions from there. Talking about kernels to a Node.js developer will get confusing; in our case a SparkContext really means a remote kernel holding a SparkContext reference. Let's focus this discussion on the SparkContext rather than the kernel. A SparkContext can take an app name as a parameter; let's use that as the identifier.

We don't have control over kernel id creation, so how do we map a kernel id to a SparkContext application name?
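
For illustration, this is roughly what "app name as the identifier" looks like from the node side; the eclairjs-node constructor signature here is an assumption:

```javascript
// Assumed constructor shape: the second argument is the Spark application
// name, which would double as the identifier for the remote context.
const spark = require('eclairjs');
const sc = new spark.SparkContext('local[*]', 'orders-dashboard');
```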

Brian-Burns-Bose commented 8 years ago

Some information regarding the notebook server's sessions API. The sessions API is meant for session management of a particular notebook, where a notebook is identified by a "path" / file name. If you send a POST to /api/sessions with a notebook path of "foo" and a kernel spec name, a kernel is created for that session. We could use the notebook path as our SparkContext app name; essentially we would be treating the SparkContext as a notebook session. The issue is that the kernel gateway doesn't implement /api/sessions, but we could fix that.
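
A sketch of that flow, treating the app name as the notebook path. The exact payload field names vary across notebook server versions, and node-fetch plus the "eclair" kernel spec name are used purely for illustration:

```javascript
const fetch = require('node-fetch');

// Create a session whose "notebook" path is the SparkContext app name, and
// return the id of the kernel the server starts for it.
async function createSparkSession(baseUrl, appName) {
  const res = await fetch(`${baseUrl}/api/sessions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      notebook: { path: appName },   // app name doubles as the notebook path
      kernel: { name: 'eclair' }     // kernel spec name is an assumption
    })
  });
  const session = await res.json();
  return session.kernel.id;          // kernel id assigned by the server
}
```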

doronrosenberg commented 8 years ago

One possible improvement from an API standpoint would be to allow passing in our own identifier during kernel creation, which we could query for later (in connectToKernel calls, for example). This would eliminate the need for Toree users to store the kernel id somewhere or to query the running kernels. In our case the identifier could simply be the SparkContext name.

This would have the additional benefit of making it possible to figure out what program a kernel is executing without having to look at the code the kernel has executed.
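
A hypothetical sketch of that improvement; neither the `name` option nor these method signatures exist today, they only illustrate tagging a kernel with the SparkContext name at creation time and finding it by that tag later:

```javascript
async function example(eclairjs) {
  // create a kernel tagged with the SparkContext name (hypothetical option)
  const spark = await eclairjs.create({ name: 'orders-dashboard' });

  // later, or from another app instance, look the kernel up by the same tag
  const existing = await eclairjs.connect({ name: 'orders-dashboard' });
}
```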

doronrosenberg commented 8 years ago

One thing I am seeing with the sessions API is that it always creates a Python kernel for me.