SpencerPark / IJava

A Jupyter kernel for executing Java code.
MIT License

Best way to bootstrap custom render functions #61

Open andrus opened 5 years ago

andrus commented 5 years ago

Thanks for the IJava kernel. I started using it with the pandas-inspired DFLib. Everything works great, and now I can do data science in pure Java! :)

A question, if I may. Per this task, I am trying to figure out the least intrusive way to add custom RenderFunctions to IJava to output DataFrames and Series. Using a JShell script with IJAVA_STARTUP_SCRIPTS_PATH requires manually setting the classpath to add the DFLib dependencies when starting Jupyter, so I decided against it. IJAVA_STARTUP_SCRIPT doesn't recognize the %maven magic, so I can't use that either.

The best solution I have found is to treat the DFLib integration as yet another custom library and simply paste it into every notebook:

%maven com.nhl.dflib:dflib-jupyter-ijava:0.6-SNAPSHOT
DFLibJupyter.bootstrap(getKernelInstance());

But I am wondering if there's a better way to install such extensions in IJava. It would be ideal to have a folder where I can drop the jars, so that they automatically appear on the classpath of both the IJAVA_STARTUP_SCRIPTS_PATH scripts and the notebook.

Thoughts?

SpencerPark commented 5 years ago

Please, ask away! Very excited to hear about your success using the kernel for data science.

I consider the fact that the startup scripts can't use the magics a bug, and I will open an issue for that accordingly. Still, I think your approach is the better one. We were just talking about dependencies in another issue, and I was saying how I prefer dependencies explicitly listed in the notebook; it makes notebooks more easily shareable (and reproducible).

This question came at a great time because I'm in the middle of a refactor on the restructure branch of the base libraries and was contemplating a better plugin/extension API (specifically https://github.com/SpencerPark/jupyter-jvm-basekernel/issues/18). Part of this includes extracting as much of the kernel API as possible into its own package, so that libraries just like dflib-jupyter-ijava can trim down their dependency.

Currently, what you have suggested is what I would also recommend. The dependency is reachable from anywhere and another user doesn't need to set up their system in any special way.

  %maven com.nhl.dflib:dflib-jupyter-ijava:0.6-SNAPSHOT
  DFLibJupyter.bootstrap(getKernelInstance());

The IJAVA_CLASSPATH variable is a decent option for your own setup if you want a single location on your system for all jars. Set it to /path/to/the/jars/*.jar or similar to always have those jars available when you start a kernel.

As for the best way, I would also like to ask for your opinion. In my experience, plugins usually ship a special meta file in the jar that specifies a main class (which would implement JupyterPlugin) and other potential options. As in Gradle, this file could be named after the plugin coordinates, or it could have a special name like jupyter-plugin.json or jupyter-plugin.properties. Loading could then be a special magic that adds a jar to the classpath and calls the plugin's lifecycle method(s) with the kernel instance. Open to suggestions here, but in your case something like %loadExtension com.nhl.dflib:dflib-jupyter-ijava:0.6-SNAPSHOT would handle everything.
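To make the meta-file idea concrete, here is a minimal sketch of how a loader might read such a descriptor and instantiate the plugin. Nothing here is an existing IJava API: the JupyterPlugin interface shape, its onLoad method, the main-class key, and the DemoPlugin class are all hypothetical names chosen for illustration, and the descriptor is read from a string rather than from a jar to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PluginDescriptorSketch {

    /** Hypothetical plugin contract; the real interface is not defined in this thread. */
    public interface JupyterPlugin {
        String onLoad(String kernelName);
    }

    /** Hypothetical plugin, standing in for what a library jar might provide. */
    public static class DemoPlugin implements JupyterPlugin {
        public DemoPlugin() {}

        @Override
        public String onLoad(String kernelName) {
            return "DemoPlugin loaded into " + kernelName;
        }
    }

    /** Reads the (assumed) "main-class" key and instantiates that class via its no-arg constructor. */
    public static JupyterPlugin instantiate(Properties descriptor, ClassLoader loader)
            throws ReflectiveOperationException {
        String mainClass = descriptor.getProperty("main-class");
        if (mainClass == null) {
            throw new IllegalArgumentException("descriptor has no main-class entry");
        }
        Class<?> type = Class.forName(mainClass, true, loader);
        // asSubclass fails fast if the named class does not implement the plugin contract.
        return type.asSubclass(JupyterPlugin.class).getDeclaredConstructor().newInstance();
    }

    /** Parses an in-memory descriptor, loads the plugin it names, and calls its lifecycle method. */
    public static String demo() {
        try {
            Properties descriptor = new Properties();
            // Stands in for the contents of a jupyter-plugin.properties file inside the jar.
            descriptor.load(new StringReader("main-class=" + DemoPlugin.class.getName()));
            JupyterPlugin plugin =
                    instantiate(descriptor, PluginDescriptorSketch.class.getClassLoader());
            return plugin.onLoad("ijava");
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("could not load plugin", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A %loadExtension magic along these lines would first resolve the coordinates and add the jar to the classpath, then perform a lookup like instantiate above with the kernel instance in place of the string argument.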

andrus commented 5 years ago

Thanks for the quick reply and for confirming my solution.

Regarding the plugin design: yes, %loadExtension com.nhl.dflib:dflib-jupyter-ijava:0.6-SNAPSHOT, with a lifecycle method called on startup, would be ideal. As for how to locate the "root" of the plugin, I would recommend the vanilla Java ServiceLoader:

  1. Say you have an io.github.spencerpark.ijava.JupyterPlugin class in IJava. Plugin authors would then package a service descriptor file in their jars as META-INF/services/io.github.spencerpark.ijava.JupyterPlugin. The contents of this file will be the fully qualified class name of the plugin implementation.

  2. When IJava finishes the extension classpath setup, it can load all plugins and call their lifecycle method:

for(JupyterPlugin p : ServiceLoader.load(JupyterPlugin.class)) {
    p.initialize(...);
}

This mechanism is reliable, ubiquitous, and well understood. We've been using it successfully in the Bootique project for years to enable module auto-loading. FWIW, it is even compatible with the new Java module system and module-info.java.
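For anyone following along, the two steps above can be sketched end to end in one runnable file. This is only an illustration under stated assumptions: the nested JupyterPlugin interface, its initialize(String) signature, and DemoPlugin are hypothetical, and a temporary directory containing just the service descriptor stands in for a real plugin jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class PluginDiscoverySketch {

    /** Hypothetical plugin contract; the real interface and signature are not settled here. */
    public interface JupyterPlugin {
        String initialize(String kernelName);
    }

    /** Hypothetical provider; ServiceLoader needs a public class with a public no-arg constructor. */
    public static class DemoPlugin implements JupyterPlugin {
        public DemoPlugin() {}

        @Override
        public String initialize(String kernelName) {
            return "DemoPlugin initialized for " + kernelName;
        }
    }

    /** Step 2: load every provider visible to the loader and call its lifecycle method. */
    public static List<String> initializeAll(ClassLoader loader, String kernelName) {
        List<String> results = new ArrayList<>();
        for (JupyterPlugin p : ServiceLoader.load(JupyterPlugin.class, loader)) {
            results.add(p.initialize(kernelName));
        }
        return results;
    }

    /** Step 1, simulated: a temp directory holding only the service descriptor acts as the jar. */
    public static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("plugin-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // File name = binary name of the service interface; content = provider class name.
            Files.write(services.resolve(JupyterPlugin.class.getName()),
                    DemoPlugin.class.getName().getBytes(StandardCharsets.UTF_8));
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] { root.toUri().toURL() },
                    PluginDiscoverySketch.class.getClassLoader())) {
                return initializeAll(loader, "ijava");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In a real plugin jar the descriptor would be a static resource under META-INF/services rather than a generated file, and the kernel would pass its own instance to the lifecycle method instead of a name.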

SpencerPark commented 5 years ago

Ah yes, ServiceLoader looks like the perfect fit! Thanks for the recommendation, great to know you've had good experiences with it and very happy to see a solution in the standard library.

I would like to have this in the api package of the Jupyter base libs, since a plugin like the one you are describing can be reused with any other kernels that end up being written on top of the base. IJava is just one of them; for example, a Groovy or Scala kernel could also benefit from the renderers your plugin would provide, and nothing should need to be rewritten.

Slightly unrelated, but as I'm enjoying the ServiceLoader idea, part of the upstream refactoring could provide many more things as services with this mechanism, kernels being one. I'm still trying to target Java 8 up there, so I'm not sure how deep into modules we'll get, but knowing the mechanism is "future proof" makes this a much more worthwhile investment.

Let's leave this open until things work their way back downstream, and also in case I have more questions :) Thanks for your expertise!

andrus commented 5 years ago

> I would like to have this in the api package of the Jupyter base libs, since a plugin like the one you are describing can be reused with any other kernels that end up being written on top of the base. IJava is just one of them; for example, a Groovy or Scala kernel could also benefit from the renderers

Good idea.

> part of the upstream refactoring could provide many more things as services with this mechanism, kernels being one.

Keep in mind that ServiceLoader (intentionally) doesn't resolve dependencies between services (i.e. it is not a DI container). It does seem perfect, though, for coarse-grained top-level things like kernels and custom plugins.