dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Configurable jupyter kernelspecs for dagstermill #845

Open mgasner opened 5 years ago

mgasner commented 5 years ago

This is sort of two issues.

  1. We need some way of giving users knobs to control the kernelspec when they scaffold a new notebook using dagstermill create-notebook. In particular, if I'm running python3, I may want to scaffold a new notebook that uses my existing python2 kernel.

  2. It might be reasonable to let users configure kernelspecs on the fly when dagstermill actually executes a notebook. Kernelspecs could be exposed as a resource, or as solid-level config, and injected into the notebook at runtime.
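For 1), a minimal sketch of what a kernel-aware scaffold could produce. The function name `scaffold_notebook` and its parameters are hypothetical and not part of dagstermill; the only assumption about the file format is that an nbformat 4 notebook stores its kernel under `metadata.kernelspec`.

```python
import json


def scaffold_notebook(kernel_name, display_name=None, language="python"):
    """Build an empty nbformat-4 notebook dict pinned to the given kernel.

    Hypothetical helper: illustrates the shape of the output only, not how
    dagstermill create-notebook is implemented.
    """
    return {
        "cells": [],
        "metadata": {
            "kernelspec": {
                "name": kernel_name,
                "display_name": display_name or kernel_name,
                "language": language,
            }
        },
        "nbformat": 4,
        "nbformat_minor": 2,
    }


# Scaffold against an existing python2 kernel while running under python3.
nb = scaffold_notebook("python2", display_name="Python 2")
notebook_json = json.dumps(nb, indent=1)
```

The kernel name has to match a kernelspec that is actually installed on the machine that later executes the notebook.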

mgasner commented 5 years ago

For 2), our test story would be simplified if we could test notebooks on py2 and py3 just by switching kernelspec config. See: https://github.com/dagster-io/dagster/blob/master/python_modules/dagstermill/dagstermill_tests/test_basic_dagstermill_solids.py#L17

mgasner commented 5 years ago

cc @schrockn

mgasner commented 5 years ago

https://github.com/nteract/papermill/issues/338

mgasner commented 4 years ago

"The second issue was an exception thrown from papermill here in papermill/execute.py

# Fetch the kernel name if it's not supplied
kernel_name = kernel_name or nb.metadata.kernelspec.name

I was able to work around this by injecting a kernel name in the ipynb. I created the notebook with the dagstermill CLI and it didn't have this. It would be nice if the kernel name could be specified as config to the solid. Edit: the exception was a key error that 'kernelspec' didn't exist."
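The workaround described in the quote can be sketched roughly as follows, treating the .ipynb as plain JSON. The helper name `ensure_kernelspec` and the default kernel name `python3` are assumptions made for illustration; this is not code from dagstermill or papermill.

```python
def ensure_kernelspec(nb, name="python3", display_name="Python 3", language="python"):
    """Add a kernelspec to a parsed notebook dict if it has none.

    An existing kernelspec is left untouched, so notebooks that already
    name a kernel keep it.
    """
    metadata = nb.setdefault("metadata", {})
    metadata.setdefault(
        "kernelspec",
        {"name": name, "display_name": display_name, "language": language},
    )
    return nb


# A notebook without kernelspec metadata, like the one described in the quote.
bare = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}
patched = ensure_kernelspec(bare)

# A notebook that already names a kernel is not modified.
pinned = {"cells": [], "metadata": {"kernelspec": {"name": "python2"}}}
unchanged = ensure_kernelspec(pinned)
```

In practice the dict would come from `json.load` on the .ipynb file and be written back with `json.dump` before the notebook is executed.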

mgasner commented 4 years ago

On part 2) here, this is most like "engine" config and I lean toward letting dagstermill solids take an additional config block (perhaps under the key 'dagstermill') to specify that they be executed with a given kernelspec.
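A rough sketch of how such a config block might be resolved at execution time. The 'dagstermill' key comes from the proposal above; the `kernel_name` field inside it, the helper `resolve_kernel_name`, and the fallback order are all assumptions. The resolved value could then be handed to papermill, whose `execute_notebook` accepts a `kernel_name` argument.

```python
def resolve_kernel_name(solid_config, nb_metadata, default=None):
    """Pick the kernel for a notebook solid (hypothetical helper).

    Assumed precedence: solid config under the 'dagstermill' key, then the
    notebook's own kernelspec metadata, then an optional default.
    """
    dagstermill_config = (solid_config or {}).get("dagstermill") or {}
    configured = dagstermill_config.get("kernel_name")
    if configured:
        return configured

    from_notebook = ((nb_metadata or {}).get("kernelspec") or {}).get("name")
    if from_notebook:
        return from_notebook

    if default is not None:
        return default

    # Fail with a clear message instead of a key error deep inside execution.
    raise ValueError(
        "No kernel name found in solid config or notebook metadata"
    )


# Config wins over whatever the notebook itself declares.
from_config = resolve_kernel_name(
    {"dagstermill": {"kernel_name": "python2"}},
    {"kernelspec": {"name": "python3"}},
)

# Without config, fall back to the notebook's kernelspec.
from_metadata = resolve_kernel_name({}, {"kernelspec": {"name": "python3"}})
```

Resolving the name up front means a notebook scaffolded without a kernelspec produces an explicit error, or picks up the configured kernel, rather than failing on the missing 'kernelspec' key.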