Open mirekphd opened 3 years ago
[... ] will be unacceptable in any corporate security-restricted environment [...]
You are absolutely right. To address this, the container images used to run notebooks or Python scripts can also be stored in private/local container registries, as mentioned here: https://elyra.readthedocs.io/en/latest/user_guide/runtime-image-conf.html#prerequisites. I just did notice this in the documentation, which we need to fix.
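For anyone who wants to check a pipeline file before running it in a restricted environment, here is a rough sketch of an audit helper. It is not part of Elyra. It assumes that image references are stored under a key named "runtime_image" somewhere in the pipeline JSON, and the registry prefix in the allow-list is a made-up placeholder.

```python
# Hypothetical allow-list of registry prefixes; replace with your own.
APPROVED_REGISTRIES = ("registry.internal.example.com/",)


def find_runtime_images(node):
    """Recursively collect every string stored under a "runtime_image" key.

    The key name is an assumption about the pipeline file layout.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "runtime_image" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(find_runtime_images(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_runtime_images(item))
    return found


def unapproved_images(pipeline, approved=APPROVED_REGISTRIES):
    """Return image references that do not start with an approved prefix."""
    return [
        image
        for image in find_runtime_images(pipeline)
        if not image.startswith(tuple(approved))
    ]


# Illustrative structure only; a real pipeline file may be laid out differently.
example_pipeline = {
    "pipelines": [
        {
            "nodes": [
                {"app_data": {"runtime_image": "amancevice/pandas"}},
                {"app_data": {"runtime_image": "registry.internal.example.com/ds/pandas"}},
            ]
        }
    ]
}

flagged = unapproved_images(example_pipeline)
print(flagged)
```

To audit a real file, load it with json.load and pass the resulting object to unapproved_images; any image that is flagged would need to be mirrored to an approved registry and referenced from there.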
[...] if there is a truly local option - running in the same docker container (supplied and approved by the corporation) where Jupyter Lab is being executed?
Yes. The "local" runtime configuration is provided for that exact purpose. It is defined by default (unlike Kubeflow Pipelines configuration), and "local" refers to the machine where JupyterLab is running. We currently don't have this documented at https://elyra.readthedocs.io/en/latest/user_guide/runtime-conf.html, but probably should.
Transferring the issue to elyra-ai/elyra.
@ptitzler thank you for addressing both our concerns!
So if the docker image running Jupyter and Elyra can be completely user-defined and generic (i.e. without any custom runtime applications) and can be scanned for vulnerabilities and perhaps even hosted in an internal image registry (like Red Hat Container Registry), then it looks like "problem solved" and just a documentation issue:) I will give it a solid test drive in our on-prem Openshift installation when the time permits (which is most likely next weekend:)
Absolutely! We do publish an official Elyra container image based on the files in https://github.com/elyra-ai/elyra/tree/master/etc/docker/elyra that you could use as a baseline. Feel free to open an issue if you do run into trouble getting an image to work or reach out on https://gitter.im/elyra-ai/community. @lresende, @akchinSTC fyi
Opened https://github.com/elyra-ai/elyra/issues/1350 to improve the content for custom Elyra container images
Opened #1350 to improve the content for custom Elyra container images
Perfect, so I will report any problems I might encounter with the Elyra container under Openshift in this new issue.
I should probably mention that Elyra can be installed as part of Open Data Hub on Red Hat OpenShift: https://elyra.readthedocs.io/en/latest/recipes/deploying-elyra-with-opendatahub.html
I should probably mention that Elyra can be installed as part of Open Data Hub on Red Hat OpenShift: https://elyra.readthedocs.io/en/latest/recipes/deploying-elyra-with-opendatahub.html
Yes, I've heard about it from Red Hat people, but the only problem (as with Kubeflow on its own) is our obsolete Openshift installation (still running 3.11).
It turns out (my apologies) that container images built using https://github.com/elyra-ai/elyra/blob/master/etc/docker/elyra/Dockerfile won't work on RHOS. We've still got some work to do documenting how we are building the image for Open Data Hub.
how we are building the image for Open Data Hub.
I understand you completely, it's never been easy to port containerized apps from Docker to Openshift due to additional security considerations (Jupyter Notebook was I think the only one working there out of the box:)
I can't seem to find the "local" runtime option. Is it at all possible to use Elyra, but disable all containerization options and just use the available (Python) kernels on the host that runs JupyterLab?
I can't seem to find the "local" runtime option.
To use the local runtime you must create pipelines using the generic pipeline editor:
When you click "run" the option is displayed:
This option is unavailable in the Kubeflow Pipelines and Airflow pipeline editors.
Is it at all possible to use Elyra, but disable all containerization options and just use the available (Python) kernels on the host that runs JupyterLab?
Not currently. The options for Airflow and Kubeflow Pipelines are always displayed by the UI.
This "call home" (the docker pull in https://github.com/elyra-ai/examples/blob/master/pipelines/dax_noaa_weather_data/analyze_NOAA_weather_data.pipeline#L17 of the "Pandas" docker image (https://hub.docker.com/r/amancevice/pandas/) from your Docker Hub account) will be unacceptable in any corporate security-restricted environment, where only security-approved docker containers and image registries are permitted (even if not running on completely air-gapped servers). In fact, one should avoid any docker pull operations whatsoever (as this is what was assumed to work here), as these require root-level privileges and are unlikely to work (unless the notebook is run as root).
I might have missed it when reading the docs, so please advise: is there a truly local option, running in the same docker container (supplied and approved by the corporation) where Jupyter Lab is being executed? I mean, it is a false premise that this environment must be resource-poor. The Jupyter client stays in a browser on a thin client machine, correct, but the Python kernel is nearly always run server-side, on a large compute node. There is no need to improve on this client-server architecture, which already works fine for individual notebooks. Just let the user scripts (the pipeline code payload) run on the same machine (but in a dedicated Python kernel!) as the controlling notebook. This is how papermill works, by the way (no dependency on external docker images).
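To make the request concrete, here is a minimal sketch of what "run the payload on the same machine, in a dedicated Python process" could look like. It is neither Elyra's nor papermill's implementation. It uses plain Python scripts rather than notebooks, and the function name run_steps_locally is made up for illustration.

```python
import subprocess
import sys


def run_steps_locally(script_paths, timeout=None):
    """Run each script in its own Python process on the local machine, in order.

    Returns a list of (path, return code, captured stdout) tuples and stops
    after the first step that exits with a non-zero return code.
    """
    results = []
    for path in script_paths:
        completed = subprocess.run(
            [sys.executable, str(path)],  # same interpreter as the caller
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        results.append((str(path), completed.returncode, completed.stdout))
        if completed.returncode != 0:
            break  # later steps depend on earlier ones, so stop here
    return results
```

No image is pulled and no container runtime is involved; each step only sees whatever packages are already installed in the approved image that hosts JupyterLab.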