elyra-ai / elyra

Elyra extends JupyterLab with an AI centric approach.
https://elyra.readthedocs.io/en/stable/
Apache License 2.0

gcc on `kf-notebook` image #2924

Open dvaldivia opened 2 years ago

dvaldivia commented 2 years ago

Some Python packages need gcc at install time because their wheels have to be built from source. For example, pycocotools (which depends on numpy) requires gcc to complete installation. Since elyra/kf-notebook:3.11.0 doesn't ship with gcc, I was wondering whether making it standard in the container would be a good idea.
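For anyone who wants to confirm this from inside a running notebook, here is a minimal sketch using only the Python standard library. The helper names `find_c_compiler` and `describe_build_environment` are made up for illustration and are not part of Elyra. It only checks whether a compiler executable is on `PATH`, so a positive result does not guarantee that a given package will build (Python headers and other system libraries may still be missing).

```python
import platform
import shutil


def find_c_compiler(candidates=("gcc", "cc", "clang")):
    """Return the path of the first candidate compiler found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def describe_build_environment():
    """Collect basic facts relevant to building C extensions in this container."""
    compiler = find_c_compiler()
    return {
        "compiler": compiler,                    # e.g. a path to gcc, or None
        "has_c_compiler": compiler is not None,  # necessary, not sufficient, for source builds
        "machine": platform.machine(),           # CPU architecture reported by the OS
    }


if __name__ == "__main__":
    print(describe_build_environment())
```

If `has_c_compiler` comes back `False`, a `pip install` of a package that has no prebuilt wheel for the platform will most likely fail at the build step.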

The workaround is to run the notebook as root, but that can't be easily achieved from the Kubeflow UI and requires some Kubernetes hacking, so it might spook some data scientists hahaha

ptitzler commented 2 years ago

Some background information about our container images that might be helpful in this context.

Both the elyra and the kf-notebook container images (see the directories in https://github.com/elyra-ai/elyra/tree/main/etc/docker for how they are built) are not meant to be one-size-fits-all, production-ready solutions, because user requirements can vary widely. Just to give you two examples in https://github.com/elyra-ai/elyra/blob/main/etc/docker/kubeflow/Dockerfile, which is used to build the kf-notebook image:

So unless one uses Tekton or the example components, the published image isn't ideal, and users would be better off building a customized container image that doesn't expose functionality that is never used.

That said, a while back we briefly discussed whether or not to publish different flavors of those images (e.g. a -slim flavor that only includes the bare necessities), but we haven't come to an agreement. Having an image that does include gcc might be an option.

The other noteworthy aspect is that the images are built on top of other images that we don't maintain (see https://github.com/elyra-ai/elyra/blob/main/etc/docker/kubeflow/Dockerfile#L20 for kf-notebook's parent image). What's pre-installed in those images is not in our control and might change over time. (I've just realized that a v1.5 Jupyter image is available, so we might have to take a look at what has changed and whether an upgrade is warranted.)

akchinSTC commented 2 years ago

> are not meant to be one-size-fits-all production ready solutions

Agreed. All our provided images were intended to be merely base/example images that users can then extend into a more bespoke image tailored to their own needs.

> That said, a while back we did discuss briefly whether or not to publish different flavors of those images (e.g. a -slim which only includes the bare necessities) but haven't come to an agreement.

This would be OK, but we would need to figure out which packages would be deemed necessary, e.g. system-level vs. Python. IIRC, gcc is arch-specific. x86/amd64 would cover most of the use cases, but we would need to cut additional images in the future for other minimal base images.

ptitzler commented 2 years ago

I was thinking we could perhaps start by publishing elyra-slim / kf-notebook-slim which would not include any of the extras that the current elyra / kf-notebook pull in.