kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0

Improve `kedro jupyter setup` with options from `ipykernel install` #3777

Closed Lasica closed 3 months ago

Lasica commented 3 months ago

Description

Some of the additional options for installing kernels could be useful in some contexts. The ipykernel module handles this well (see `python -m ipykernel install --help`).

Context

I ran into an issue when working with PySpark: it needs environment variables that point to the correct Python interpreter. Setting those variables in Jupyter is possible, and ipykernel supports it via `ipykernel install --env ...`. I thought that Kedro used this under the hood to register its own kernel with some edits, but looking at the code, it uses a custom `_create_kernel` method that also uses ipykernel's install.
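To illustrate what the `--env` option amounts to, here is a minimal sketch of a kernel spec carrying an `env` block. It writes a `kernel.json` by hand into a temporary directory rather than calling ipykernel or Kedro. The helper name `write_kernel_spec`, the directory name, and the display name are hypothetical, and pointing `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` at the current interpreter is only an assumption about what "the correct python" would be in a given setup.

```python
import json
import sys
import tempfile
from pathlib import Path


def write_kernel_spec(kernel_dir: Path, display_name: str, env: dict) -> Path:
    """Write a minimal kernel.json with an "env" block (hypothetical helper)."""
    spec = {
        # Standard launch command for an IPython kernel.
        "argv": [sys.executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
        # Environment variables Jupyter sets when it starts this kernel.
        "env": env,
    }
    kernel_dir.mkdir(parents=True, exist_ok=True)
    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=1))
    return path


# Assumption: PySpark should use the same interpreter that runs this script.
pyspark_env = {
    "PYSPARK_PYTHON": sys.executable,
    "PYSPARK_DRIVER_PYTHON": sys.executable,
}

with tempfile.TemporaryDirectory() as tmp:
    spec_path = write_kernel_spec(Path(tmp) / "kedro_pyspark", "Kedro (PySpark)", pyspark_env)
    written = json.loads(spec_path.read_text())
```

In a real installation the spec would live in a Jupyter kernels directory instead of a temporary one; the point is only that the variables end up in the spec, so every notebook started with that kernel sees them.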

Possible Implementation

Pass the extra click arguments through to the code at: https://github.com/kedro-org/kedro/blob/00789fa4d5f1ed8734d6e2561db4fd52c3feddc8/kedro/framework/cli/jupyter.py#L158-L169
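A rough sketch of how that could look, not Kedro's actual code: if the click command were declared with the `allow_extra_args` context setting, the unparsed arguments would be available as `ctx.args`, a plain list of strings. The hypothetical helper below, `parse_extra_kernel_args`, takes such a list and extracts two options that `ipykernel install` also accepts (`--env` and `--display-name`), rejecting anything else. It uses only `argparse`, so it runs without click or Kedro installed.

```python
import argparse


def parse_extra_kernel_args(extra_args: list) -> dict:
    """Turn leftover CLI arguments into kernel options (hypothetical helper)."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # Repeatable "--env NAME VALUE" pairs.
    parser.add_argument("--env", action="append", nargs=2, metavar=("ENV", "VALUE"))
    parser.add_argument("--display-name")
    known, unknown = parser.parse_known_args(extra_args)
    if unknown:
        # Fail loudly instead of silently dropping options.
        raise ValueError(f"Unsupported kernel options: {unknown}")
    return {
        "env": dict(known.env or []),
        "display_name": known.display_name,
    }


# Example: the list a click command might receive as ctx.args.
options = parse_extra_kernel_args(
    ["--env", "PYSPARK_PYTHON", "/usr/bin/python3", "--display-name", "Kedro (PySpark)"]
)
```

The resulting dictionary could then be handed to whatever builds the kernel spec. Restricting the accepted options to an explicit list is a design choice here; forwarding everything unchecked would be shorter but would tie the command to ipykernel's full option set.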

astrojuanlu commented 3 months ago

Hi @Lasica, thanks for opening this conversation! We are now emphasizing `%load_ext kedro.ipython` (#2777, #3619), and even though `kedro jupyter setup` is not going anywhere for now, I don't think we will be adding new features to it in the short term.

I'm converting this to a discussion for now.