kedro-org / kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
https://kedro.org
Apache License 2.0
9.88k stars 893 forks source link

Make it possible to use Kedro after uninstalling Rich. #3838

Closed lrcouto closed 4 months ago

lrcouto commented 5 months ago

Description

Allows Kedro to work regardless of the rich library being installed. At this moment, installing Kedro will still install rich, but the user will have the option to remove the library and still have everything work.

Because cookiecutter versions 2.3 and above depend on rich, it is also necessary to remove the version installed automatically by Kedro and install a compatible one to be able to run without rich.

Logging, prompts, and the kedro ipython command will work without rich given a compatible cookiecutter version is also being used. Our own unit tests will not, they will fail without rich.

Development notes

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

SajidAlamQB commented 5 months ago

Hey @lrcouto, this is an interesting approach, I think we will also need to modify kedro/framework/project/default_logging.yml, kedro/templates/project/{{ cookiecutter.repo_name }}/conf/logging.yml as they both also point to rich.

astrojuanlu commented 5 months ago

The thing that user cannot change now is all the traceback error / CLI that ties with rich. Essentially one cannot run Kedro without rich.

Don't quite understand that part, could you elaborate @noklam ?

noklam commented 5 months ago

@astrojuanlu Alright I was confused because I see no changes on these: https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/logging.py#L35 and https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/logging.py#L56-L60

Now I realise the logic is in the early return.

My question is, why do we want to be clever about the RichHandler? Alternatively, we can have another "default" setting, and if rich is not importable we simply use another default. One thing that doesn't work is if conf/logging.yml still have rich enabled.

In that case, I think it's reasonable to fail rather than being smart about it.

https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/framework/project/__init__.py#L220-L224

lrcouto commented 5 months ago

@astrojuanlu Alright I was confused because I see no changes on these:

https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/logging.py#L35

and

https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/logging.py#L56-L60

Now I realise the logic is in the early return.

My question is, why do we want to be clever about the RichHandler? Alternatively, we can have another "default" setting, and if rich is not importable we simply use another default. One thing that doesn't work is if conf/logging.yml still have rich enabled.

In that case, I think it's reasonable to fail rather than being smart about it.

https://github.com/kedro-org/kedro/blob/9da8b37802d760b2b96ca4d69ebdb7462f12370f/kedro/framework/project/__init__.py#L220-L224

I was looking into this idea of adding a non-rich default yesterday, but that would mean having to try to import rich on init.py to check if it is installed, and I was trying to avoid adding more imports. I think it's a valid idea regardless. We could have a non-rich version that gets defaulted to if rich is not present, and that would allow us to not have to mess with the logging handler. What do you think?

lrcouto commented 4 months ago

At the present moment, this PR allows creating a project with kedro new, running a pipeline with kedro run, and using kedro ipython without requiring rich to be installed.

One thing I'm still looking into is that prompts are not using rich even when it's available. I'm not sure why this is happening yet and I'm looking into it. Examples of that behavior can be seen on the project creation and telemetry consent prompts.

lrcouto commented 4 months ago

At the present moment, this PR allows creating a project with kedro new, running a pipeline with kedro run, and using kedro ipython without requiring rich to be installed.

One thing I'm still looking into is that prompts are not using rich even when it's available. I'm not sure why this is happening yet and I'm looking into it. Examples of that behavior can be seen on the project creation and telemetry consent prompts.

I've found the source of this issue - it's cookiecutter. Using the newest version of cookiecutter makes the prompt use rich when it's installed, but then project creation fails when rich is not present. Using an earlier version makes the prompts ignore rich. I'm currently seeing if there's a specific cookiecutter version that does both.

lrcouto commented 4 months ago

I've found the source of this issue - it's cookiecutter. Using the newest version of cookiecutter makes the prompt use rich when it's installed, but then project creation fails when rich is not present. Using an earlier version makes the prompts ignore rich. I'm currently seeing if there's a specific cookiecutter version that does both.

Seeing that is no version of cookiecutter that allows both the prompts to use rich when it's present and work at all when it's not, I'm wondering what should we do with this PR now. I've added some test coverage to allow the PR to pass CI, and the only functionality we'd actually miss by dropping the cookiecutter version and forgoing rich would be using yes/no responses to prompts with spaces and mixed case.

Do we want to sacrifice some aesthetic to allow users to use Kedro without rich? Or should we bring back this discussion together with decoupling Kedro from cookiecutter? Putting it up for review so we can talk about it.

astrojuanlu commented 4 months ago

Ugh, thanks for the investigation @lrcouto. We'll give this some thought.

astrojuanlu commented 4 months ago

In line with https://github.com/kedro-org/kedro/issues/2928#issuecomment-2093316785, maybe let's rename the PR to make it clear that we're not marking rich as an optional dependency yet (and therefore this doesn't yet fully address #2928)