Galileo-Galilei / kedro-mlflow

A kedro-plugin for integration of mlflow capabilities inside kedro projects (especially machine learning model versioning and packaging)
https://kedro-mlflow.readthedocs.io/
Apache License 2.0

Handling of Exceptions in MLPipeline #377

Open daniel-ressi opened 1 year ago

daniel-ressi commented 1 year ago

Description

Errors raised inside the MLPipeline are overshadowed by a NotImplementedError, which makes debugging more complex than necessary.

Context

This bug occurs only when an exception is raised in the MLPipeline.training pipeline. It is not critical, because the relevant error message is still shown further up in the output.

Steps to Reproduce

If required I can prepare a better example, but this should actually be enough to reproduce the issue.

  1. Add raise ValueError("My debug message") to any node which is part of an MLPipeline (training) using kedro > 0.11

Expected Result

I expect a ValueError to be raised with "My debug message". In addition, kedro suggests a "resume from nodes" command when a run fails, and this feature is actually the cause of the issue.

Actual Result

During handling of the above exception, another exception occurred


╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ kedro:8 in <module>                     │
│                                                                                                  │
│   5 from kedro.framework.cli import main                                                         │
│   6 if __name__ == '__main__':                                                                   │
│   7 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 8 │   sys.exit(main())                                                                         │
│   9                                                                                              │
│                                                                                                  │
│ python3.9/site-packages/kedro/framework/cli/cli. │
│ py:211 in main                                                                                   │
│                                                                                                  │
│   208 │   """                                                                                    │
│   209 │   _init_plugins()                                                                        │
│   210 │   cli_collection = KedroCLI(project_path=Path.cwd())                                     │
│ ❱ 211 │   cli_collection()                                                                       │
│   212                                                                                            │
│                                                                                                  │
│ python3.9/site-packages/click/core.py:1130 in    │
│ __call__                                                                                         │
│                                                                                                  │
│ python3.9/site-packages/kedro/framework/cli/cli. │
│ py:139 in main                                                                                   │
│                                                                                                  │
│   136 │   │   )                                                                                  │
│   137 │   │                                                                                      │
│   138 │   │   try:                                                                               │
│ ❱ 139 │   │   │   super().main(                                                                  │
│   140 │   │   │   │   args=args,                                                                 │
│   141 │   │   │   │   prog_name=prog_name,                                                       │
│   142 │   │   │   │   complete_var=complete_var,                                                 │
│                                                                                                  │
│ python3.9/site-packages/click/core.py:1055 in    │
│ main                                                                                             │
│                                                                                                  │
│ python3.9/site-packages/click/core.py:1657 in    │
│ invoke                                                                                           │
│                                                                                                  │
│ python3.9/site-packages/click/core.py:1404 in    │
│ invoke                                                                                           │
│                                                                                                  │
│ python3.9/site-packages/click/core.py:760 in     │
│ invoke                                                                                           │
│                                                                                                  │
│ python3.9/site-packages/kedro/framework/cli/proj │
│ ect.py:366 in run                                                                                │
│                                                                                                  │
│   363 │   node_names = _get_values_as_tuple(node_names) if node_names else node_names            │
│   364 │                                                                                          │
│   365 │   with KedroSession.create(env=env, extra_params=params) as session:                     │
│ ❱ 366 │   │   session.run(                                                                       │
│   367 │   │   │   tags=tag,                                                                      │
│   368 │   │   │   runner=runner(is_async=is_async),                                              │
│   369 │   │   │   node_names=node_names,                                                         │
│                                                                                                  │
│ python3.9/site-packages/kedro/framework/session/ │
│ session.py:407 in run                                                                            │
│                                                                                                  │
│   404 │   │   )                                                                                  │
│   405 │   │                                                                                      │
│   406 │   │   try:                                                                               │
│ ❱ 407 │   │   │   run_result = runner.run(                                                       │
│   408 │   │   │   │   filtered_pipeline, catalog, hook_manager, session_id                       │
│   409 │   │   │   )                                                                              │
│   410 │   │   │   self._run_called = True                                                        │
│                                                                                                  │
│ python3.9/site-packages/kedro/runner/runner.py:8 │
│ 8 in run                                                                                         │
│                                                                                                  │
│    85 │   │   │   self._logger.info(                                                             │
│    86 │   │   │   │   "Asynchronous mode is enabled for loading and saving data"                 │
│    87 │   │   │   )                                                                              │
│ ❱  88 │   │   self._run(pipeline, catalog, hook_manager, session_id)                             │
│    89 │   │                                                                                      │
│    90 │   │   self._logger.info("Pipeline execution completed successfully.")                    │
│    91                                                                                            │
│                                                                                                  │
│ python3.9/site-packages/kedro/runner/sequential_ │
│ runner.py:73 in _run                                                                             │
│                                                                                                  │
│   70 │   │   │   │   run_node(node, catalog, hook_manager, self._is_async, session_id)           │
│   71 │   │   │   │   done_nodes.add(node)                                                        │
│   72 │   │   │   except Exception:                                                               │
│ ❱ 73 │   │   │   │   self._suggest_resume_scenario(pipeline, done_nodes, catalog)                │
│   74 │   │   │   │   raise                                                                       │
│   75 │   │   │                                                                                   │
│   76 │   │   │   # decrement load counts and release any data sets we've finished with           │
│                                                                                                  │
│ python3.9/site-packages/kedro/runner/runner.py:1 │
│ 86 in _suggest_resume_scenario                                                                   │
│                                                                                                  │
│   183 │   │   postfix = ""                                                                       │
│   184 │   │   if done_nodes:                                                                     │
│   185 │   │   │   node_names = (n.name for n in remaining_nodes)                                 │
│ ❱ 186 │   │   │   resume_p = pipeline.only_nodes(*node_names)                                    │
│   187 │   │   │   start_p = resume_p.only_nodes_with_inputs(*resume_p.inputs())                  │
│   188 │   │   │                                                                                  │
│   189 │   │   │   # find the nearest persistent ancestors of the nodes in start_p                │
│                                                                                                  │
│ python3.9/site-packages/kedro_mlflow/pipeline/pi │
│ peline_ml.py:173 in only_nodes                                                                   │
│                                                                                                  │
│   170 │   │   )                                                                                  │
│   171 │                                                                                          │
│   172 │   def only_nodes(self, *node_names: str) -> "Pipeline":  # pragma: no cover              │
│ ❱ 173 │   │   raise NotImplementedError(MSG_NOT_IMPLEMENTED)                                     │
│   174 │                                                                                          │
│   175 │   def only_nodes_with_namespace(                                                         │
│   176 │   │   self, node_namespace: str                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
NotImplementedError: This method is not implemented because it does not make sense for 'PipelineML'. Manipulate directly the training pipeline and recreate the 'PipelineML' with 'pipeline_ml_factory' factory.
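The masking seen in this traceback can be reproduced without kedro. The sketch below is only an illustration under assumptions: `StubPipelineML`, `suggest_resume_scenario`, `failing_node` and `run` are hypothetical stand-ins, not the real kedro or kedro-mlflow code. It shows that when a helper called inside an `except` block raises, the bare `raise` is never reached, so the new exception is the one that surfaces and the original is only kept as its `__context__`.

```python
class StubPipelineML:
    """Hypothetical stand-in for PipelineML: filtering is refused."""

    def only_nodes(self, *node_names):
        raise NotImplementedError("not implemented for 'PipelineML'")


def suggest_resume_scenario(pipeline, remaining):
    # Stand-in for the runner helper: it filters the pipeline to build a hint.
    pipeline.only_nodes(*remaining)


def failing_node():
    raise ValueError("My debug message")


def run(pipeline):
    try:
        failing_node()
    except Exception:
        # The helper raises here, so the bare `raise` below is never reached
        # and the NotImplementedError is what the caller sees.
        suggest_resume_scenario(pipeline, ["some_node"])
        raise


def reproduce():
    try:
        run(StubPipelineML())
    except Exception as err:
        return err


err = reproduce()
print(type(err).__name__)              # NotImplementedError
print(type(err.__context__).__name__)  # ValueError
```

The original ValueError is not lost: it is attached as `__context__`, which is why the traceback starts with "During handling of the above exception, another exception occurred".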

Your Environment

Include as many relevant details about the environment in which you experienced the bug:

Does the bug also happen with the last version on master?

Yes, tried it out

daniel-ressi commented 1 year ago

Thank you already for your support. My suggestion would be to just call only_nodes on the training pipeline of the MLPipeline.

Galileo-Galilei commented 1 year ago

Hi @daniel-ressi, you're not the first one to notice this behaviour. Unfortunately, kedro filters your pipeline to suggest a resume scenario, and this breaks the PipelineML object. Raising here is the intended behaviour: you should not use the suggested command, because it will not work with PipelineML, which assumes you are running the entire pipeline and not just part of it.

However, given how annoying this stacktrace is, I am considering changing the behaviour and only issuing a warning. The risk is that some people will run their entire pipeline before noticing that the PipelineML object does not work as intended.
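A rough sketch of what "only issuing a warning" could look like, using hypothetical stand-in classes (`StubPipeline`, `StubPipelineML`) rather than the real kedro-mlflow implementation. The warning text and the choice to return a plain pipeline are assumptions made for illustration only.

```python
import warnings


class StubPipeline:
    """Hypothetical stand-in for a plain pipeline: an ordered list of node names."""

    def __init__(self, node_names):
        self.node_names = list(node_names)

    def only_nodes(self, *node_names):
        # Keep only the requested nodes, preserving the original order.
        return StubPipeline(n for n in self.node_names if n in node_names)


class StubPipelineML(StubPipeline):
    """Hypothetical stand-in for PipelineML that warns instead of raising."""

    def only_nodes(self, *node_names):
        warnings.warn(
            "Filtering a 'PipelineML' returns a plain pipeline; "
            "the model will not be logged as intended.",
            UserWarning,
            stacklevel=2,
        )
        # Delegate to the plain-pipeline behaviour on the training nodes.
        return StubPipeline(self.node_names).only_nodes(*node_names)


filtered = StubPipelineML(["split", "train", "evaluate"]).only_nodes("train", "evaluate")
print(filtered.node_names)  # ['train', 'evaluate']
```

The trade-off is the one described above: the original error stays visible, but a user who ignores the warning ends up running a plain pipeline instead of a PipelineML.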

I will try to find a way to avoid hiding the entire stacktrace, but I have no straightforward solution for now, sorry.

daniel-ressi commented 1 year ago

Thanks for your swift response. Is the issue that kedro's resume scenario would amount to running only the training pipeline and not the PipelineML? I would upvote a solution that just warns the user about these implications.

I guess ideally it would be possible to disable the resume scenario suggestion for a PipelineML run, but this does not seem possible, as it is not called through a hook but from within the Runner.
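One conceivable workaround on the runner side, sketched with stand-ins only: `StubRunner` imitates the try/except loop visible in the traceback, and `QuietRunner` overrides the suggestion helper so that a failure inside it cannot mask the node's own error. The method name `_suggest_resume_scenario` is taken from the traceback; whether overriding this private method in a real kedro runner is supported or stable across versions is an assumption.

```python
import logging

logger = logging.getLogger(__name__)


class StubPipelineML:
    """Hypothetical stand-in for PipelineML: filtering is refused."""

    def only_nodes(self, *node_names):
        raise NotImplementedError("not implemented for 'PipelineML'")


class StubRunner:
    """Hypothetical stand-in imitating the runner loop from the traceback."""

    def _suggest_resume_scenario(self, pipeline, done_nodes):
        pipeline.only_nodes(*done_nodes)

    def run(self, pipeline, nodes):
        done_nodes = []
        for name, func in nodes:
            try:
                func()
                done_nodes.append(name)
            except Exception:
                self._suggest_resume_scenario(pipeline, done_nodes)
                raise


class QuietRunner(StubRunner):
    """Skips the resume hint when the pipeline cannot be filtered."""

    def _suggest_resume_scenario(self, pipeline, done_nodes):
        try:
            super()._suggest_resume_scenario(pipeline, done_nodes)
        except NotImplementedError:
            # Swallow the helper's failure so the bare `raise` in run() is reached.
            logger.warning("No resume suggestion available for this pipeline.")


def failing_node():
    raise ValueError("My debug message")


def run_quietly():
    try:
        QuietRunner().run(StubPipelineML(), [("ok", lambda: None), ("bad", failing_node)])
    except Exception as err:
        return err


err = run_quietly()
print(type(err).__name__, err)  # ValueError My debug message
```

With this arrangement the ValueError from the node is the exception the user sees, and the missing resume hint is reduced to a log message.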

Either way, great work @Galileo-Galilei!