getindata / kedro-kubeflow

Kedro Plugin to support running workflows on Kubeflow Pipelines
https://kedro-kubeflow.readthedocs.io
Apache License 2.0

Prevent hooks from firing when using the plugin CLI #220

Open DmitriyLamzin opened 1 year ago

DmitriyLamzin commented 1 year ago

I believe the plugin should prevent pipeline hooks from firing while its CLI is in use, because firing them causes unpredictable issues.

Example with the MLflow plugin:

We use the kedro-mlflow plugin. The project has two envs: local and remote. The remote env contains the catalog with the proper paths to the data in cloud storage, the config for MLflow in the cloud, and the Kubeflow config. The developer/CI-CD does not have access from the local machine to the remote MLflow API. When the developer/CI-CD tries to compile a pipeline or upload it using --env remote, the following happens:

The instance of ContextHelper is created:

  1. Kedro Session is created
  2. Kedro context is initialized
  3. after_context_created hook is triggered
  4. The catalog instance is retrieved
  5. after_catalog_created hook is triggered
  6. The MLflow plugin has an after_catalog_created hook, which tries to access the MLflow API
  7. The developer/CI-CD does not have access to the MLflow API in the remote env
  8. Compiling/uploading fails
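
The chain above can be reproduced in a dependency-free sketch. Everything here is a hypothetical stand-in: ToyHookManager, ToyTrackingHook, build_context and compile_pipeline are illustration names, not Kedro, kedro-mlflow or kedro-kubeflow APIs. Only the hook names after_context_created and after_catalog_created come from the steps above. The sketch assumes the tracking hook fails with a ConnectionError when the remote API is unreachable.

```python
class ToyHookManager:
    """Hypothetical stand-in for a hook dispatcher; not Kedro's real API."""

    def __init__(self):
        self._hooks = []

    def register(self, hook):
        self._hooks.append(hook)

    def call(self, event_name):
        # Invoke `event_name` on every registered hook that implements it.
        for hook in self._hooks:
            handler = getattr(hook, event_name, None)
            if handler is not None:
                handler()


class ToyTrackingHook:
    """Stands in for a tracking plugin that contacts a remote API."""

    def after_catalog_created(self):
        # Assumption: the remote tracking API is unreachable from this machine.
        raise ConnectionError("tracking API is not reachable from this machine")


def build_context(hook_manager):
    """Mimics steps 1-5: create the context, then the catalog, firing hooks."""
    hook_manager.call("after_context_created")
    catalog = {"params:alpha": 0.1}  # placeholder catalog content
    hook_manager.call("after_catalog_created")
    return catalog


def compile_pipeline(hook_manager):
    """Compilation only needs the catalog, yet it fails if a hook raises."""
    try:
        catalog = build_context(hook_manager)
    except ConnectionError as exc:
        return f"compile failed: {exc}"
    return f"compiled with {len(catalog)} catalog entries"


manager = ToyHookManager()
manager.register(ToyTrackingHook())
result = compile_pipeline(manager)
print(result)
```

With no hooks registered, the same compile_pipeline call succeeds, which is the behavior this issue asks for when only compiling or uploading.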

This behavior is neither needed nor expected. To build the Kubeflow pipeline, the Kubeflow plugin only needs the catalog, the pipeline parameters, and the proper KFP configs.

DmitriyLamzin commented 1 year ago

If this proposal looks fine, I'd like to assign this to myself.

marrrcin commented 1 year ago

@szczeles / @em-pe what do you think? We've actually been relying on the fact that the hooks are invoked in some projects. Maybe you, @DmitriyLamzin, could add this as a flag in the CLI on the plugin level, like: kedro kubeflow --disable-hooks compile etc.?
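
A rough sketch of how such an opt-in, plugin-level flag could behave, so that existing projects keep their hooks by default. It uses argparse only to stay dependency-free; the plugin's actual CLI framework and wiring may differ. The --disable-hooks flag is the proposal from this thread, not an existing option, and build_parser, run and sample_hook are hypothetical names.

```python
import argparse


def build_parser():
    """Group-level parser: the flag comes before the subcommand."""
    parser = argparse.ArgumentParser(prog="kedro kubeflow")
    parser.add_argument(
        "--disable-hooks",
        action="store_true",
        help="Proposed flag: skip project hooks while running plugin commands.",
    )
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("compile")
    return parser


def run(argv, hooks):
    """Parse argv and fire the hooks unless the flag disables them."""
    args = build_parser().parse_args(argv)
    fired = []
    if not args.disable_hooks:
        for hook in hooks:
            hook()
            fired.append(hook.__name__)
    return args.command, fired


def sample_hook():
    """Placeholder for a project hook such as after_catalog_created."""


# Default behavior is unchanged: hooks still fire.
print(run(["compile"], [sample_hook]))
# With the proposed flag, hooks are skipped.
print(run(["--disable-hooks", "compile"], [sample_hook]))
```

Keeping the flag off by default preserves the behavior that some projects rely on, while letting a developer or CI job without access to the remote tracking API opt out.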