os-climate / aicoe-osc-demo

This repository is the central location for the demos the ET data science team is developing within the OS-Climate project. This demo shows how to use the tools provided by Open Data Hub (ODH) running on the Operate First cluster to perform ETL and to create training and inference pipelines.
Apache License 2.0

Couldn't add kubeflow pipeline runtime configuration #228

Closed Jeevi281 closed 1 year ago

Jeevi281 commented 1 year ago

Describe the bug

I am not able to add a Kubeflow runtime configuration to run the pipeline.

To Reproduce

  1. Create the pipeline like this:

Screenshot (209)

  2. Go to 'Runtimes' in the left sidebar
  3. Click the '+' icon in the upper right corner
  4. Select 'New Kubeflow Pipeline runtime configuration'
  5. After entering all the details for the runtime, click Save & Close

Screenshot (211)

The runtime is still not listed here: Screenshot (213)

Expected behavior

Once the runtime is added, I should be able to run the pipeline.


Jeevi281 commented 1 year ago

@Shreyanand

schwesig commented 1 year ago

see also YouTube comment: https://www.youtube.com/watch?v=lGeT615YNlM&lc=UgxmpMwWqw-345wvcRx4AaABAg

Shreyanand commented 1 year ago

Thanks for adding the details @Jeevi281. It seems that the cloud object storage and password are not added in the configuration. In the docs there is an example. Do you have credentials for any s3 bucket that you could use? If not, you could create an issue here to get the credentials for the os-climate bucket.
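A quick way to catch this class of mistake is to check that every field the runtime configuration needs is actually filled in before saving. Here is a minimal Python sketch (the field names follow Elyra's Kubeflow runtime schema; the endpoint values are hypothetical placeholders, not the real cluster addresses):

```python
# Required fields for an Elyra Kubeflow Pipelines runtime configuration.
# The cloud object storage (cos_*) entries are the ones missing here.
REQUIRED_FIELDS = [
    "api_endpoint",   # Kubeflow Pipelines API endpoint
    "cos_endpoint",   # S3-compatible object storage URL
    "cos_username",   # S3 access key
    "cos_password",   # S3 secret key
    "cos_bucket",     # bucket for pipeline artifacts
]

def missing_fields(config: dict) -> list:
    """Return the names of required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not config.get(f)]

# Hypothetical configuration with the object storage credentials
# left blank, as in the screenshots above:
config = {
    "api_endpoint": "http://ml-pipeline-ui.kubeflow.svc.cluster.local",
    "cos_endpoint": "http://s3.example.com",
    "cos_username": "",
    "cos_password": "",
    "cos_bucket": "demo-bucket",
}

print(missing_fields(config))  # ['cos_username', 'cos_password']
```

If the list is non-empty, the runtime cannot upload pipeline artifacts and the save will not produce a working configuration.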

Jeevi281 commented 1 year ago

Thanks @Shreyanand. I added the configuration as you suggested and the runtime is added now. However, I am not able to see the status in the run details.

To Reproduce

  1. After clicking 'Run pipeline' with the created Kubeflow pipeline configuration, the dialog box below pops up.

Screenshot (225)

  2. Click 'Run Details'
  3. The screen below appears. Screenshot (223)

Expected behavior

To be able to see the run details.

MichaelTiemannOSC commented 1 year ago

What's happening here is that *.cluster.local is entirely within the Kubeflow world--there's no externally routable endpoint. There are various services that can peer into the Kubeflow world and report things out, especially if you have admin privileges. But generally speaking, the environments of the images you create and run are not publicly visible.

There is a whole paradigm around how these pipelines operate which is really quite different than the conventional Unix pipeline (e.g. tool1 | tool2 | tool3 all operating in a consistent, persistent environment).
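To make the contrast concrete, here is a toy Python sketch (not the actual KFP mechanics): each step behaves like an isolated container that shares nothing with the others except an external object store, here simulated by a plain dict standing in for an S3 bucket:

```python
# A dict standing in for the S3 bucket the pipeline uses as its
# only shared medium between steps.
object_store = {}

def step_extract():
    # Runs in its own container; local variables vanish afterwards.
    data = [3, 1, 2]
    object_store["extract/output.json"] = data  # persist to "S3"

def step_transform():
    # A fresh environment: nothing from step_extract survives except
    # what was written to the object store.
    data = object_store["extract/output.json"]
    object_store["transform/output.json"] = sorted(data)

def step_load():
    return object_store["transform/output.json"]

# Unlike `tool1 | tool2 | tool3`, there is no shared, persistent
# process environment; the orchestrator just runs the steps in
# dependency order.
step_extract()
step_transform()
print(step_load())  # [1, 2, 3]
```

This is also why a run's internals are invisible from outside: the only durable, inspectable artifacts are what each step deliberately writes to the store.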

Separately, and especially to any AI-CoE team members reading this: #115 . Please, let's fix our templates, documentation, etc., so that we don't wind up with literally thousands of junk execution runs in the top-level of the bucket intended for landing data for all to use.

Shreyanand commented 1 year ago

The UI dashboard for Kubeflow is available at this location. You can track your runs there. The *.cluster.local endpoint used in the configuration is internal; the Elyra UI uses it to generate the alert box, so the link leads to a 'page not found' error.
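One way to spot this kind of dead link is to check whether a URL points at a Kubernetes-internal service DNS name, which is only resolvable inside the cluster. A small sketch (the hostnames below are illustrative, not the real cluster addresses):

```python
from urllib.parse import urlparse

def is_cluster_internal(url: str) -> bool:
    """True if the URL uses a Kubernetes-internal service DNS name,
    which is not reachable from outside the cluster."""
    host = urlparse(url).hostname or ""
    return host.endswith(".cluster.local")

# The endpoint from the runtime configuration: internal only.
print(is_cluster_internal("http://ml-pipeline-ui.kubeflow.svc.cluster.local"))  # True

# A hypothetical externally routed dashboard URL: reachable from a browser.
print(is_cluster_internal("https://kubeflow.apps.example.com"))  # False
```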

Jeevi281 commented 1 year ago

Resolved! Thanks @Shreyanand @MichaelTiemannOSC