kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.61k stars 1.63k forks

[feature] Show task and pipeline status and events in web interface for v2 pipelines #10634

Open geier opened 7 months ago

geier commented 7 months ago

Feature Area

/area frontend

What feature would you like to see?

Show the status and events of the pod for each task and the pipeline itself.

What is the use case or pain point?

In KFP v2, the web interface does not show when a pipeline component fails to start, e.g. because of an error pulling an image. This is hard to debug for users who are not very familiar with Kubernetes: they see a "running" pod in the user interface but do not realize that there is an error that needs to be taken care of.

Example

Example pipeline that errors out (the alpinexxxxx image does not exist):


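from kfp import dsl
from kfp.client import Client

# Container component that references an image which does not exist.
@dsl.container_component
def say_hello():
    return dsl.ContainerSpec(image='alpinexxxxx', command=['echo'], args=['Hello'])

@dsl.pipeline
def hello_pipeline():
    say_hello()

# Submit the pipeline as a run; the task's pod cannot start because the image pull fails.
client = Client()
client.create_run_from_pipeline_func(hello_pipeline)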

The user interface suggests the task is running:

image

The Details view does not show a helpful message either:

image

k9s shows what the issue is:

image

Is there a workaround currently?

Use a tool such as Lens or k9s to inspect the status and events of your pods. This is not ideal for users who are more focused on the data science side.
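
For anyone who wants to script this check instead of browsing pods by hand, here is a minimal sketch of the idea. It assumes you already have the pod object as JSON (for example from `kubectl get pod <pod-name> -n <namespace> -o json`) and only reads the standard `status.containerStatuses[].state.waiting` fields. The helper name `find_stuck_containers`, the `STUCK_REASONS` set, and the sample pod (including the container name `main`) are hypothetical and for illustration only; this is not how the KFP UI or backend works.

```python
from typing import Any, Dict, List

# Waiting reasons that usually mean the container will not start on its own.
# This set is an assumption for illustration, not an exhaustive list.
STUCK_REASONS = {
    "ErrImagePull",
    "ImagePullBackOff",
    "InvalidImageName",
    "CreateContainerConfigError",
    "CrashLoopBackOff",
}


def find_stuck_containers(pod: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return name, reason and message for containers stuck in a waiting state."""
    status = pod.get("status") or {}
    container_statuses = (status.get("initContainerStatuses") or []) + (
        status.get("containerStatuses") or []
    )
    stuck = []
    for cs in container_statuses:
        waiting = (cs.get("state") or {}).get("waiting")
        if waiting and waiting.get("reason") in STUCK_REASONS:
            stuck.append(
                {
                    "container": cs.get("name", ""),
                    "reason": waiting["reason"],
                    "message": waiting.get("message", ""),
                }
            )
    return stuck


# Hypothetical, heavily trimmed pod object resembling the failing example above.
sample_pod = {
    "status": {
        "phase": "Pending",
        "containerStatuses": [
            {
                "name": "main",
                "state": {
                    "waiting": {
                        "reason": "ImagePullBackOff",
                        "message": 'Back-off pulling image "alpinexxxxx"',
                    }
                },
            }
        ],
    }
}

for problem in find_stuck_containers(sample_pod):
    print(f"{problem['container']}: {problem['reason']} ({problem['message']})")
```

Something along these lines could surface the waiting reason and message next to the task status, which is the information k9s shows in the screenshot above.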


Love this idea? Give it a 👍.

github-actions[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 4 months ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

HumairAK commented 2 months ago

This is still very much relevant today. Requiring the user to go into k8s and manually inspect pods is not good at all.

/reopen

google-oss-prow[bot] commented 2 months ago

@HumairAK: Reopened this issue.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

HumairAK commented 2 weeks ago

/remove-lifecycle stale