kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

The examples for nvidia-resnet cannot be built using existing scripts. #10815

Open a1856315445 opened 4 months ago

a1856315445 commented 4 months ago

Feature Area

/area backend /area sdk

The examples for nvidia-resnet cannot be built using existing scripts.

What feature would you like to see?

Update existing nvidia-resnet or build a new version

What is the use case or pain point?

This example performs image classification with ResNet; however, it can no longer be built because:

  1. This example is built against the latest TensorRT release, tagged v2.45.0, but it only works with v2.6.0 because the code structure changed after v2.6.0. (https://github.com/triton-inference-server/server/tree/v2.6.0)
  2. This example is still using TensorFlow-19.03 containers. https://github.com/kubeflow/pipelines/blob/master/samples/contrib/nvidia-resnet/components/train/Dockerfile
  3. webapp_launcher, built on top of Ubuntu16.04, has also not been updated. (https://github.com/kubeflow/pipelines/blob/master/samples/contrib/nvidia-resnet/components/webapp_launcher/Dockerfile)
  4. Kubeflow Pipelines is about to move to v2, while this example is still on v1. (https://github.com/kubeflow/pipelines/blob/master/samples/contrib/nvidia-resnet/pipeline/Dockerfile)

Due to the aforementioned issues, users (like myself) are likely to run into difficulties when building this example.
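
For anyone who wants to check items 2 and 3 on their own checkout, here is a minimal sketch that lists the base images named in a Dockerfile and flags tags you consider stale. It uses only the Python standard library. The helper names `base_images` and `flag_outdated`, the sample Dockerfile text, and the list of stale tag prefixes are all illustrative assumptions and are not copied from the repository. Digest references (`image@sha256:...`) are not handled.

```python
import re

# Matches "FROM [--flag=value ...] image[:tag]" at the start of a line.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--\S+\s+)*(\S+)", re.IGNORECASE | re.MULTILINE)


def base_images(dockerfile_text):
    """Return (image, tag) pairs for every FROM line; tag is None if absent."""
    pairs = []
    for ref in FROM_RE.findall(dockerfile_text):
        # Treat the text after the last ':' as a tag only if it contains no '/',
        # so a registry port such as localhost:5000/img is not read as a tag.
        name, sep, tag = ref.rpartition(":")
        if sep and "/" not in tag:
            pairs.append((name, tag))
        else:
            pairs.append((ref, None))
    return pairs


def flag_outdated(pairs, stale_prefixes):
    """Keep the pairs whose tag starts with one of the given stale prefixes."""
    return [
        (image, tag)
        for image, tag in pairs
        if tag is not None and any(tag.startswith(p) for p in stale_prefixes)
    ]


# Hypothetical Dockerfile content, for illustration only.
SAMPLE = """\
FROM nvcr.io/nvidia/tensorflow:19.03-py3
RUN pip install something
FROM ubuntu:16.04 AS webapp
"""

if __name__ == "__main__":
    for image, tag in flag_outdated(base_images(SAMPLE), ["19.03", "16.04"]):
        print(f"outdated base image: {image}:{tag}")
```

To run it against the real files, read each component's Dockerfile with `open(path).read()` and pass the text to `base_images`.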

Is there a workaround currently?

Due to the aforementioned issues, it is hard to rebuild this example.

The proposed change list

  1. Update the three items mentioned above: the TensorRT SDK, the TensorFlow container, and webapp_launcher.
  2. Replace kfp v1 with v2.

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 month ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

a1856315445 commented 1 week ago

/reopen

google-oss-prow[bot] commented 1 week ago

@a1856315445: Reopened this issue.
