🖥🖥🖥🖥CNCF Community Cluster
https://cncf.io/cluster

Kubeflow Testing Infrastructure #202

Closed: charlesa101 closed this issue 1 year ago

charlesa101 commented 2 years ago

Please fill out the details below to file a request for access to the CNCF Community Infrastructure Lab. Please note that access is targeted to people working on specific open source projects; this is not designed just to get your feet wet. The most important answer is the URL of the project you'll be working with. If you're looking to learn Kubernetes and related technologies, please try out Katacoda.

First and Last Name

Charles Adetiloye

Email

charles@mavencode.com

Company/Organization

MavenCode.com

Job Title

MLOps Engineer

Project Title (i.e., a summary of what you want to do, not the name of the open source project you're working with)

Kubeflow Metal Infra

Briefly describe the project (i.e., what is the detail of what you're planning to do with these servers?)

Kubeflow Metal is an open-source, Kubernetes-native machine learning platform. It is designed to enable end-to-end machine learning workflows: for example, data processing with Python, model training with TensorFlow, and model serving with KServe. The goal of this project is to run daily E2E tests of Kubeflow releases on Equinix Metal in order to test the on-prem deployment scripts for Kubeflow.

Is the code that you’re going to run 100% open source? If so, what is the URL or URLs where it is located? What is your association with that project?

Yes, the code is 100% open source. I am one of the project maintainers/tech leads and am applying on behalf of the Kubeflow on-prem SIG. https://github.com/kubeflow/ https://github.com/kubeflow/testing

What kind of machines and how many do you expect to use (see: https://metal.equinix.com/product/servers/)?

2 worker nodes of kind c3.small.x86 if using a shared CI/CD environment, or 1 master and 3 workers of kind c3.small.x86 if using a dedicated Kubernetes cluster.

What operating system and networking are you planning to use?

Ubuntu 18.04

Any other relevant details we should know about?

As discussed with @vielmetti, in phase 2 we can add support for testing on GPUs.

caniszczyk commented 2 years ago

I'm OK with this but since this is a Google project why isn't GCP fronting infra?

charlesa101 commented 2 years ago

This is not a Google project, @caniszczyk.

caniszczyk commented 2 years ago

if you have to sign a Google CLA it's a Google project imho! https://github.com/kubeflow/kubeflow/pull/6406/checks?check_run_id=5627031393

We can help you in the short term for sure

-- Cheers,

Chris Aniszczyk https://aniszczyk.org

vielmetti commented 2 years ago

If I understand the request correctly, the ask is to add @charlesa101 to the existing project 7772b4fd-926c-48c6-ac50-36461b711b9a (name "Kubeflow Testing Infra") that was previously approved in #152, in a request from @mameshini. Charles has been working with that team.

jeefy commented 2 years ago

For the short-term, invited charles@mavencode.com to the existing project @vielmetti mentioned above so that we aren't blocking anything.

vielmetti commented 2 years ago

A presentation on "KF-Metal" will be given at OSS NA 2022 in Austin in June.

jeefy commented 2 years ago

With Charles added to the existing Kubeflow testing project, is this considered complete?

vielmetti commented 1 year ago

Yes, this looks complete. (No data center migration is needed; this project is already entirely in IBX data centers.) Closing as done for now.