cncf / cluster

🖥🖥🖥🖥CNCF Community Cluster
https://cncf.io/cluster

Server resource request for running Metal3.io CI #266

Open kashifest opened 6 months ago

kashifest commented 6 months ago

First and Last Name

Kashif Khan

Email

kashif.khan@est.tech

Company/Organization

Ericsson Software Technology

Job Title

Product Owner

Project Title (i.e., a summary of what do you want to do, not what is the name of the open source project you're working with)

Project Name: Metal3.io (https://metal3.io/ , https://github.com/metal3-io)

Plans to utilize the hardware for:

Briefly describe the project (i.e., what are the details of what you're planning to do with these servers?)

Is the code that you're going to run 100% open source? If so, what is the URL or URLs where it is located? What is your association with that project?

Yes, the code is 100% open source. Here again is the link to the project's GitHub organization: https://github.com/metal3-io. I am one of the maintainers. Here is the list of the other maintainers: https://github.com/metal3-io/community/blob/main/maintainers/ALL-OWNERS

What kind of machines and how many do you expect to use (see: https://deploy.equinix.com/product/bare-metal/servers/)?

6 servers of the type m3.small.x86

What operating system and networking are you planning to use?

CentOS and Ubuntu

Any other relevant details we should know about?

It would be good to grant access to all the maintainers listed here: https://github.com/metal3-io/community/blob/main/maintainers/ALL-OWNERS

idvoretskyi commented 6 months ago

@kashifest before we proceed, could you use GitHub Actions for CI instead (as the CIL resources can be limited)? See https://github.com/cncf/cluster?tab=readme-ov-file#usage-guidelines

kashifest commented 6 months ago

> @kashifest before we'll proceed, can't you use GitHub Actions for the CI purposes instead (as the CIL resources can be limited) - https://github.com/cncf/cluster?tab=readme-ov-file#usage-guidelines

@idvoretskyi we usually run GitHub workflows for smaller jobs such as linters. Our main CI is resource intensive, so dedicated hardware would be needed.

P.S.: On the page you linked, when I click this link I don't see any example, although it says "see example below". Any idea where I can get more info on this?

idvoretskyi commented 6 months ago

> P.S: in the link provided, when I click this link I dont see any example as it says see example below. Any idea where can I get more info on this?

@jeefy may be able to assist here. Also, an ARC cluster has been spun up that could be used for similar workloads.

kashifest commented 6 months ago

@jeefy @idvoretskyi in the meantime we are testing GitHub Actions with large runners as you suggested. Unfortunately, the Equinix ones are offline and not usable at the moment.

We used the default large runners for one of our e2e tests, and the usage shows some billable hours. Any idea what that means for us? Does the bill go to CNCF, since our GitHub org is part of the CNCF enterprise, so that we don't need to worry about it?

We are still unsure how much of a performance issue this will be once we start putting the e2e PR jobs on these runners, because getting an available runner appears to be random. For our other repos the e2e tests have larger resource requirements, so for those we would need some dedicated resources anyway.

(Attachment: Screenshot 2024-05-15 at 13 48 47)

kashifest commented 6 months ago

@jeefy @idvoretskyi any updates?

idvoretskyi commented 5 months ago

@kashifest yep, apologies for the delay. Working on this internally.

jeefy commented 5 months ago

re: earlier comment, no worries about the billable hours with metal3's Org being under the CNCF GH Enterprise license.

The Equinix and Oracle runners should be available for use if you wouldn't mind trying those again. :) If you run into issues or need a different shape/size please let us know.

If those don't work we can set you up in Equinix, we'll just need you to only use resources when needed, so spinning them up/down automatically.

kashifest commented 5 months ago

@jeefy Thanks a lot for replying.

> The Equinix and Oracle runners should be available for use if you wouldn't mind trying those again. :) If you run into issues or need a different shape/size please let us know.

I don't think it's working, or I might have configured something wrong. The job has been waiting for 40 minutes or so for a runner to become available. See the attachment: Screenshot 2024-06-07 at 10 53 40

Here is the PR: https://github.com/metal3-io/baremetal-operator/pull/1775

> If those don't work we can set you up in Equinix, we'll just need you to only use resources when needed, so spinning them up/down automatically.

I believe this would be needed anyway for the larger e2e jobs in our org.