cnti-testcatalog / testsuite

📞📱☎️📡🌐 Cloud Native Telecom Initiative (CNTI) Test Catalog is a tool to check for and provide feedback on the use of K8s + cloud native best practices in networking applications and platforms
https://wiki.lfnetworking.org/display/LN/Test+Catalog
Apache License 2.0

Urgent: Move off CNCF Equinix Resources #2013

Open lixuna opened 1 month ago

lixuna commented 1 month ago

Describe the software update: Need to move cnf-testsuite off the CNCF Equinix donated resources.

Describe why update is needed: Equinix has changed its policy so that only CNCF projects are eligible to use the donated resources.

Additional context

Tasks

taylor commented 1 month ago

@rannyh LFN and/or community-donated resources will need to be allocated for the CNTi project's needs. This includes the CI pipeline for GitHub Actions.

Some ideas were put forward in the fall of 2023 and the earlier part of 2024, including CircleCI as a cost-effective option. Some current LFN resources have also been mentioned.

If something can be identified soon, even for temporary use, then the current CI can be moved over with limited interruption. Otherwise the CI will be shut down until something is found, which will slow down integrated contributions (e.g., pull requests specifically will face a big obstacle).

cc: @martin-mat @Smitholi67 @wavell

lixuna commented 1 month ago

Next steps:

lixuna commented 1 month ago

Draft for hardware requirements started in https://docs.google.com/document/d/1GtOQCCMIusdKkea64m5XBS_0vmiLGd7sgyOsbkU0IiE/edit

taylor commented 1 month ago

Suggested resource requirements for the CNTi project CI runners, based on Equinix machine types:

The Equinix c3.small.x86 is a legacy on-demand server with the following hardware configuration:

taylor commented 1 month ago

CircleCI will probably be the most cost-effective hosted solution.

lixuna commented 1 month ago

To be discussed/reviewed:

  1. LF IT came back with: GitHub increased the base size of their free runners early this year. Is it possible the increases will allow use of the free GitHub runners?

  2. LFN has a “Lab as a Service” (LaaS) program that is run out of the University of New Hampshire. As an LFN initiative, CNTi runners can potentially be hosted under the LFN LaaS program. This is from the LaaS lead at UNH: The specs of the servers in LaaS are a bit larger than the ones listed on GitHub issues, except for the processor generation, where the LaaS servers have Intel Broadwell generation Xeon CPUs. That would only matter if the testing actually requires instruction sets that were available in that generation (i.e. AVX512).
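One way to settle the instruction-set question is to inspect the CPU flags on a candidate runner. The following is a minimal sketch, assuming a Linux x86 host where `/proc/cpuinfo` lists feature flags on a `flags` line; the helper names `cpu_flags` and `has_avx512` are hypothetical and not part of the test suite.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; all cores normally report the same set.
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text):
    """True if the AVX-512 Foundation flag (avx512f) is listed."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX-512 available:", has_avx512(f.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo not found; this check is Linux-only")
```

Running this on both the current runners and a LaaS server would show whether the two differ in any flag the tests might depend on.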

lixuna commented 1 month ago

@taylor @wavell @denverwilliams @agentpoyo @HashNuke @martin-mat please review the options above

taylor commented 1 month ago
  1. Free GitHub runners: They can only be used if test coverage is disabled/removed, including some telecom + networking-focused tests. This will lead to reduced velocity as a result of bugs and other issues as the project moves forward.

  2. “Lab as a Service” (LaaS) program from the University of New Hampshire: This should work for the CI system and may allow increasing velocity with more available resources.

  3. CircleCI ($200-$300/month): Another recommended alternative covering velocity requirements.

@lilluzzi when will the LaaS systems be available so we can start using them and migrating?

lixuna commented 1 month ago

Next Step:

taylor commented 1 month ago

Currently testing UNH systems. Not sure yet which type should be used or how many.

taylor commented 1 month ago

@lilluzzi @rannyh FYI, currently testing UNH systems.

Earlier today the CNCF lab Equinix systems used for this project were stopped. They have been turned back on to allow the CI runners to work and builds to continue. I expect they will be shut down sooner rather than later.

taylor commented 1 month ago

@denverwilliams the following systems are ready for further testing

lixuna commented 1 month ago

Mon, May 20 update: Denver is not able to access the systems above via VPN as expected. Debugging with UNH support in progress.

taylor commented 1 month ago

There have been continued issues with running a docker + kind setup directly on the system. As a result, we have switched to running the docker + kind runner setup inside a VM, which will allow expanded usage of the larger resources on the UNH system.

The Vagrant setup for the GitHub Actions runners is in progress and expected to work within the next 24 hours. After this, automated spec testing should speed up.

lixuna commented 4 weeks ago

Suggested resource requirements for the CNTi project CI runners, based on Equinix machine types:

  • Ideal: 7x c3.small.x86 - increases the speed of testing and allows some concurrent testing
  • Dynamic: 6x c3.small.x86 - slower, but probably better than what we have now, for much less cost
  • Adequate: 3x c3.small.x86 - what is currently in use; hours of delay for each test

The Equinix c3.small.x86 is a legacy on-demand server with the following hardware configuration:

  • 1 x Intel® Xeon® E-2278G
  • 8 cores @ 3.40 GHz
  • 2 x 480 GB SSD
  • 32 GB RAM
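To compare these tiers against other hosting options, it can help to total the resources each tier represents. The sketch below multiplies the c3.small.x86 figures listed above by the machine counts from the three tiers; the `MachineSpec` class and `tier_totals` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineSpec:
    cores: int
    ram_gb: int
    ssd_gb: int


# Per-machine figures from the list above: 8 cores, 32 GB RAM, 2 x 480 GB SSD.
C3_SMALL_X86 = MachineSpec(cores=8, ram_gb=32, ssd_gb=2 * 480)

# Machine counts from the suggested tiers.
TIERS = {"Ideal": 7, "Dynamic": 6, "Adequate": 3}


def tier_totals(spec, count):
    """Aggregate resources for `count` identical machines."""
    return MachineSpec(spec.cores * count, spec.ram_gb * count, spec.ssd_gb * count)


for name, count in TIERS.items():
    total = tier_totals(C3_SMALL_X86, count)
    print(f"{name}: {count} machines -> {total.cores} cores, "
          f"{total.ram_gb} GB RAM, {total.ssd_gb} GB SSD")
```

These totals are raw sums only. They do not account for concurrency limits or per-runner overhead, so they are a starting point for sizing rather than a guarantee of equivalent CI throughput.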

@martin-mat