Closed notartom closed 5 years ago
LGTM
I said 2 machines, and for the initial PoC that's all I need, but I didn't consider the scale implications of eventually running a voting Nova CI job on this. I'm guessing 2 machines won't be enough, but I don't currently know how many would actually be needed. It's something that I'll need to research.
A balance will also have to be struck between good coverage and hardware requirements - but we can cross that bridge when we get there.
+1, it looks like standalone physical machines are better for this case, as a third-party CI for OpenStack infra. @mrhillsman, would you please allocate two physical machines for @notartom from your resource pool?
We (nova) likely don't need a 3rd party CI job that runs on every proposed nova change like we have in the check queue today. Starting with a periodic job would be good enough IMO, or something that we can run on demand like the "experimental" queue in OpenStack upstream CI would be a great start.
LGTM
After discussing with @mriedem, it makes sense to have OpenLab trigger on the comment "check openlab" vs. each PR or even periodic for now.
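For illustration only, the comment-based trigger could boil down to a match like the one below. This is a sketch, not OpenLab's actual configuration: in practice the match would presumably live in the CI system's pipeline definition, and `should_trigger` is a hypothetical helper name.

```python
import re

# Assumption: the trigger phrase must sit alone on a line, case-insensitively,
# so that prose merely mentioning "check openlab" does not start a run.
TRIGGER = re.compile(r"^\s*check openlab\s*$", re.IGNORECASE | re.MULTILINE)


def should_trigger(comment_body: str) -> bool:
    """Return True if a review comment asks for an OpenLab run."""
    return TRIGGER.search(comment_body) is not None


if __name__ == "__main__":
    for body in ("check openlab", "LGTM\ncheck openlab", "don't check openlab yet"):
        print(repr(body), "->", should_trigger(body))
```

Anchoring the phrase to a whole line keeps the job on demand, much like the "experimental" queue mentioned above.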
Actually, after talking with @SeanMooney, we might want to expand the scope of this slightly, and make it more of an NFV CI than just a NUMA CI. For that the servers would need SR-IOV-capable network cards, and they'd have to be baremetal.
ack @notartom
One thing I'd like to avoid is having to become a sysadmin for an OpenStack cloud, so I'm wondering how the hardware (assuming we get it) is going to be presented. Ideally it would be in the shape of an OpenStack cloud that we could just point Nodepool towards. If they're going to be just machines with SSH access I'm not sure we have the human resources to run and maintain a cloud on them.
But @notartom the boxes are already en route to your house?!
Couple of things:
@notartom Matt is kidding with you! :)
/unassign
No activity, feel free to reopen it.
If you are interested in testing and improving support for the cloud-related SDKs/Tools as well as platforms in OpenLab, please fill out the details below. You can always find more information about OpenLab at https://openlabtesting.org
What is your focus?
If this is for an open source project what is it?
OpenStack Nova
Brief project description
OpenStack Nova provides a cloud computing fabric controller.
Is project code 100% open source? If so, what is the URL or URLs where it is located?
Yes, https://github.com/openstack/nova
What kind of machines (VMs or Baremetal) and how many do you expect to use?
We would need either 2 VMs with nested virtualisation or physical machines. Each would need multiple NUMA nodes, and one needs more NUMA nodes than the other. Hugepages need to be enabled on both machines, though their size isn't important (1G or 2M is fine). For reference, the current environment that I use to develop and test the feature has:
Controller/allinone VM:
Compute VM:
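As a rough way to check whether a candidate machine meets the NUMA and hugepages requirements above, something like the following could be run on each host. It is a minimal sketch assuming a Linux host that exposes `/sys/devices/system/node` and `/proc/meminfo`; the function names and the `min_numa_nodes` threshold are hypothetical, and `/proc/meminfo` only reports the default hugepage size.

```python
import re
from pathlib import Path


def count_numa_nodes(sys_node_dir="/sys/devices/system/node"):
    """Count the nodeN directories the kernel exposes under sysfs."""
    root = Path(sys_node_dir)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.iterdir() if re.fullmatch(r"node\d+", p.name))


def parse_hugepages(meminfo_text):
    """Return (HugePages_Total, Hugepagesize in kB) parsed from /proc/meminfo text."""
    total = re.search(r"^HugePages_Total:\s+(\d+)", meminfo_text, re.MULTILINE)
    size = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return (int(total.group(1)) if total else 0,
            int(size.group(1)) if size else 0)


def host_is_suitable(numa_nodes, hugepages_total, min_numa_nodes=2):
    """Hypothetical acceptance rule: several NUMA nodes and hugepages allocated."""
    return numa_nodes >= min_numa_nodes and hugepages_total > 0


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    text = meminfo.read_text() if meminfo.is_file() else ""
    nodes = count_numa_nodes()
    total, size_kb = parse_hugepages(text)
    print(f"NUMA nodes: {nodes}, hugepages: {total} x {size_kb} kB")
    print("suitable:", host_is_suitable(nodes, total))
```

Since one machine needs more NUMA nodes than the other, `min_numa_nodes` would be set differently per host.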
What OS are you planning to use?
Fedora 29? Honestly it doesn't really matter as long as it can run devstack reliably. If nested virt ends up being used, the baremetal host will need good nested virt support, in which case F29 is probably the best choice, as Ubuntu seems to have issues. Vexxhost have reported reliable nested virt with CentOS 7 as well.
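If nested virt is used, whether it is switched on for the host's KVM module can be checked with a sketch like this. It assumes a Linux host where the `kvm_intel` or `kvm_amd` module exposes a `nested` parameter under `/sys/module`; `nested_virt_enabled` is a hypothetical helper, and it says nothing about how reliable nested virt is on a given distro.

```python
from pathlib import Path


def nested_virt_enabled(module_root="/sys/module"):
    """Return True if the loaded KVM module reports nested virtualisation as on."""
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(module_root) / module / "parameters" / "nested"
        if param.is_file():
            # The kernel reports "Y"/"N" or "1"/"0" depending on version.
            return param.read_text().strip() in ("Y", "y", "1")
    # Neither module is loaded, or the parameter is not exposed.
    return False


if __name__ == "__main__":
    print("nested virt enabled:", nested_virt_enabled())
```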
Any special network configuration you expect or anticipate implementing?
N/A
Any architecture or other specifications/requirements (CPU, RAM, GPU, etc)?
See section 4.
What testing are you planning to implement or need assistance implementing?
NUMA live migration testing and, more generally, reintroducing a NUMA CI to fill the hole left by the de facto abandonment of the Intel NFV CI.
To that end, I intend to use a tempest plugin (that I contribute heavily to) that will allow me to assert things that are outside of Tempest's scope. The actual test run will look like a normal Tempest test run.
How will this testing advance application and/or tooling built on-top of open infrastructure?
This will allow Nova to proceed more confidently with any NUMA-related feature and/or bug.
Will you publish blog or paper from your testing?
The intent is for this to eventually become a voting job in Nova.
Any other relevant details we should know about while preparing the infrastructure?
N/A