networkservicemesh / integration-k8s-packet


SR-IOV enablement on Equinix Metal n3 servers #295

Closed: vielmetti closed this issue 2 years ago

vielmetti commented 2 years ago

This is a heads up to let the project know that Equinix Metal is planning to make SR-IOV a default configuration on our new n3 class of servers.

The net effect should be that rather than testing against a fixed pool of dedicated servers, the project will be able to draw from our general server pool. In addition, any specific configuration for turning SR-IOV on will be simplified.

The release is forthcoming and there are still a few small details to work out. When the formal announcement drops I'll use this issue to update the implementation details for the project.

(I know that you all are in the middle of a release right now so no expectations for any immediate changes until that's taken care of!)

cc @Bolodya1997 @edwarnicke

vielmetti commented 2 years ago

the announcement:

https://feedback.equinixmetal.com/changelog/sr-iov-enabled-by-default-on-n3xlarge-servers

glazychev-art commented 2 years ago

@vielmetti Thanks for the announcement! Will this affect n2.xlarge in any way? As far as I know, that is the type we currently use.

vielmetti commented 2 years ago

Thanks @glazychev-art.

The intent is as follows:

NSM will port its software to our new n3 systems. The server config is here: https://metal.equinix.com/product/servers/n3-xlarge/ - a notable difference from the n2 is that the NIC is the Intel E810 (which uses the 'ice' driver).

When you're happy with the port, you'll switch your CI to use n3 systems on demand for testing. There is no need to reconfigure the systems for SR-IOV, because it is turned on by default.

When everything is working to specification, you'll release the old n2 systems and no longer use them.

Hope this makes it clearer!
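
Once you have an n3 node up, a quick way to confirm the driver and the SR-IOV defaults is to check the NIC from the host. A minimal sketch (the interface name below is just an example; pick the E810 port from `ip link`):

```bash
# confirm the E810 port is bound to the 'ice' driver
ethtool -i eno1 | grep '^driver'                 # expect: driver: ice

# check SR-IOV capability and how many VFs are currently enabled
cat /sys/class/net/eno1/device/sriov_totalvfs    # total VFs the NIC supports
cat /sys/class/net/eno1/device/sriov_numvfs      # VFs currently configured
```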

glazychev-art commented 2 years ago

@vielmetti @edwarnicke Do I understand correctly that NSM will no longer have Reserved servers, and that only On Demand servers will be used?

I ran our tests on n3-xlarge today. In general they work fine; some minor updates to our scripts may be needed. I'm still testing.

edwarnicke commented 2 years ago

@glazychev-art The hope is to be able to switch to on-demand n3 servers, yes :) If that works (and I expect it will), we can then relinquish our reserved instances :)

glazychev-art commented 2 years ago

@edwarnicke There may be some difficulties:

  1. We currently use cloudtest, and it requires hardware_reservations for the packet cluster. That means that with on-demand clusters we would have to abandon cloudtest, so it is worth considering a switch to cluster-api.
  2. In cloudtest, in addition to installing Kubernetes, we also have scripts for configuring SR-IOV. cluster-api only creates the Kubernetes cluster, so we need to investigate whether we can ssh to the servers and do whatever we need after deploying via cluster-api.
  3. I tried using cluster-api according to this tutorial, but for now there are problems with it. For example, clusterctl does not see that the server has already been created and keeps waiting for it, even though the server shows up successfully in the Equinix console. I used version v0.5.0.

Do we need to move to cluster-api instead of cloudtest? (A rough sketch of the on-demand flow I have in mind is below.)
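
For reference, the on-demand flow would look roughly like the following. The environment variable names, machine types, cluster name, and Kubernetes version here are my reading of the cluster-api-provider-packet docs and are placeholders, so treat them as assumptions:

```bash
# Sketch only: on-demand workload cluster via cluster-api-provider-packet instead
# of cloudtest. Variable names below are assumptions; verify against the provider docs.
export PACKET_API_KEY="<equinix-metal-api-token>"
export PROJECT_ID="<equinix-metal-project-id>"
export CONTROLPLANE_NODE_TYPE="n3.xlarge.x86"
export WORKER_NODE_TYPE="n3.xlarge.x86"

# install the packet infrastructure provider into the management cluster
clusterctl init --infrastructure packet

# render and apply the workload cluster manifests
clusterctl generate cluster nsm-ci \
  --kubernetes-version v1.22.4 \
  --control-plane-machine-count 1 \
  --worker-machine-count 2 \
  > cluster.yaml
kubectl apply -f cluster.yaml

# the machines are ordinary on-demand servers, so after provisioning we could still
# ssh in and run our SR-IOV setup scripts if needed
clusterctl get kubeconfig nsm-ci > nsm-ci.kubeconfig
kubectl get machines
```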

cprivitere commented 2 years ago

I believe this is due to a known issue I've got a fix for in the upcoming 0.6.0 version.

You can try manually editing your cluster yaml to use cloud-provider-equinix-metal version 0.4.3, and adding the systemctl restart networking command back in on the line above if [ -f "/run/kubeadm/kubeadm.yaml" ]; then

To see an example, check the commits in this PR: https://github.com/kubernetes-sigs/cluster-api-provider-packet/pull/365/files
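
In shell terms, the re-added piece lands in the cloud-init commands of the generated template roughly like this (just a sketch; the PR diff linked above is the authoritative change):

```bash
# sketch of the edited section; surrounding lines from the generated template are elided
systemctl restart networking
if [ -f "/run/kubeadm/kubeadm.yaml" ]; then
  : # existing kubeadm bootstrap steps from the generated template stay unchanged
fi
```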

vielmetti commented 2 years ago

@glazychev-art - I checked with our team and they referred me to this patch to cluster-api that addresses the clusterctl problem you mention.

https://github.com/kubernetes-sigs/cluster-api-provider-packet/pull/365

The relevant bit is this:

Without this PR clusters never come up.

You can manually apply this to a 0.5.0 generated template as a workaround until 0.6.0 comes out.

glazychev-art commented 2 years ago

@cprivitere @vielmetti Thanks guys, I really appreciate your help!

I noticed a strange thing: the last couple of runs have deployed fine even on version 0.5.0. But of course, if I see the problem again I will try what you suggested!

I have a couple more questions to discuss:

  1. As far as I understand, when using clusterctl we have to install the CNI after the control plane node is deployed. Is there a clusterctl command (something like kubectl wait...) that can tell us the control plane is ready for the CNI installation? This is for our scripts.
  2. In our current Kubernetes setup we used docker as the container runtime. As far as I understand, this was needed to set default limits, which we need for the tests - https://github.com/networkservicemesh/integration-k8s-packet/blob/main/scripts/k8s/config-docker.sh (rough sketch at the end of this comment). Installation via clusterctl uses containerd, which does not allow such settings. Do I understand correctly that there is no way to use docker instead of containerd?

I would be grateful if you have any thoughts on this!
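
For context on point 2, the mechanism on the docker side is the default-ulimits block in /etc/docker/daemon.json. A minimal sketch (memlock here is only an illustration, not necessarily the exact limit our config-docker.sh sets, and this overwrites any existing daemon.json):

```bash
# Sketch: give containers an unlimited memlock ulimit by default via dockerd config.
# The specific limit (memlock) is an example; adjust to whatever the tests need.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "default-ulimits": {
    "memlock": { "Name": "memlock", "Hard": -1, "Soft": -1 }
  }
}
EOF
sudo systemctl restart docker
```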

vielmetti commented 2 years ago

Regarding containerd and ulimits, these look like the relevant issues:

I don't know the exact syntax to pull these into your configuration, but that should be a good start for seeing what support is already there.

cprivitere commented 2 years ago

Yeah, all the templates we maintain use containerd; you'd need to edit or override the cloud-init portions of the generated cluster yaml if you wanted to use docker instead of containerd.

For the CNI installation, cluster api has a resource called ClusterResourceSet that can be used to automatically install the CNI in a new cluster. Here's the proposal doc: https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20200220-cluster-resource-set.md. We actually have a template that uses a CRS to install Calico that you can check out here: https://github.com/kubernetes-sigs/cluster-api-provider-packet/blob/v0.5.0/templates/cluster-template-crs-cni.yaml.

If you want to try using it, pass --flavor=crs-cni when you do the clusterctl generate command.
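
Roughly like this (the cluster name, Kubernetes version, and machine counts are placeholders; the wait at the end uses an upstream Cluster API condition, in case you still want to script around control plane readiness yourself):

```bash
# generate a cluster from the crs-cni flavor so Calico is installed via a
# ClusterResourceSet once the workload cluster is reachable
clusterctl generate cluster nsm-ci \
  --kubernetes-version v1.22.4 \
  --control-plane-machine-count 1 \
  --worker-machine-count 2 \
  --flavor crs-cni \
  > cluster.yaml
kubectl apply -f cluster.yaml

# optional: block until the control plane has initialized (run against the
# management cluster; ControlPlaneInitialized is a standard Cluster API condition)
kubectl wait cluster/nsm-ci --for=condition=ControlPlaneInitialized --timeout=30m
```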

vielmetti commented 2 years ago

@edwarnicke thanks, and good to hear that the new systems are up and running.

Can you confirm that the two old n2 systems can be removed? They are currently in red (failed) status in my dashboard, so I know you're not actively using them right now, but I'd like to clean them up.