nephio-project / nephio

Nephio is a Kubernetes-based automation platform for deploying and managing highly distributed, interconnected workloads such as 5G Network Functions, and the underlying infrastructure on which those workloads depend.
Apache License 2.0

Define and add resources needed for free5GC SMF (to nephio-project/free5gc-packages) #88

Closed s3wong closed 1 year ago

s3wong commented 1 year ago

free5gc SMF packages now have the following definitions: here

More information is definitely needed: list of UPFs, sNssai info, etc. Add them to the package.

henderiw commented 1 year ago

This depends on https://github.com/nephio-project/api/pull/17

johnbelamaric commented 1 year ago

@henderiw to discuss with @s3wong

johnbelamaric commented 1 year ago

@s3wong @tliron @henderiw @n2vo @denysaleksandrov

The issue we discussed in the meeting yesterday is that the SMF configuration needs many details from the config of each connected UPF. Here are the options we discussed yesterday, please correct as needed:

gvbalaji commented 1 year ago

Implementing a specializer may be the easiest option for R1.

johnbelamaric commented 1 year ago

Can we agree what the final state of the data in the SMF package looks like, and then how that is reflected in the SMF package so the operator changes can be unblocked? @s3wong @n2vo @henderiw

henderiw commented 1 year ago

Service discovery creates an additional dependency on a registry that needs to be accessible from the various locations. In my experience many people in telco are not keen on this, given this dependency. They even try to avoid basic things like DNS/DHCP. So it is something we have to keep in mind. Also, there are very good solutions on the market for this, but our main dependency here is free5gc. AFAIK we cannot do dynamic updates to it, so even service discovery will not work in this context.

Here are the things that need to happen in my view:

  1. Identify the dependency -> this can be a resource in the package
  2. We need to know where the resource exists -> right now this seems to be in a package, which would be identified by the dependency (so it would be good for the package name to be known, making it easier to search). Going forward this could also be in the mgmt cluster or in a central service discovery. The resource backend does this today, e.g. for IP/VLAN and topology. It is also important to know the completeness and validate whether we have all the dependent resources; this could be based on package name.
  3. Eventually an NFDeployment resource needs to be created with the references to these resources. We have a configuration item in NFDeployment for this:

    ConfigRefs []corev1.ObjectReference `json:"configRefs,omitempty" yaml:"configRefs,omitempty"`

By referencing the dependent resources here we can tell the SMF operator exactly what to look for. Now one of the challenges is if we have both UPF and SMF deployed on the same cluster. The resource comes from 2 packages. We could wrap it in another resource potentially to avoid dual actuation by configsync.

Alternatively we could annotate the dependent resources, but the SMF operator would not know whether they are all there. So I believe the explicit reference is a better approach.
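To make the completeness argument concrete, here is a minimal Go sketch of the check an operator could run against explicit references. It is not taken from the Nephio code base: `ObjectRef` is a local stand-in for the relevant fields of `corev1.ObjectReference`, `missingRefs` is a hypothetical helper, and the `upf-cluster02` name and the `apiVersion` string are illustrative only.

```go
package main

import "fmt"

// ObjectRef is a local stand-in for the fields of corev1.ObjectReference
// used here, so that the sketch needs no Kubernetes imports.
type ObjectRef struct {
	APIVersion string
	Kind       string
	Namespace  string
	Name       string
}

// missingRefs returns the references in want that are not marked as present,
// preserving the order of want. (Hypothetical helper, not a Nephio API.)
func missingRefs(want []ObjectRef, present map[ObjectRef]bool) []ObjectRef {
	var missing []ObjectRef
	for _, ref := range want {
		if !present[ref] {
			missing = append(missing, ref)
		}
	}
	return missing
}

func main() {
	// Illustrative values only.
	upf1 := ObjectRef{"workload.nephio.org/v1alpha1", "UPFDeployment", "default", "upf-cluster01"}
	upf2 := ObjectRef{"workload.nephio.org/v1alpha1", "UPFDeployment", "default", "upf-cluster02"}

	want := []ObjectRef{upf1, upf2}            // what configRefs asks for
	present := map[ObjectRef]bool{upf1: true} // what is on the cluster so far

	for _, ref := range missingRefs(want, present) {
		fmt.Printf("still waiting for %s %s/%s\n", ref.Kind, ref.Namespace, ref.Name)
	}
}
```

With annotations alone there is no `want` list to compare against, which is why the operator cannot tell whether the set is complete.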

Given the above we need to bring this together in the specialization flow. Here is the proposal to do this.

  1. The original package contains the dependent resource, and the specialiser owner reference is set to the NF deployment.

Example: this is how we do it for Interface. We have a reference to the final SMFDeployment, which is the metaObject we need to apply to the cluster.

    apiVersion: req.nephio.org/v1alpha1
    kind: Interface
    metadata:
      name: n4
      annotations:
        config.kubernetes.io/local-config: "true"
        specializer.nephio.org/owner: workload.nephio.org/v1alpha1.SMFDeployment.upf-empty-empty

The cond SDK ensures the conditions are set, by having the smf-deploy fn run first in the pipeline.

The dependency specialiser acts as described above and, once ready, completes its job. It would act the same way as the VLAN/IPAM specialiser, but here it injects new resources from outside. Once complete, the cond SDK sets the condition to True.

If the SMFDeploy fn conditions are all true, it will aggregate them into the SMFDeployment CR, which will be actuated on the cluster.
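The gating step can be sketched in Go roughly as follows. This is only an illustration of "aggregate when every condition is True": `Condition` is a simplified stand-in rather than the cond SDK type, the condition type names are made up, and treating an empty condition list as "not ready" is an assumption.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a package condition;
// it is not the cond SDK type.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// allTrue reports whether every condition has status "True".
// Assumption: an empty list counts as not ready, so a package whose
// conditions have not been set yet is never aggregated by accident.
func allTrue(conds []Condition) bool {
	if len(conds) == 0 {
		return false
	}
	for _, c := range conds {
		if c.Status != "True" {
			return false
		}
	}
	return true
}

func main() {
	// Hypothetical condition types, for illustration only.
	conds := []Condition{
		{Type: "interface.n4", Status: "True"},
		{Type: "dependency.upf", Status: "False"},
	}
	fmt.Println("aggregate into SMFDeployment:", allTrue(conds))

	// The dependency specialiser finishes and its condition flips to True.
	conds[1].Status = "True"
	fmt.Println("aggregate into SMFDeployment:", allTrue(conds))
}
```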

henderiw commented 1 year ago

I would say the approach I describe above is basically service discovery.

I believe the difference is distributed or central and this is a choice you have

s3wong commented 1 year ago

Ran some tests on a "more common" scenario:

  1. nftopology has an instance of SMF and an instance of UPF
  2. there are two clusters matching the UPF NFInstance clusterlabels, thus the expectation is there will be one instance of UPF for each cluster (i.e., total of two for the SMF)
  3. when the first cluster showed up, the first UPF instance is deployed (I am still just simulating these)
  4. from nftopology reconciliation loop, the task (for this instance of nftopology) is done (i.e., I only have one instance of UPF for this Topology, at this moment)
  5. sometime later, another cluster matches this label, and Nephio deploys another UPF onto the new cluster
  6. nftopology controller doesn't detect this change, and therefore info isn't passed onto specializer on SMF package update

so it seems like this is more than holding off SMF until all UPFs are deployed; a new packagerevision matching the same NFTopology name in label and also connecting to the same NF (SMF in this case) will need to update the dependent packages
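A rough Go sketch of that behaviour, under the assumption that the SMF package tracks its UPF dependencies as a set keyed by package name: each newly seen UPF triggers an SMF package update, while a repeat observation of the same UPF does not. `addDependency` and `sortedNames` are hypothetical helpers, and the cluster names are illustrative.

```go
package main

import (
	"fmt"
	"sort"
)

// addDependency records upf as a dependency of the SMF package and reports
// whether the set changed, i.e. whether the SMF package needs an update.
// (Hypothetical helper, not part of the nftopology controller.)
func addDependency(deps map[string]bool, upf string) bool {
	if deps[upf] {
		return false
	}
	deps[upf] = true
	return true
}

// sortedNames returns the dependency names in a stable order.
func sortedNames(deps map[string]bool) []string {
	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	deps := map[string]bool{}
	// The second cluster shows up later; the third event is a repeat.
	for _, upf := range []string{"upf-cluster01", "upf-cluster02", "upf-cluster01"} {
		if addDependency(deps, upf) {
			fmt.Println("update SMF package, UPFs now:", sortedNames(deps))
		} else {
			fmt.Println("no change for", upf)
		}
	}
}
```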

s3wong commented 1 year ago

@henderiw

I really wonder how this works:

Currently, in e2e tests (https://github.com/nephio-project/test-infra/pull/63/files#diff-11333ec2ac174948e748483206666f9217ba389984073b4d467bdc66664cce45), UPF is set up via PVS where the objectSelector is keyed off of the WorkloadCluster label, so the package is only cloned when a cluster matches that scenario. What that means is that it is possible (however remotely) that the SMF would be deployed before we even have a single UPF package created, and as such there is nothing in the system that would even know if this SMF package has any dependency, so the "identify what you need" part may not be known at the time of the SMF package deployment.

My take (as I wrote above) is that the only logical way to deal with this is to do:

  1. if some UPF packages are already deployed (or just created even), the package for the SMF that is connected to these UPF can include reference to them
  2. for those UPF instances (packages) that come AFTER the SMF package is deployed, then as each of them is created / deployed, the SMF package will be updated, and as such the SMF instance will be reloaded with a new configmap

henderiw commented 1 year ago

Good point. The line of thinking was like this: someone schedules this deployment in harmony. Let's call this the 'UBER' package, which contains PVS for UPF and SMF. The person deploying this would apply this UBER package to the cluster.

So this is somehow the link. As you said both of these PVS will result in PVC, etc. So they will specialise.

My assumption is that none of this gets deployed unless a human approves it. Now with the auto-approval controller, what we could do is tie this back to the original package and only approve once all the conditions of each individual package are true.

So I see this as a bundled approval.

henderiw commented 1 year ago

@s3wong the other thing you should be aware is specialisation is not 1 shot. it continuously runs.

henderiw commented 1 year ago

here is the proposal for the reference structure.

    apiVersion: ref.nephio.org/v1alpha1
    kind: Config
    metadata:
      name: upf-cluster01
      namespace: default
    spec:
      gvk:
        apiVersion: workload.nephio.org
        kind: UPFDeployment
      config: ""

    apiVersion: ref.nephio.org/v1alpha1
    kind: Config
    metadata:
      name: upf-cluster01
      namespace: default
    spec:
      gvk:
        apiVersion: workload.nephio.org
        kind: UPFDeployment
      config: ""

    apiVersion: workload.nephio.org/v1alpha1
    kind: SMFDeployment
    metadata:
      name: smf-region
    spec:
      configRefs:

Here is a proposal on how to add the references: the SMFDeployment has a list of configuration references that will be applied to the workload cluster.

The SMF operator should get these references at the beginning of the reconcile cycle. When not all refs are there, the reconciliation should retry. Once all references are found, we need to parse the refs. The CRD is a generic CRD where the gvk specifies the type.

Another alternative is using a ConfigMap instead of the Config.ref.nephio.org/v1alpha1 object.

johnbelamaric commented 1 year ago

@s3wong the other thing you should be aware is specialisation is not 1 shot. it continuously runs.

Yes, as an "eventually consistent" system we should expect continuous change and reconciliation. So I think it's ok if it reconfigures a few times.

For the "first" time, we auto-approve. After that it requires human approval. We can delay the first approval for an arbitrary amount of time - say 5 or 10 minutes.

If we want to have more sophisticated auto-approval for future updates, we can consider that in later releases.

s3wong commented 1 year ago

@henderiw

config: ""

I am assuming this is a json string from the actual UPFDeployment.Spec? The SMF controller would simply json.Unmarshal this string?
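If that assumption holds, the decoding side might look like this Go sketch. It decodes into a generic map instead of a concrete UPFDeployment spec type, because the thread does not show that type; `decodeConfig` and the JSON payload are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeConfig parses the Config object's config string as JSON.
// Assumption: the string is a JSON-encoded spec. A generic map is used
// here; a real controller would unmarshal into the type named by gvk.
func decodeConfig(config string) (map[string]interface{}, error) {
	if config == "" {
		return nil, fmt.Errorf("config is empty")
	}
	var spec map[string]interface{}
	if err := json.Unmarshal([]byte(config), &spec); err != nil {
		return nil, fmt.Errorf("decoding config: %w", err)
	}
	return spec, nil
}

func main() {
	// Made-up payload; the real field names are not given in this thread.
	spec, err := decodeConfig(`{"exampleField": "exampleValue"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(spec["exampleField"])
}
```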

gvbalaji commented 1 year ago

Stephen will do a PR today and is planning for integration tomorrow.

johnbelamaric commented 1 year ago

This is done and working