gainsley closed 2 months ago
Hi Lev, I found a bunch of issues after more testing. In particular, I didn't realize the proxyCerts object was called by the infra-specific code, so I needed to split it into a stateless and cloudlet-specific part, and a persistent and cloudlet-independent part (cache).
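The split described above might look roughly like the sketch below. This is not the code from the PR: the type names `CertCache` and `CloudletCerts`, the `fetch` callback, and the `example.com` domain are all hypothetical, chosen only to illustrate a persistent, cloudlet-independent cache shared by stateless, cloudlet-specific wrappers.

```go
package main

import (
	"fmt"
	"sync"
)

// CertCache is the persistent, cloudlet-independent part (hypothetical type).
// It only stores certificate data keyed by common name.
type CertCache struct {
	mu    sync.Mutex
	certs map[string]string // common name -> PEM data (placeholder string)
}

func NewCertCache() *CertCache {
	return &CertCache{certs: make(map[string]string)}
}

// Get returns the cached cert and calls fetch only on a cache miss.
func (c *CertCache) Get(name string, fetch func(string) (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if pem, ok := c.certs[name]; ok {
		return pem, nil
	}
	pem, err := fetch(name)
	if err != nil {
		return "", err
	}
	c.certs[name] = pem
	return pem, nil
}

// CloudletCerts is the stateless, cloudlet-specific part (hypothetical type).
// It holds only what identifies the cloudlet plus a reference to the cache,
// so it can be created on demand by whichever process handles the cloudlet.
type CloudletCerts struct {
	CloudletName string
	Cache        *CertCache
}

// CertFor resolves this cloudlet's wildcard cert through the shared cache.
func (p CloudletCerts) CertFor(fetch func(string) (string, error)) (string, error) {
	return p.Cache.Get("*."+p.CloudletName+".example.com", fetch)
}

func main() {
	cache := NewCertCache()
	fetches := 0
	fetch := func(name string) (string, error) {
		fetches++
		return "PEM for " + name, nil
	}
	// Two independent wrappers for the same cloudlet share one cache entry.
	a := CloudletCerts{CloudletName: "cloudlet1", Cache: cache}
	b := CloudletCerts{CloudletName: "cloudlet1", Cache: cache}
	if _, err := a.CertFor(fetch); err != nil {
		fmt.Println("error:", err)
	}
	if _, err := b.CertFor(fetch); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println("fetches:", fetches) // the second lookup is served from the cache
}
```

Because the wrapper carries no state of its own, it can be thrown away and recreated freely, while the cache outlives any single cloudlet operation.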
Sounds good. I'll still skim through the changes to build a better picture in my head, but won't focus on specifics. Will do a more in-depth review after your changes.
Hey Lev, I already pushed the fixes, so the PR is complete.
Thanks also, Lev, for taking on that code review! So I have done a wide range of tests. Besides the usual unit and e2e tests, I also set it up (you can see this in the director changes) to run make test-start-dns, which let me create a cloudlet on Openstack from a CRM started via e2e tests. I also tested a k3s deployment with the operator changes on Openstack (both of those Openstack tests were run using the acceptance tests). But yeah, I only tested Openstack with CrmOnEdge=true. All the CrmOnEdge=false testing is via the fake platform in unit/e2e tests.
I think that when we implement the OSM platform with CrmOnEdge=false we'll have a chance to find any other bugs lurking there.
Previously, our architecture required a CRM service running on edge-site to convert platform-independent APIs to platform-specific API calls to deploy VMs/Clusters/AppInsts/etc. To be able to support platforms where it is not feasible to run the CRM service on edge-site (typically because there is another layer of software managing the infrastructure), we want to be able to run the platform-specific code from the CCRM service, which runs off-edge-site alongside the Controller.
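As a rough illustration of the idea above, the sketch below puts the platform-specific code behind an interface so that the same platform-independent handler can be hosted by either process. All names here (`Platform`, `Handler`, `pickHandler`, `fakePlatform`) are hypothetical and are not taken from the actual code; only the CrmOnEdge flag comes from the discussion.

```go
package main

import "fmt"

// Platform stands in for the platform-specific API (hypothetical interface).
type Platform interface {
	CreateAppInst(name string) error
}

// fakePlatform records calls instead of talking to real infrastructure.
type fakePlatform struct{ created []string }

func (f *fakePlatform) CreateAppInst(name string) error {
	f.created = append(f.created, name)
	return nil
}

// Handler is the platform-independent layer. The same type is used whether
// it is hosted by the on-edge CRM or by the off-edge CCRM.
type Handler struct {
	Host     string
	Platform Platform
}

// CreateAppInst translates the platform-independent request into a
// platform-specific call and reports which host performed it.
func (h Handler) CreateAppInst(name string) (string, error) {
	if err := h.Platform.CreateAppInst(name); err != nil {
		return "", fmt.Errorf("%s: create %s: %w", h.Host, name, err)
	}
	return h.Host + " created " + name, nil
}

// pickHandler chooses where the platform code runs for a given cloudlet,
// based on a CrmOnEdge-style flag.
func pickHandler(crmOnEdge bool, crm, ccrm Handler) Handler {
	if crmOnEdge {
		return crm
	}
	return ccrm
}

func main() {
	crm := Handler{Host: "CRM", Platform: &fakePlatform{}}
	ccrm := Handler{Host: "CCRM", Platform: &fakePlatform{}}
	for _, onEdge := range []bool{true, false} {
		msg, err := pickHandler(onEdge, crm, ccrm).CreateAppInst("app1")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(msg)
	}
}
```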
This PR refactors the common CRM code (from pkg/crmutil/controller-data.go), moving the parts that are specific to a single CRM on-edge into pkg/crm. The remaining code is modified to support being included in either the CRM, which is a single-instance process using notify for communication with the Controller, or the CCRM, which is a horizontally scaled set of processes that use direct access to etcd and GRPC to communicate with the Controller. Code in pkg/crmutil is meant to be shared between both CRM and CCRM.
In the common pkg/crmutil/controller-data.go code, there was a lot of functionality that was specific to a single-instance process running over notify. The following changes were made:
... pkg/controller/trustpolicyexception_changes.go.
The platform code also requires changes to be able to run from CCRM. This was only partially completed.
... pkg/crm/crm.go. For the CCRM, the Controller has a periodic thread that runs to trigger CCRMs to refresh certs in pkg/controller/ccrm_periodic.go.
Other changes:
... api/edgeproto/alert.pb.go.
I still need to run tests against a real infra like Openstack, and maybe take care of the AppInstRuntime TODO.
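For readers unfamiliar with the pattern, the periodic thread that triggers CCRMs to refresh certs can be pictured as a plain ticker loop. The sketch below is a generic version of that pattern and does not reproduce pkg/controller/ccrm_periodic.go; `refreshLoop`, `triggerN`, and the interval are hypothetical, and the trigger callback stands in for whatever call the Controller actually makes to the CCRMs.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// refreshLoop calls trigger once per tick until ctx is cancelled.
// In a controller, trigger would ask the CCRMs to refresh their certs.
func refreshLoop(ctx context.Context, ticks <-chan time.Time, trigger func()) {
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticks:
			trigger()
		}
	}
}

// triggerN runs refreshLoop on a real ticker until n triggers have fired,
// then cancels the loop, waits for it to exit, and returns the count.
func triggerN(n int, interval time.Duration) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	fired := make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		refreshLoop(ctx, ticker.C, func() {
			// Report the trigger, but never block past cancellation.
			select {
			case fired <- struct{}{}:
			case <-ctx.Done():
			}
		})
	}()

	count := 0
	for count < n {
		<-fired
		count++
	}
	cancel()
	<-done // the loop goroutine has exited cleanly
	return count
}

func main() {
	n := triggerN(3, 10*time.Millisecond)
	fmt.Println("cert refresh triggered", n, "times")
}
```

Passing the tick channel in as a parameter keeps the loop itself free of timing details, which makes it easy to drive from a test with a hand-fed channel instead of a real ticker.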