Closed. akutz closed this 1 month ago.
@dougm I am going to add a flake allowance to the tests that depend on vC Sim. See https://github.com/vmware-tanzu/vm-operator/actions/runs/11206817769/job/31148110424?pr=733#step:5:783 -- there's a race that pops up on occasion.
I've not reproduced myself yet, but looks like this should fix: https://github.com/vmware/govmomi/pull/3584
Thanks @dougm, it did in fact fix it. I already rebased this PR after pulling your patch. Thanks again!
Package | Line Rate | Health |
---|---|---|
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/clustercontentlibraryitem | 82% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/contentlibraryitem | 85% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/contentlibrary/utils | 97% | ✔ |
github.com/vmware-tanzu/vm-operator/controllers/infra/capability | 86% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/infra/configmap | 71% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/infra/node | 77% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/infra/secret | 77% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/infra/validatingwebhookconfiguration | 85% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/infra/zone | 81% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/storageclass | 94% | ✔ |
github.com/vmware-tanzu/vm-operator/controllers/storagepolicyquota | 97% | ✔ |
github.com/vmware-tanzu/vm-operator/controllers/util/encoding | 73% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/storagepolicyusage | 99% | ✔ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/virtualmachine | 78% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachine/volume | 87% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineclass | 75% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinepublishrequest | 81% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinereplicaset | 68% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice | 82% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachineservice/providers | 92% | ✔ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinesetresourcepolicy | 80% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1 | 72% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/conditions | 88% | ➖ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha1/patch | 78% | ❌ |
github.com/vmware-tanzu/vm-operator/controllers/virtualmachinewebconsolerequest/v1alpha2 | 73% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/bitmask | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/builder | 95% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/conditions | 88% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/config | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/config/capabilities | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/config/env | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/context/generic | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/context/operation | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/patch | 78% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/prober | 91% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/prober/probe | 90% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/prober/worker | 77% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere | 75% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/client | 80% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/clustermodules | 71% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/config | 89% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/contentlibrary | 74% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/credentials | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/network | 80% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/placement | 77% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/session | 71% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/sysprep | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vcenter | 82% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/virtualmachine | 83% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/providers/vsphere/vmlifecycle | 67% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/record | 78% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/topology | 91% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util | 87% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/util/annotations | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit | 89% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/cloudinit/validate | 91% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/image | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/kube | 84% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/util/kube/cource | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/kube/internal | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/kube/spq | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/paused | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/ptr | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/resize | 97% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/vmopv1 | 91% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/client | 64% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/vm | 79% | ❌ |
github.com/vmware-tanzu/vm-operator/pkg/util/vsphere/watcher | 85% | ➖ |
github.com/vmware-tanzu/vm-operator/pkg/vmconfig | 95% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/vmconfig/crypto | 98% | ✔ |
github.com/vmware-tanzu/vm-operator/pkg/webconsolevalidation | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/services/vm-watcher | 91% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/common | 100% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/persistentvolumeclaim/validation | 95% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/unifiedstoragequota/validation | 92% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/mutation | 87% | ➖ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachine/validation | 95% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/mutation | 62% | ❌ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineclass/validation | 89% | ➖ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinepublishrequest/validation | 92% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinereplicaset/validation | 90% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/mutation | 67% | ❌ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachineservice/validation | 92% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinesetresourcepolicy/validation | 89% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/v1alpha1/validation | 92% | ✔ |
github.com/vmware-tanzu/vm-operator/webhooks/virtualmachinewebconsolerequest/v1alpha2/validation | 92% | ✔ |
Summary | 83% (10216 / 12300) | ➖ |
Minimum allowed line rate is 79%
**What does this PR do, and why is it needed?**

This patch adds support for reconciling VMs when their state has changed on the underlying platform, ex. vSphere. There are three primary components: the watcher, the service, and the zone controller, each described below.
Please note, the environment variable `ASYNC_SIGNAL_DISABLED` may be set to a truth-y string value, ex. `"true"`, to completely disable the async signal logic, regardless of the feature state switch.

**Watcher**

The watcher is located in `pkg/util/vsphere/watcher` and watches one or more vSphere entities that can contain VMs, ex. a `Folder`, `ClusterComputeResource`, `HostSystem`, etc. The watcher is initialized with a set of these entities and creates a `ContainerView` for each. These are added to the watcher's `ListView`, which enables entities to be added/removed later while the watcher is running. The watcher is signaled when a VM enters the view of the watcher or when a VM has a change to one of the following properties:

- `config.extraConfig` -- Signal when the guest changes something in `guestinfo`.
- `guest.ipStack` -- Signal on changes to the network state.
- `guest.net` -- Signal on changes to the network state.
- `summary.config.name` -- A way to detect when the VM enters the scope of the watch.
- `summary.guest` -- Signal on any changes to the guest.
- `summary.overallStatus` -- Signal on changes to the VM's status.
- `summary.runtime.host` -- Signal when the host on which a VM is running has changed.
- `summary.runtime.powerState` -- Signal when the VM's power state has changed.
- `summary.storage.timestamp` -- Signal when the VM's storage information (capacity, used, etc.) has changed.

Not all changes to `extraConfig` cause the watcher to emit a result. A default set of `extraConfig` keys is ignored in order to prevent an infinite loop, and additional keys may be ignored as well.
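The ignore check might look like the following sketch. The key names and the `isIgnoredExtraConfigKey` helper are hypothetical, for illustration only; the actual default ignore list is defined by the watcher package.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultIgnoredKeys is a hypothetical stand-in for the watcher's
// built-in ignore list; the real keys are defined by the watcher package.
var defaultIgnoredKeys = map[string]struct{}{
	"hypothetical.key.one": {},
	"hypothetical.key.two": {},
}

// isIgnoredExtraConfigKey reports whether a change to the given
// extraConfig key should be ignored. Ignoring keys the operator itself
// writes prevents the watcher from re-triggering itself in a loop.
// The extra parameter allows additional keys to be ignored at runtime.
func isIgnoredExtraConfigKey(key string, extra map[string]struct{}) bool {
	k := strings.ToLower(key)
	if _, ok := defaultIgnoredKeys[k]; ok {
		return true
	}
	_, ok := extra[k]
	return ok
}

func main() {
	extra := map[string]struct{}{"guestinfo.ignored": {}}
	fmt.Println(isIgnoredExtraConfigKey("hypothetical.key.one", extra)) // true
	fmt.Println(isIgnoredExtraConfigKey("guestinfo.ipaddress", extra))  // false
}
```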
When the watcher notices a VM enter its view or change, the watcher must get the namespace and name for the VM. This happens one of three ways, including looking up the VM by `status.uniqueID`, which is now an indexed field, and fetching `config.extraConfig["vmservice.namespacedName"]` from the vSphere server.

If the namespace and name can be determined, the watcher checks whether the VM already exists in Kubernetes with a `status.uniqueID` field and whether the update type was the VM entering the view of the watcher. If both conditions are met, no result is emitted for this VM. This prevents double-reconciling VMs when the Controller-Manager starts up for the first time. During start-up, the Controller-Manager automatically reconciles all objects watched by controllers. Since all VMs would also be entering the view of watchers, this would cause a large-scale double-reconcile. Therefore, this logic skips emitting results on startup for VMs that are already deployed.

If the namespace and name are non-empty and either the update type was `Enter` and the VM has an empty `status.uniqueID` field, or the update type was `Modify`, the watcher emits a result on a channel watched by the next component, the service.

**Service**

The service is located in `services/vm-watcher` and is responsible for running the watcher and acting on its results.

The service will always start a new instance of the watcher as long as the previous instance failed due to a login/auth error. This handles the case of credential rotation.
The service starts the watcher with an initial set of entities to watch that includes the ManagedObject ID for each `Folder` that can contain VM Service VMs. These folder IDs are gathered by listing all `Zone` resources on the cluster and collecting the value of `spec.managedVMs.folderMoID`.

The service monitors results from the watcher. Upon receiving a result, the service determines whether the reported VM is valid, and if so, enqueues a reconcile request.
**Zone controller**

The zone controller is located in `controllers/infra/zone` and reconciles `topologyv1.Zone` resources.

When a zone resource without a deletion timestamp is reconciled, the controller adds a finalizer to it and adds the zone's VM Service folder to the list of entities monitored by the watcher.

When a zone resource with a non-zero deletion timestamp is reconciled, the controller removes the zone's VM Service folder from the list of entities monitored by the watcher and removes the finalizer.
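The two reconcile paths above can be sketched as follows. The `Zone` and `Watcher` types and the finalizer name are hypothetical simplifications of `topologyv1.Zone` and the watcher's entity list, for illustration only.

```go
package main

import "fmt"

// Zone is a hypothetical, trimmed-down stand-in for topologyv1.Zone.
type Zone struct {
	Deleting   bool // true when a deletion timestamp is set
	Finalizers []string
	FolderMoID string // spec.managedVMs.folderMoID
}

// Watcher tracks the set of folder MoIDs currently being watched.
type Watcher struct{ folders map[string]bool }

func (w *Watcher) Add(moID string)    { w.folders[moID] = true }
func (w *Watcher) Remove(moID string) { delete(w.folders, moID) }

const finalizer = "hypothetical.vmoperator.vmware.com/zone"

// reconcileZone mirrors the described flow: add the finalizer and watch
// the zone's folder while the zone is live; unwatch the folder and
// remove the finalizer once the zone is being deleted.
func reconcileZone(z *Zone, w *Watcher) {
	if !z.Deleting {
		z.Finalizers = appendIfMissing(z.Finalizers, finalizer)
		w.Add(z.FolderMoID)
		return
	}
	w.Remove(z.FolderMoID)
	z.Finalizers = removeString(z.Finalizers, finalizer)
}

func appendIfMissing(s []string, v string) []string {
	for _, e := range s {
		if e == v {
			return s
		}
	}
	return append(s, v)
}

func removeString(s []string, v string) []string {
	out := s[:0]
	for _, e := range s {
		if e != v {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	w := &Watcher{folders: map[string]bool{}}
	z := &Zone{FolderMoID: "group-v123"}

	reconcileZone(z, w)
	fmt.Println(len(z.Finalizers), w.folders["group-v123"]) // 1 true

	z.Deleting = true
	reconcileZone(z, w)
	fmt.Println(len(z.Finalizers), w.folders["group-v123"]) // 0 false
}
```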
**Which issue(s) is/are addressed by this PR?** (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Fixes NA

**Are there any special notes for your reviewer:**

One open question is whether to keep watching `summary.storage.timestamp`, as this may result in too many reconciles. I will look into how often the storage summary is updated.

**Please add a release note if necessary:**