Closed: agrare closed this issue 4 years ago
Issues still to be worked out:
@agrare As discussed offline, we should treat any realtime watcher the way we treat events or the vmware watcher: there should probably be 2 threads, one that watches and puts the raw data on an internal in-memory queue, and a second that reads from that queue and writes to the database (probably to MiqQueue). An internal queue also lets you batch things up, so you could write to the MiqQueue every 5 seconds and send everything that was seen in that 5 second period. This would prevent the harsh one-by-one slamming of the MiqQueue.
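A minimal sketch of the two-thread idea described above, using only Ruby's standard `Queue` and `Thread`. `BatchingCollector`, `flush_interval`, and the `sink` block are hypothetical names for illustration; the sink stands in for whatever actually writes to MiqQueue, and none of this is the merged implementation.

```ruby
# Hypothetical sketch: a watcher pushes raw notices onto an in-memory queue,
# and a single writer thread drains the queue on a timer and hands the whole
# batch to a sink (standing in for one MiqQueue write per interval).
class BatchingCollector
  def initialize(flush_interval: 5, &sink)
    @flush_interval = flush_interval # seconds between batched writes
    @sink           = sink           # called with an Array of notices
    @queue          = Queue.new      # thread-safe in-memory queue
    @running        = false
  end

  # Called by the watcher thread for every raw notice it sees.
  def push(notice)
    @queue << notice
  end

  # Hand everything queued so far to the sink as one batch.
  # Returns the number of notices flushed.
  def flush
    batch = drain
    @sink.call(batch) unless batch.empty?
    batch.size
  end

  # Start the writer thread, which flushes once per interval.
  def start
    @running = true
    @writer = Thread.new do
      while @running
        sleep(@flush_interval)
        flush
      end
    end
  end

  # Stop the writer thread and deliver anything still queued.
  def stop
    @running = false
    @writer&.join
    flush
  end

  private

  # Safe without extra locking only because a single consumer pops.
  def drain
    batch = []
    batch << @queue.pop until @queue.empty?
    batch
  end
end
```

A watcher loop would call `collector.push(notice)` for each update, while `collector.start` runs the writer in the background, so the database sees one write per interval rather than one per notice.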
MVP for this has been merged.
Still needs to be completed:
ManagerRefresh::Target payload in BinaryBlob table /cc @Ladas
Problem
Currently only full refresh is supported for container providers (Kubernetes/OpenShift). In sufficiently large environments this refresh can take over 2 hours, which is long enough for pods/containers to be created and deleted while a refresh is running, so ManageIQ misses them completely.
Without a record of every pod that was created, policy actions cannot be run and metrics cannot be collected for chargeback.
Proposed Solution
Kubernetes supports a streaming update mechanism, /watch, which delivers changes to a registered client. There is an example in the kubeclient repo: https://github.com/abonas/kubeclient#receive-entity-updates
We propose adding a new worker (InventoryCollectorWorker) which registers for these WatchStreams, specifically for pods, and sends ManagerRefresh::Target targets with the payload to the RefreshWorker for parsing and saving. Since all updates are persisted in the queue and will be handled by the refresh worker, no pod will be missed.
In addition to maintaining a record of all pods which were created and deleted, we can collect metrics on recently disconnected pods, ensuring we have metrics for these short-lived containers.
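As a rough illustration of the flow above, the sketch below turns watch notices into refresh targets. `PodTarget` is a hypothetical stand-in for ManagerRefresh::Target (whose real constructor is not shown in this issue), the `:container_groups` association name is an assumption, and notices are modelled as plain hashes shaped like Kubernetes watch events rather than real kubeclient objects.

```ruby
# Hypothetical stand-in for ManagerRefresh::Target, for illustration only.
PodTarget = Struct.new(:association, :manager_ref, :event_type, :payload,
                       keyword_init: true)

# Build one target per watch notice. Each notice is assumed to look like
# {"type" => "ADDED" | "MODIFIED" | "DELETED", "object" => {pod hash}}.
def targets_from_notices(notices)
  notices.map do |notice|
    pod = notice.fetch("object")
    PodTarget.new(
      association: :container_groups, # assumed association name for pods
      manager_ref: {ems_ref: pod.fetch("metadata").fetch("uid")},
      event_type:  notice.fetch("type"),
      payload:     pod                # carried along so the refresh can parse it
    )
  end
end

# A pod that is created and deleted between two full refreshes still
# produces two targets, so it is never silently missed.
notices = [
  {"type" => "ADDED",   "object" => {"metadata" => {"uid" => "abc-123", "name" => "short-lived"}}},
  {"type" => "DELETED", "object" => {"metadata" => {"uid" => "abc-123", "name" => "short-lived"}}}
]
targets = targets_from_notices(notices)
```

In the proposed design these targets would be queued for the RefreshWorker rather than held in a local array.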
PRs
[x] ManageIQ/manageiq#16198 - This adds the base worker class for an InventoryCollectorWorker
[x] ManageIQ/manageiq-providers-kubernetes#129 - This contains the worker mixin which actually subscribes to the watches and sends the targets
[x] ManageIQ/manageiq-providers-openshift#52 - Just adds the Openshift worker based on the Kubernetes mixin
[x] ManageIQ/manageiq-providers-kubernetes#135 - Adds support for targeted pod refresh
[x] ManageIQ/manageiq-providers-openshift#54 - Openshift equivalent
[x] https://github.com/ManageIQ/manageiq/pull/16311 - Add worker classes for kubernetes and openshift
cc @Fryguy @Ladas @kbrock @simon3z
Moved from: https://github.com/ManageIQ/manageiq/issues/16240