containerd / nri

Node Resource Interface
Apache License 2.0
236 stars 62 forks source link

adpatation: plugin registration/sync race, unit test failures. #99

Open klihub opened 1 month ago

klihub commented 1 month ago

Adaptation unit tests fail, ever since c4893c7e31c35f1b056b5462cb135a9c15f8b8f4. The reason is that while that commit fixed a locking inversion bug, it now opened a time window for a plugin to lose events during registration after it has received the initial set of pods and containers but before it is has been fully registered for event processing.

Synchronizing the plugin and adding it to the slice of registered plugins both used to happen with Adaptation.Mutex held. Now that mutex is held only after synchronization is done while updating the slice of plugins. As a consequence to the adaptation test suite while Suite.WaitForPluginsToSync() used to return only once the plugin was fully registered (IOW, added to slice of plugins), now it returns before that. Therefore, it became possible for a pod/container event to come when a plugin is halfway through its registration: already synchronized with existing pods/containers, but not added yet to the slice of plugins. When that happens, the plugin loses any event that comes before it has been added to the slice of plugins.

The effect is not limited to unit tests, but it becomes apparent as unit tests banging full steam registering plugins and generating pod/container events now trigger this race condition reliably. Similar NRI pod/container event loss is also possible now in the runtimes.

klihub commented 1 month ago

Here is a proposed fix. It requires additional co-operation from the NRI integration code on the runtime side. The runtime should request blocking of plugin registration during event processing.

samuelkarp commented 1 month ago

after it has received the initial set of pods and containers but before it is has been fully registered for event processing.

It seems to me that the order should really be the opposite: a plugin should be fully registered for events (and start receiving immediately), and then synchronize to see any pods/containers that existed prior to subscription to the event stream. This would require some additional logic to handle deduplication (since some pods/containers may be received both event-based during the initial registration and during that sync), but would not require any locking or pausing of events or registration.