ocurity / dracon

Security scanning & static analysis tool - forked and rewritten from @thought-machine/dracon
https://ocurity.com
Apache License 2.0

Fix migrations command occasional exit without clear success signals #153

Open ptzianos opened 2 months ago

ptzianos commented 2 months ago

There is a subtle bug in the migrations apply command. The code that watches pods can end up streaming the logs of a failed pod while a new pod starts and finishes its job successfully. The pod log watcher picks up on the successful pod and returns with no error, but it never shows the successful pod's logs on stdout.

We need to fix this bug, and while we are at it we should separate the code into a standalone utility, since it will most likely come in handy in other places, and figure out a way to test it properly. It's not a trivial piece of code to test since it involves a bunch of moving parts, but it's going to be worth it.

ptzianos commented 2 months ago

While investigating this, a couple of other bugs have become apparent. When the process receives a SIGTERM, a signal is sent on the signal channel, which unblocks the signal handler, which cancels the context. The leadership lost event handler is then invoked, which closes the signal channel. Throughout this process there is no clear way of telling which event happened first: was it the leader election lost handler or the signal handler? The reverse order is also possible: the leader election lost handler can be invoked first and close the signal channel, which unblocks the signal handler, which cancels the context.

We should have a clear way of knowing which event was triggered first. We should also have a way of telling whether the leader election was lost because of a timeout or because the normal execution loop finished correctly and exited, which also causes the lease to be abandoned and, in turn, the leader election hook to be invoked. (event based concurrent systems are so fun! /s)