vm-node has two long-running kernel threads: one runs the FSM and one handles communication with dispatch. The shared resource at issue is the "trace" folder, which is not protected by a lock. Most of the time, the two threads' accesses to the folder are serialized as a side effect of the predefined workflow.
However, when the FSM-thread needs to reset the FSM (e.g. as a result of a VM failure), it removes the "trace" folder without coordinating with the communication-thread. If the communication-thread is accessing the folder at the same time, for example while transmitting the "trace" to dispatch, an exception is raised and vm-node crashes.
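One possible way to close the race is to route every access to the "trace" folder through a single lock. The following is only a minimal sketch in Python, not vm-node's actual code: the class GuardedTraceDir, its methods, and run_demo are hypothetical names introduced for illustration, and the real fix may look different (for example, the FSM-thread could signal the communication-thread instead of sharing a lock).

```python
import shutil
import tempfile
import threading
from pathlib import Path


class GuardedTraceDir:
    """Hypothetical wrapper that serializes all access to the trace folder."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._lock = threading.Lock()

    def write(self, name: str, data: bytes) -> None:
        # Create the folder if needed and write one trace file, under the lock.
        with self._lock:
            self.path.mkdir(parents=True, exist_ok=True)
            (self.path / name).write_bytes(data)

    def remove(self) -> None:
        # FSM-thread side: the removal waits until no reader holds the lock.
        with self._lock:
            shutil.rmtree(self.path, ignore_errors=True)

    def read_all(self) -> dict:
        # Communication-thread side: snapshot the files under the same lock,
        # so the folder cannot disappear between listing and reading.
        with self._lock:
            if not self.path.is_dir():
                return {}
            return {
                p.name: p.read_bytes()
                for p in sorted(self.path.iterdir())
                if p.is_file()
            }


def run_demo(rounds: int = 50) -> list:
    """Run both roles concurrently and return any exceptions they raised."""
    root = Path(tempfile.mkdtemp())
    trace = GuardedTraceDir(root / "trace")
    errors = []

    def fsm_role() -> None:
        # Assumed stand-in for the FSM-thread: write a trace, then reset.
        try:
            for i in range(rounds):
                trace.write(f"step{i}.log", b"trace data")
                trace.remove()
        except Exception as exc:
            errors.append(exc)

    def comm_role() -> None:
        # Assumed stand-in for the communication-thread: read for transmission.
        try:
            for _ in range(rounds):
                trace.read_all()
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=fsm_role), threading.Thread(target=comm_role)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    shutil.rmtree(root, ignore_errors=True)
    return errors
```

With the lock in place, a reset can only happen before or after a complete read, never in the middle of one. The cost is that a reset may be delayed while a long transmission holds the lock, so in practice the reader would likely snapshot the data under the lock and transmit it after releasing it, as read_all does here.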