SVL-PSU / crete-dev

CRETE under development

[vm-node] vm-node crash caused by race condition #6

Closed likebreath closed 7 years ago

likebreath commented 7 years ago

vm-node has two long-running kernel threads: one runs the FSM and one communicates with dispatch. The shared resource at issue is the "trace" folder, which is not protected by a lock. Most of the time, the two threads' accesses to this folder are serialized as a side effect of the predefined workflow.

However, when the FSM-thread needs to reset the FSM (e.g. as a result of a VM failure), it removes the shared resource (the "trace" folder) without coordinating with the communication-thread. If the communication-thread is accessing the folder at the same time, for example while transmitting the "trace" to dispatch, an exception is thrown and vm-node crashes.
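
One possible way to close the race is to put every access to the "trace" folder behind a single lock, so that a reset cannot remove the folder while a transmission is in progress. The sketch below illustrates the idea in Python rather than in vm-node's own code; `GuardedTraceDir`, `reset`, `transmit`, and `demo` are hypothetical names chosen for illustration and are not taken from the CRETE sources.

```python
import shutil
import tempfile
import threading
from pathlib import Path


class GuardedTraceDir:
    """Hypothetical wrapper: every access to the trace folder takes the same lock."""

    def __init__(self, path):
        self.path = Path(path)
        self._lock = threading.Lock()

    def write(self, name, data):
        # Create the folder if needed and write one trace file.
        with self._lock:
            self.path.mkdir(parents=True, exist_ok=True)
            (self.path / name).write_bytes(data)

    def reset(self):
        # FSM-thread side: remove the folder only while holding the lock.
        with self._lock:
            shutil.rmtree(self.path, ignore_errors=True)

    def transmit(self, send):
        # Communication-thread side: read and send while holding the lock,
        # so the folder cannot disappear halfway through.
        with self._lock:
            if not self.path.is_dir():
                return False  # nothing to send, e.g. right after a reset
            for f in sorted(self.path.iterdir()):
                send(f.name, f.read_bytes())
            return True


def demo(rounds=200):
    """Run a writer/resetter thread against a transmitter thread."""
    root = Path(tempfile.mkdtemp())
    trace = GuardedTraceDir(root / "trace")
    errors, sent = [], []

    def fsm():
        try:
            for _ in range(rounds):
                trace.write("t.bin", b"x" * 64)
                trace.reset()
        except Exception as exc:
            errors.append(exc)

    def comm():
        try:
            for _ in range(rounds):
                trace.transmit(lambda name, data: sent.append(len(data)))
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=fsm), threading.Thread(target=comm)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    shutil.rmtree(root, ignore_errors=True)
    return errors, sent
```

With the lock in place, `transmit` either sees a complete folder or none at all. Holding the lock for the whole transmission is the simplest choice but delays a reset until the send finishes; whether that trade-off is acceptable for vm-node is an open question.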