Race condition between sync and user commands

NCAR / ctt_server

Apache License 2.0

Race condition between sync and user commands #16

Closed Will-Shanks closed 8 months ago

Will-Shanks commented 9 months ago

When a user closes the last issue on a node, releasing it back to production, while a pbs_sync is running, the sync can get confused and open a new issue. This happens when the sync fetches the pbs state and a user then closes an issue, changing the node's state in ctt before the sync compares the pbs state it received with ctt's state.

Will-Shanks commented 9 months ago

The easy solution is to serialize anything that modifies pbs or ctt state. If this proves to cause too much delay on user requests while a sync is in progress, a sequence lock could be used to shorten the critical section in the sync code and have it retry if a user request happened after it got the pbs state but before it locked and compared the ctt state.
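To make the retry idea concrete, here is a minimal sketch of a sequence-counter retry in the spirit of a sequence lock. It is not the ctt_server code: `Ctt`, `open_issue`, `close_issue`, `sync`, and the `fetch_pbs` closure are all hypothetical names, and node state is reduced to a boolean. It also uses a single version counter bumped under the lock rather than a classic two-phase seqlock.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for ctt state: node name -> "has an open issue".
pub struct Ctt {
    /// Bumped (while holding the lock) by every user mutation.
    seq: AtomicU64,
    nodes: Mutex<HashMap<String, bool>>,
}

impl Ctt {
    pub fn new() -> Self {
        Ctt { seq: AtomicU64::new(0), nodes: Mutex::new(HashMap::new()) }
    }

    /// User request: open an issue on a node.
    pub fn open_issue(&self, node: &str) {
        let mut nodes = self.nodes.lock().unwrap();
        nodes.insert(node.to_string(), true);
        self.seq.fetch_add(1, Ordering::Release);
    }

    /// User request: close the last issue on a node. In a real server the
    /// node would also be released in PBS inside this critical section.
    pub fn close_issue(&self, node: &str) {
        let mut nodes = self.nodes.lock().unwrap();
        nodes.insert(node.to_string(), false);
        self.seq.fetch_add(1, Ordering::Release);
    }

    /// Sync: `fetch_pbs` returns node name -> "offline in PBS". The slow
    /// fetch runs outside the lock; only the comparison is serialized.
    /// Returns the nodes for which a new issue was opened.
    pub fn sync<F>(&self, mut fetch_pbs: F) -> Vec<String>
    where
        F: FnMut() -> HashMap<String, bool>,
    {
        loop {
            let before = self.seq.load(Ordering::Acquire);
            let pbs = fetch_pbs();
            let mut nodes = self.nodes.lock().unwrap();
            if self.seq.load(Ordering::Acquire) != before {
                // A user request ran after we sampled the counter, so the
                // PBS snapshot may be stale: drop the lock and retry.
                continue;
            }
            let mut opened = Vec::new();
            for (node, offline) in &pbs {
                let has_issue = nodes.get(node).copied().unwrap_or(false);
                if *offline && !has_issue {
                    nodes.insert(node.clone(), true);
                    opened.push(node.clone());
                }
            }
            return opened;
        }
    }
}
```

The trade-off is that user requests only wait for the short comparison step, while a sync that keeps getting interrupted may have to fetch the pbs state more than once.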

Will-Shanks commented 9 months ago

Put in a quick fix that serializes anything that might modify the ctt db or pbs by wrapping pbs_sync and the various mutation APIs with a lock: https://github.com/Will-Shanks/ctt_server/blob/f58892e2fbd6f8ce949283a6ab97ec5927c8588e/src/pbs_sync.rs#L22
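For readers who don't want to follow the link, a rough sketch of the coarse-lock approach might look like the following. This is an illustration under assumptions, not the code at the linked commit: `Server`, `State`, `NodeState`, and `close_last_issue` are hypothetical names, and both pbs and the ctt db are modeled as in-memory maps behind one `Mutex`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NodeState {
    Online,
    Offline,
}

/// Hypothetical in-memory model of PBS and the ctt db.
#[derive(Default)]
pub struct State {
    pub pbs: HashMap<String, NodeState>,
    pub ctt: HashMap<String, NodeState>,
    pub opened_issues: Vec<String>,
}

/// One lock guards everything that can modify pbs or ctt state.
pub struct Server {
    state: Mutex<State>,
}

impl Server {
    pub fn new(state: State) -> Self {
        Server { state: Mutex::new(state) }
    }

    /// User mutation: closing the last issue releases the node in both
    /// ctt and PBS, all while holding the lock.
    pub fn close_last_issue(&self, node: &str) {
        let mut s = self.state.lock().unwrap();
        s.ctt.insert(node.to_string(), NodeState::Online);
        s.pbs.insert(node.to_string(), NodeState::Online);
    }

    /// Sync: read the PBS state and compare it with ctt's state under the
    /// same lock, so no user request can land between the two steps.
    pub fn pbs_sync(&self) {
        let mut s = self.state.lock().unwrap();
        let mut mismatched = Vec::new();
        for (node, pbs_state) in &s.pbs {
            let ctt_offline = s.ctt.get(node) == Some(&NodeState::Offline);
            if *pbs_state == NodeState::Offline && !ctt_offline {
                mismatched.push(node.clone());
            }
        }
        for node in mismatched {
            // PBS says offline but ctt has no open issue: open one.
            s.ctt.insert(node.clone(), NodeState::Offline);
            s.opened_issues.push(node);
        }
    }

    pub fn opened_issues(&self) -> Vec<String> {
        self.state.lock().unwrap().opened_issues.clone()
    }
}
```

With this shape, a sync and a close can run in either order without a spurious issue being opened, at the cost of user requests blocking for the full duration of a sync.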

It is probably worth putting a little effort into shortening the critical sections, since the entirety of each API call doesn't need to be serialized.