Closed RayPlante closed 4 years ago
The bug is difficult to replicate reliably which makes testing the change tricky, but I was able to accomplish this on my own platform under oar-docker for testing purposes. I am bypassing normal review to allow for testing with MIDAS on testdata.
Replicated bug on datapubtest; could not replicate after applying this fix.
In the python-based publishing code, reading and writing of NERDm metadata files are supposed to be protected in a multi-threaded/multi-processing application. In particular, some of the processing triggered by mdserver/preserver web service calls that results in updating metadata files is done asynchronously--i.e. in a separate thread--so that web service calls can return quickly. Despite the file locking that was in place, we would still on occasion see file write collisions: the same metadata would get written twice sequentially into a file. This PR fixes that bug.
The file locking that was in place was the use of python's `lockf()` function. It turns out that this function only provides protection across multiple processes. This is important for protecting the metadata and preservation services (which run in separate processes) from interfering with each other; however, it does not protect against colliding accesses across threads in the same python process.

The `pdr.utils` module was updated to add a new `LockedFile` class that provides both multiprocess locking (provided by `lockf()`) and multithread locking (provided by the lock classes available in the python `threading` module). It supports both shared locks--to allow unlimited simultaneous reads--and exclusive locks, ensuring only one process/thread can access a file during a write. This new locking mechanism was incorporated into `read_json()` and `write_json()`.
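For illustration, here is a minimal sketch of how such a two-level lock might look. This is not the actual `pdr.utils.LockedFile` implementation: it uses a single process-wide `threading.RLock` (so it serializes all thread access, including reads of different files, which the real class avoids with proper shared locks), and the `read_json()`/`write_json()` wrappers shown are simplified stand-ins.

```python
import fcntl
import json
import threading

# One lock shared by all threads in this process. A real implementation
# would support shared (read) vs. exclusive (write) thread locking too.
_thread_lock = threading.RLock()

class LockedFile:
    """Open a file with both thread-level and process-level locking."""

    def __init__(self, path, mode="r"):
        self.path = path
        self.mode = mode
        self._fd = None

    def __enter__(self):
        _thread_lock.acquire()              # block other threads in this process
        self._fd = open(self.path, self.mode)
        # lockf() blocks other *processes*: shared lock for reading,
        # exclusive lock when the file is opened for writing.
        lock = fcntl.LOCK_SH if self.mode.startswith("r") else fcntl.LOCK_EX
        fcntl.lockf(self._fd, lock)
        return self._fd

    def __exit__(self, *exc):
        fcntl.lockf(self._fd, fcntl.LOCK_UN)  # release the process-level lock
        self._fd.close()
        _thread_lock.release()

def write_json(data, path):
    with LockedFile(path, "w") as fd:
        json.dump(data, fd)

def read_json(path):
    with LockedFile(path) as fd:
        return json.load(fd)
```

The key point is that both locks are acquired on entry and released together on exit, so neither another thread nor another process can interleave a write mid-update.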