Track resource creation by PID

dmik commented 8 years ago

The initial implementation of pread/pwrite has a problem: the mutex that makes the operation atomic is created on first use of a file by either these functions or fcntl. But the process that created it may terminate before some other related process (its child or parent) calls pread/pwrite on the same file for its own needs. Since the process that created the mutex is already gone, the mutex gets destroyed and the second process won't be able to open it; it will only see the invalid mutex handle in the shared variable.

A solution here is to zero the shared variable when the mutex gets destroyed so that another process will notice that it's null and re-create it. However, other processes may have already opened the mutex (and therefore increased its internal usage counter), so it's not possible to tell which DosCloseMutexSem call actually destroys the mutex. Our own usage counter will not help either: with only a counter, you don't know whether to decrease it on process termination, because you don't know whether the given process ever opened the given mutex. To know that, we need to track the PIDs of all processes creating/opening the shared mutex and only zero the variable when all these processes are gone.
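The PID tracking idea can be sketched roughly as follows. This is only an illustration in portable C, not LIBCx code: `SharedMutexSlot`, `slot_track`, `slot_untrack` and `MAX_TRACKED_PIDS` are hypothetical names, the handle is a plain integer stand-in for the real mutex handle, and the locking of the shared structure itself is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED_PIDS 16 /* hypothetical limit, for illustration only */

/* Hypothetical shared slot: the mutex handle plus the PIDs that opened it. */
typedef struct {
    unsigned long handle;        /* stand-in for the mutex handle, 0 = none */
    int pids[MAX_TRACKED_PIDS];  /* processes that created/opened the mutex */
    size_t count;                /* number of valid entries in pids */
} SharedMutexSlot;

/* Record that pid created or opened the mutex. Tracking the same PID twice
 * is a no-op. Returns 0 on success, -1 if the list is full. */
int slot_track(SharedMutexSlot *s, int pid)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++)
        if (s->pids[i] == pid)
            return 0;
    if (s->count == MAX_TRACKED_PIDS)
        return -1;
    s->pids[s->count++] = pid;
    return 0;
}

/* Call on process termination. A process that never opened the mutex is
 * ignored, which is what a bare usage counter cannot do. Returns 1 if this
 * was the last tracked process and the shared handle was zeroed, else 0. */
int slot_untrack(SharedMutexSlot *s, int pid)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->pids[i] == pid) {
            s->pids[i] = s->pids[--s->count]; /* order is irrelevant */
            if (s->count == 0 && s->handle != 0) {
                s->handle = 0; /* the next user will see null and re-create */
                return 1;
            }
            return 0;
        }
    }
    return 0;
}
```

The point of the list over a counter is the early return for untracked PIDs: termination of an unrelated process leaves the handle untouched.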

dmik commented 8 years ago

One real-life case where this problem strikes is the tdbtorture tool from Samba: its parent process manipulates the TDB file after all children (that initiated the pwrite mutex creation) have ended, and it fails with an assertion.

In the above commit I temporarily solved this problem by simply re-creating the mutex when DosOpenMutexSem fails with ERROR_INVALID_HANDLE as well (so that tdbtorture now works fine). However, this is an improper solution because the closed mutex handle may be reused by another process, so we could end up using a wrong mutex. Another problem not accounted for by this fix is a race between two threads of the same process trying to re-create the mutex.

All these problems will be gone once we do proper tracking by PID, but this requires some more work, and some more functions (to manipulate lists of PIDs) need to be moved from fcntl.c to shared code (yes, maintaining kernel functionality in user land from scratch is a complex task). For the time being, the above fix should be enough to check whether making pread/pwrite atomic fixes the remaining Samba issues (http://trac.netlabs.org/samba/ticket/266) in the first place.

dmik commented 7 years ago

This problem was solved using another approach, see #43. Now the mutex is simply deleted along with the LIBCx global file description structure (SharedFileDesc) where it's stored, once this structure is no longer in use (i.e. when all LIBC file descriptors referring to it via LIBC calls overridden in LIBCx, including pread()/pwrite(), are closed with close()).
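A rough sketch of that lifetime rule, under stated assumptions: `FileDescSketch` and the `desc_*` functions are hypothetical stand-ins and do not reflect the actual SharedFileDesc layout or the code in #43. The mutex handle is a plain integer, and synchronization of the structure itself is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared file description. */
typedef struct {
    unsigned long mutex; /* stand-in for the mutex handle, 0 = not created */
    unsigned refs;       /* open descriptors referring to this description */
} FileDescSketch;

static unsigned long next_handle = 1; /* fake handle allocator */

/* A descriptor starts referring to the description. */
void desc_open(FileDescSketch *d)
{
    assert(d != NULL);
    d->refs++;
}

/* Create the mutex lazily on first use and return its handle. */
unsigned long desc_mutex(FileDescSketch *d)
{
    assert(d != NULL);
    if (d->mutex == 0)
        d->mutex = next_handle++;
    return d->mutex;
}

/* A descriptor is closed. When the last one goes away, the mutex is
 * dropped together with the description. Returns 1 in that case, else 0. */
int desc_close(FileDescSketch *d)
{
    assert(d != NULL && d->refs > 0);
    if (--d->refs == 0) {
        d->mutex = 0;
        return 1;
    }
    return 0;
}
```

Tying the mutex to the description means its lifetime follows the descriptors that use it, so a process exiting no longer leaves a stale handle behind for the others.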

bitwiseworks / libcx

Track resource creation by PID #7