lmoureaux opened 5 years ago
* Use a `pthread` mutex in a shared memory region. This would be hard to get right for our use case.
Except the race condition at creation, robust POSIX mutexes in shared memory should do the job, no?
* Create a temporary file (eg `/tmp/memhub.lock`) and use the [advisory locking API](https://linux.die.net/man/2/fcntl). Advisory file locks are released when a process is killed. I have a working prototype.
Is there a possibility to implement a timeout for the lock acquisition? I don't like calls that can block forever... I think implementing a timeout for `fcntl` is possible with signals, but, as always with signals, it is not the safest way.
A dying process should release all resources it holds.
While I usually agree with that principle, is it the best thing to do in our case? I mean, if a process terminates unexpectedly, what guarantee do we have about the hardware state? Isn't it better to forbid any access to the system?
Of course, there are many ways to deal with the issue at higher levels in the software stack, but in these cases there must be good cooperation between the different processes accessing the hardware.
Except the race condition at creation, robust POSIX mutexes in shared memory should do the job, no?
Setting them up seems a bit harder.
Is there a possibility to implement a timeout for the lock acquisition?
Active polling with `F_SETLK` seems to be the only signal-free way.
Has this been shown to be an issue anywhere yet? Or is this as yet an academic exercise?
Is there a possibility to implement a timeout for the lock acquisition? I don't like calls that can block forever...
Indeed, the initial incantation we implemented to try to solve the bus collisions involved a lock file and a timeout. @evka85 then implemented the version with semaphores
Yeah, I think the lock file approach had some issues, but I can't remember what they were. Indeed, if you kill it with -9 (SIGKILL), it will leave any active semaphores hanging, though normally we should just use SIGTERM, which exits cleanly (this was tested quite extensively).
Has this been shown to be an issue anywhere yet? Or is this as yet an academic exercise?
I'm not aware of this issue happening in production, but it's a race condition involving a rare signal (SIGKILL), so it's rather unlikely.
normally we should just use SIGTERM
To be specific, I'm afraid of the kernel sending SIGKILL in out-of-memory conditions.
Ok, I see; yes, a kill from the kernel is of course possible, but that is just an indication of something else going terribly wrong, and I think we should instead address the root cause, which is running out of memory.
I know that the rw_reg library currently isn't using memory efficiently, and running multiple instances of it may result in an out-of-memory condition. We have discussed possible ways to change rw_reg to reduce the memory footprint: most of the memory is wasted by creating duplicate nodes with shifted addresses for the generated nodes in the XML file. There are a couple of ways to work around that:
* use shared memory containing the register addresses for all processes; this is already done e.g. in the RPC service by using LMDB, so the Python tools could also use that;
* optimize the way generated nodes are stored in memory by storing just the root node and the generate parameters; upon a request for any specific generated node, the address would be calculated dynamically.
We should probably open a new issue on that..
Ok, I see; yes, a kill from the kernel is of course possible, but that is just an indication of something else going terribly wrong, and I think we should instead address the root cause, which is running out of memory.
I fail to understand how out-of-memory conditions are different from SIGSEGV (segmentation violation), SIGILL (illegal instruction) or SIGABRT (abort), which all leave the semaphore in a consistent state. What's the rationale for treating these errors differently?
Sorry, I didn't quite understand the question. We are handling all the signals that we can, but in case the kernel out-of-memory killer kicks in, it just sends SIGKILL, which we can't catch. If this is really a problem in normal operation, we could try to work around it by monitoring the memory situation and sending out some other signal before the out-of-memory killer kicks in. There may be other ways of registering for some kind of notification from the kernel, I don't know.
I think I did the signal handling mostly to clean up when the user presses ctrl+c which results in SIGINT, but I added whatever other signals I could just in case.
Brief summary of issue
memhub uses semaphores to prevent concurrent access to the CTP7 memory. It tries hard to avoid leaving active semaphores behind, but all these efforts are moot if the process gets killed. This is a caveat of semaphores themselves.
Types of issue
Expected Behavior
A dying process should release all resources it holds.
Current Behavior
When a process gets killed (SIGKILL) in the middle of a memhub call, it leaves behind an active semaphore. The next process trying to use memhub gets stuck.
Steps to Reproduce (for bugs)
1. Block `memhub_read` right after the call to `memsvc_read`
2. Call `memsvc_read` (eg through an RPC call)
3. `kill -9` it
4. Call `memsvc_read` again
Possible Solution
Since this is caused by a caveat in the semaphores API, one has to find another way to synchronize multiple processes. There are basically two ways apart from semaphores:
* Use a `pthread` mutex in a shared memory region. This would be hard to get right for our use case.
* Create a temporary file (eg `/tmp/memhub.lock`) and use the advisory locking API. Advisory file locks are released when a process is killed. I have a working prototype.
Context
Want to avoid a CTP7 getting stuck and requiring manual intervention.
Your Environment