Closed blimmer closed 7 years ago
Hi @blimmer. Thanks for raising this! The lock file persisting is actually expected. The existence of the file does not indicate that a lock is held. We open it and then use fcntl(2) on the fd to take the lock. So the file must exist, but the fcntl(2) step must also succeed.
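To illustrate the open-then-fcntl(2) pattern described above, here is a minimal Python sketch. It is not geoipupdate's actual code; the function names and the lock file path are made up for the example, and it assumes a POSIX system where Python's `fcntl.lockf` wraps fcntl(2) locking.

```python
import fcntl
import os
import tempfile


def acquire_lock(path):
    """Open (creating if needed) the lock file, then take an exclusive lock.

    Returns the open file descriptor, or None if another process holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release_lock(fd):
    """Release the lock. The file itself is deliberately left in place."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


# Hypothetical lock file location, used only for this demonstration.
lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")

fd = acquire_lock(lock_path)
release_lock(fd)

# The file is still on disk even though nobody holds the lock any more.
file_still_exists = os.path.exists(lock_path)
```

After `release_lock` returns, the file remains but a later `acquire_lock` on the same path succeeds, which is the sense in which the file's presence says nothing about whether a lock is held.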
Unfortunately, removing it would allow a race where two instances could run at once. For example, if we have an instance waiting on the lock, then we delete the file and release the lock, the waiting instance would think it holds the lock (which it does, on a deleted file). Another instance could come along and create the file again and acquire a lock on the new one, and run concurrently. There's a similar explanation here.
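The core of that race is that a path can end up naming a different file from the one an instance already has open. The following sketch shows only that part, in a single process and without taking any locks; the path is hypothetical and the "instances" are just two file descriptors.

```python
import os
import tempfile

# Hypothetical lock file location, used only for this demonstration.
lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")

# Instance A opens the lock file (and would then lock this descriptor).
fd_a = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)

# The file is deleted while A still has it open.
os.unlink(lock_path)

# Instance B arrives, finds no file, and creates a fresh one at the same path.
fd_b = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)

# Same path, different underlying files: a lock on one does not
# conflict with a lock on the other.
same_file = os.path.samestat(os.fstat(fd_a), os.fstat(fd_b))
print(same_file)  # False

os.close(fd_a)
os.close(fd_b)
```

Because the two descriptors refer to different files, each instance could hold an exclusive lock on "the lock file" at the same time.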
I agree it is not obvious and not ideal. I believe there are ways to make deleting it possible, such as described here if we really wanted to, but I'm not sure it's necessary given the additional overhead. I'd be inclined to document it rather than anything else.
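For reference, one commonly used way to make deletion safe is to re-check, after acquiring the lock, that the locked descriptor still refers to the file currently at the path, and to retry if it does not. This is a sketch of that general technique, not necessarily the approach in the linked description and not something geoipupdate does; the function names and path are hypothetical.

```python
import fcntl
import os
import tempfile


def acquire_deletable_lock(path):
    """Block until we hold a lock on the file that is currently at `path`."""
    while True:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        fcntl.lockf(fd, fcntl.LOCK_EX)  # waits for the current holder, if any
        try:
            # Is the file we locked still the one the path points to?
            if os.path.samestat(os.fstat(fd), os.stat(path)):
                return fd
        except FileNotFoundError:
            pass  # the file was deleted while we waited
        # We locked a stale (deleted or replaced) file: drop it and retry.
        os.close(fd)


def release_and_delete(fd, path):
    """Delete the file while still holding the lock, then release it."""
    os.unlink(path)
    os.close(fd)  # closing the descriptor releases this process's fcntl lock


# Hypothetical lock file location, used only for this demonstration.
lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")

fd = acquire_deletable_lock(lock_path)
release_and_delete(fd, lock_path)
file_removed = not os.path.exists(lock_path)
```

The extra `stat` call and the retry loop on every acquisition are the kind of additional overhead mentioned above.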
@horgh thank you for the quick response and the helpful links. I agree with you that this confusion could be solved via documentation - I looked for some before opening this issue. I opened #80 with a proposed change to the docs.
I noticed this in production, and it seems like this behavior doesn't match the intended behavior of releasing the lockfile after geoipupdate finishes. Based on the PR that introduced this behavior (https://github.com/maxmind/geoipupdate/pull/55), it seems that the lockfile should be automatically cleaned up upon a successful exit. That is not the behavior I'm seeing.