maxmind / geoipupdate-legacy

GeoIP update client code
GNU General Public License v2.0
258 stars 63 forks source link

.geoipupdate.lock not released after successful exit #79

Closed blimmer closed 7 years ago

blimmer commented 7 years ago

I noticed this in production, and it seem like this behavior doesn't match the intended behavior of releasing the lockfile after geoipupdate finishes.

> docker run -i -t ubuntu /bin/bash
root@dd7e263d706c:/# apt-get update && apt-get install aptitude software-properties-common python-software-properties && add-apt-repository ppa:maxmind/ppa && aptitude update && aptitude install geoipupdate
root@dd7e263d706c:/# geoipupdate
root@dd7e263d706c:/# echo $?
0
root@dd7e263d706c:/# ls -al /usr/share/GeoIP/
total 53344
drwxr-xr-x 2 root root     4096 Nov  1 19:53 .
drwxr-xr-x 1 root root     4096 Nov  1 19:53 ..
-rw------- 1 root root        0 Nov  1 19:53 .geoipupdate.lock
-rw-r--r-- 1 root root 51469823 Nov  1 19:53 GeoLite2-City.mmdb
-rw-r--r-- 1 root root  3143664 Nov  1 19:53 GeoLite2-Country.mmdb

Based on the PR that introduced this behavior (https://github.com/maxmind/geoipupdate/pull/55), it seems that the lockfile should be automatically cleaned up upon a successful exit. That is not the behavior I'm seeing.

horgh commented 7 years ago

Hi @blimmer. Thanks for raising this! The lock file persisting is actually expected. The existence of the file is not indicative of a lock being held. We open it and use fcntl(2) on the fd to lock. So it must exist, but the second step must also succeed.

Unfortunately, removing it would allow a race where two instances could run at once. For example, if we have an instance waiting on the lock, then we delete the file and release the lock, the waiting instance would think it holds the lock (which it does, on a deleted file). Another instance could come along and create the file again and acquire a lock on the new one, and run concurrently. There's a similar explanation here.

I agree it is not obvious and not ideal. I believe there are ways to make deleting it possible, such as described here if we really wanted to, but I'm not sure it's necessary given the additional overhead. I'd be inclined to document it rather than anything else.

blimmer commented 7 years ago

@horgh thank you for the quick response and the helpful links. I agree with you that this confusion could be solved via documentation - I looked for some before opening this issue. I opened #80 with a proposed change to the docs.