npm / lockfile

A very polite lock file utility, which endeavors to not litter, and to wait patiently for others.
ISC License
260 stars 33 forks source link

Sync stat and unlock in lock() #1

Closed kschzt closed 11 years ago

kschzt commented 11 years ago

This fixes an issue in npm where tons of packages were trying to install a particular dependency. It removes a race condition in lock() that would remove the underlying lock based on stale stat() information.

kschzt commented 11 years ago

credits to @sakari for diagnosing the underlying issue ;)

isaacs commented 11 years ago

So, there was indeed a bug here, but I think it could be fixed more simply than by using a lock object and LockFile class.

The first problem is that the wait param was simply not being handled right. Like at all. It was calculating the newWait value incorrectly on retries, and for some reason, was only retrying once and giving up (instead of continuing to retry until the timeout had been reached, as one would expect it to). I don't know what I was smoking when I wrote that code, but it is just idiotic. I humbly apologize.

This exacerbated another minor problem with lock contention: the unlock function was calling fs.unlink before calling fs.close and deleting the fd out of the locks store. As a result, the watcher would get the 'rename' event, in some cases before the fd had been closed, causing it to open it again and stash the fd in the locks store. Then, the close would finish, and it would delete the (new) fd out of the store, and leak fds in cases of high contention. (This wasn't causing your problem, but it was definitely an issue.)

Can you try with the code on master, and see if it resolves your problem? If it turns out that we need to have private lock objects, then that might be fine, but I'd like to fix the underlying logical problems first, before adding new features. (Especially since your code will suffer from the same root problems, albeit less so since private lock objects makes it harder to bump into.)

kschzt commented 11 years ago

So I moved the LockFile object to npm side, rebased from master, and found the above race condition when lock contention is high. Someone could remove the underlying lock in lock() based on stale stat() information. Updated the PR.

isaacs commented 11 years ago

Doing synchronous fs io in an allegedly async function is not allowed.

Please provide a failing test, and let's go from there.