bbangert / beaker

WSGI middleware for sessions and caching
https://beaker.readthedocs.org/
Other
517 stars 146 forks source link

Fix race condition in FileNamespaceManager.do_open #148

Closed lnewson closed 6 years ago

lnewson commented 6 years ago

Updated the FileNamespaceManager do_open function to handle race conditions when opening files and so that if follows the EAFP recommendations in the os.access Python docs (see https://docs.python.org/3/library/os.html#os.access).

One way that I've seen this happen (though the timing is hard to get right), is that if you have a multiple process environment and one process calls Cache.clear() which in turn calls namespace.remove() and therefore deletes the cache file. At the same time, the other process is attempting to open the file and passes the file_exists check but then before getting to open the file namespace.remove() gets called and the do_open call then ends up throwing the following:

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/cache/data/container_file/3/3f/3f1492a2cca7a86c007a6887af2e966b78bf40c4.cache'

So this race condition can be caused by other processes changing the permissions or deleting the cache file between the file_exists check and actually trying to open and read the file contents. This can also happen when working with remote file systems (ie NFS) as the service could cache the file attributes and therefore pass the os.access call, however then fail because some other host has deleted the cache file (ie by it calling Cache.clear()).

lnewson commented 6 years ago

Hmm, I cannot see why the tests.test_cache.test_legacy_cache test is failing for Python 3.4 only sorry... as far as I can see it shouldn't be related to this change given it's failing on a DBM assertion and isn't failing locally when testing with Python 3.4 using pyenv either.

If it is something with this change that I've just overlooked, let me know and I'll be happy to fix it.

amol- commented 6 years ago

The failure was surelty unrelated, as it involves time it's probably just a flaky test. This patch couldn't have changed any behaviour.