loris-imageserver / loris

Loris IIIF Image Server
Other
209 stars 87 forks source link

Add a `safe_rename` function that's atomic and filesystem-aware #380

Closed alexwlchan closed 6 years ago

alexwlchan commented 7 years ago

This uses the proposed approach in #379 – first try to copy the file using os.rename(), but if we detect we’re trying to copy across a filesystem boundary and the OS gets upset, we fall back to using shutil.move() in a way that’s atomic.

Resolves #379.

/cc @giorgiosironi @bcail

alexwlchan commented 7 years ago

Why are computers hard?

Well, it takes 20 lines of code just to copy a file…

alexwlchan commented 6 years ago

rough scribblings before bed (needs comments!):

# https://stackoverflow.com/q/11614815/1558022

import errno
import glob
import os
import shutil
import uuid

def remove_f(path):
    try:
        os.remove(path)
    except OSError as err:
        if err.errno == errno.ENOENT:
            pass
        else:
            raise

if os.path.exists(dst):
    return

mole_id = uuid.uuid4()
tmp_dst = shutil.copyfile(src, '%s.%s.tmp' % (dst, mole_id)))
mole_dst = '%s.%s.mole.tmp' % (dst, mole_id)
os.rename(tmp_dst, mole_dst)

matching = sorted(glob.glob('dst.*.mole.tmp'))

if mole_dst != matching[0]:
    remove_f(mole_dst)

mole_dst = matching[0]

if os.path.exists(dst):
    remove_f(mole_dst)
    return

try:
    os.rename(mole_dst, dst)
except OSError as err:
    if err.errno == errno.ENOENT:
        pass
    else:
        raise

os.unlink(src)
jrhoads commented 6 years ago

I may be coming to the conversation late. Is there any risk of a file starting to be read in between the two atomic writes? Or is this vanishingly small?

-Joseph

On Fri, Nov 3, 2017 at 12:56 PM, bcail notifications@github.com wrote:

@bcail commented on this pull request.

In loris/utils.py https://github.com/loris-imageserver/loris/pull/380#discussion_r148838394 :

    • Moves must work across filesystems. Often temp directories and the
  • cache directories live on different filesystems. os.rename() can
  • throw errors if run across filesystems.
  • So we try os.rename(), but if we detect a cross-filesystem copy, we
  • switch to shutil.move() with some wrappers to make it atomic.
  • """
  • logger.debug('Renaming %r to %r', src, dst)
  • try:
  • os.rename(src, dst)
  • except OSError as err:
  • logger.debug('Calling os.rename(%r, %r) failed with %r', src, dst, err)
  • if err.errno == errno.EXDEV:
  • This is based on the safe, atomic file copying algorithm

  • described in https://stackoverflow.com/a/28090883/1558022.

Isn't some of this unnecessary, because we don't care if the file gets overwritten? If one process creates dst atomically, and then another process overwrites it atomically, is there any harm?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/loris-imageserver/loris/pull/380#pullrequestreview-74146809, or mute the thread https://github.com/notifications/unsubscribe-auth/ABcoaJY-PIrFpQHx-6PyrxBYHsY_vMGBks5sy0XJgaJpZM4QBPjj .

bcail commented 6 years ago

@jrhoads I don't know what the odds are of that happening. If that is an issue (a file being overwritten), I think we'll have the same issue if the directories are on the same filesystem and we can just use the atomic rename (in the first try block of safe_rename). And the issue is already happening in any Loris installs. Does anyone know if overwriting files has been an issue?

bcail commented 6 years ago

I ran some tests, and it looks like there's no issue with doing an os.rename of one file onto an identical one while it's being read.

import asyncio
import os
import shutil

async def rename_file():
    os.rename('test2.txt', 'test.txt')
    print('  *renamed test2.txt to test.txt*')

async def read_file():
    print('opening file')
    with open('test.txt', 'rt') as f:
        print('reading first 10 bytes')
        first_data = f.read(10)
        await rename_file()
        print('reading rest of file')
        rest_of_data = f.read()
    print('finished reading file - closed it')
    print(f'first_data: {first_data}')
    print(f'rest_of_data: {rest_of_data}')

if __name__ == '__main__':
    assert not os.path.exists('test2.txt')
    shutil.copyfile('test.txt', 'test2.txt')
    print('copied test.txt to test2.txt')
    event_loop = asyncio.get_event_loop()
    try:
        print('starting coroutine')
        coro = read_file()
        print('entering event loop')
        event_loop.run_until_complete(coro)
    finally:
        print('closing event loop')
        event_loop.close()
    assert not os.path.exists('test2.txt')
alexwlchan commented 6 years ago

Isn't some of this unnecessary, because we don't care if the file gets overwritten?

Yeah, that version was a bit OTT. I’ve cut it back and pushed a new version.