HenrikBengtsson / R.utils

🔧 R package: R.utils (this is *not* the utils package that comes with R itself)
https://henrikbengtsson.github.io/R.utils/

renameFile() should fall back to try copy'n'delete if file.rename() fails #42

Open HenrikBengtsson opened 8 years ago

HenrikBengtsson commented 8 years ago

Add renameFile(..., methods=c("file.rename", "copy-delete")) so that renameFile() first tries base::file.rename() and, if that does not succeed, falls back to:

  1. Try to copy the file atomically, cf. copyFile().
  2. Validate that the copied file is identical to the original file, cf. digest::digest().
  3. Try to remove the original file.
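
A minimal sketch of the fallback described above, using base R only. The function name rename_with_fallback() is hypothetical and not part of R.utils. It uses file.copy() instead of copyFile(), so the copy step is not atomic here, and tools::md5sum() stands in for digest::digest() to avoid the extra dependency. The methods argument mirrors the proposed signature.

```r
# Hypothetical helper illustrating the proposal; NOT part of R.utils.
rename_with_fallback <- function(from, to,
                                 methods = c("file.rename", "copy-delete")) {
  methods <- match.arg(methods, several.ok = TRUE)
  stopifnot(file.exists(from), !file.exists(to))

  # First attempt: plain rename. file.rename() returns FALSE (with a
  # warning) on failure, e.g. when 'from' and 'to' are on different devices.
  if ("file.rename" %in% methods) {
    if (suppressWarnings(file.rename(from, to))) return(invisible(TRUE))
  }
  if (!"copy-delete" %in% methods) {
    stop("Failed to rename file: ", from, " -> ", to)
  }

  # Fallback step 1: copy the file (assumption: plain file.copy(), not atomic).
  if (!file.copy(from, to, overwrite = FALSE)) {
    stop("Failed to copy file: ", from, " -> ", to)
  }

  # Fallback step 2: validate the copy by comparing MD5 checksums.
  sums <- unname(tools::md5sum(c(from, to)))
  if (anyNA(sums) || sums[1] != sums[2]) {
    file.remove(to)
    stop("Copied file is not identical to the original: ", to)
  }

  # Fallback step 3: remove the original file.
  if (!file.remove(from)) {
    stop("File was copied, but the original could not be removed: ", from)
  }
  invisible(TRUE)
}
```

Calling rename_with_fallback(src, dst, methods = "copy-delete") forces the fallback path, which is handy for trying it out on a single file system where file.rename() would otherwise succeed.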

Background

> dfR <- renameTo(dfC, filenameR, verbose=TRUE)
Renaming GenericDataFile pathname...
 Source: ../../../../../../../tmp/henrik/RtmpjURWIt/1.2(a).txt
 Destination: ./1.2(a).txt.foo
 Renaming file...
[2016-01-04 08:08:31] Exception: Failed to rename file: ../../../../../../../tmp/henrik/RtmpjURWIt/1.2(a).txt -> ./1.2(a).txt.foo

  at #04. renameFile.default(srcPathname, pathname, ...)
          - renameFile.default() is in environment 'R.utils'

  at #03. renameFile(srcPathname, pathname, ...)
          - renameFile() is in environment 'R.utils'

  at #02. renameTo.GenericDataFile(dfC, filenameR, verbose = TRUE)
          - renameTo.GenericDataFile() is in environment 'R.filesets'

  at #01. renameTo(dfC, filenameR, verbose = TRUE)
          - renameTo() is in environment 'R.filesets'

Error: Failed to rename file: ../../../../../../../tmp/henrik/RtmpjURWIt/1.2(a).txt -> ./1.2(a).txt.foo
In addition: Warning message:
In file.rename(pathname, newPathname) :
  cannot rename file '../../../../../../../tmp/henrik/RtmpjURWIt/1.2(a).txt' to './1.2(a).txt.foo', reason 'Invalid cross-device link'
[...]

Apparently Unix mv falls back to copy'n'delete if rename fails, cf. http://stackoverflow.com/questions/24209886/invalid-cross-device-link-error-with-boost-filesystem

HenrikBengtsson commented 8 years ago

One question/concern is that file / directory renaming is sometimes used as an atomic operation. This will no longer be true with a copy'n'delete approach. How closely can we emulate an atomic operation using copy'n'delete?

HenrikBengtsson commented 8 years ago

Atomic copy'n'delete algorithm for files (not directories):

  1. Copy to a temporary file: copyFile(pathname, newPathname.tmp).
  2. Rename the source file: pathname -> pathname.tmp.
  3. Rename the temporary destination file: newPathname.tmp -> newPathname.
  4. Remove the source file pathname.tmp.

If Step 2 fails, then remove newPathname.tmp. If Step 3 fails, then rename pathname.tmp -> pathname and then remove newPathname.tmp. If Step 4 fails, (try again? and) then give an informative error message.

A similar approach could work for directories too.
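
For files, the four steps and their rollback rules could be sketched roughly as follows. The name atomic_copy_delete() is hypothetical, file.copy() stands in for copyFile(), and the ".tmp" suffixes follow the naming used in the steps above. Step 4 is not retried here; it only reports what was left behind.

```r
# Hypothetical helper illustrating the algorithm; NOT part of R.utils.
atomic_copy_delete <- function(pathname, newPathname) {
  srcTmp <- paste0(pathname, ".tmp")
  dstTmp <- paste0(newPathname, ".tmp")
  stopifnot(file.exists(pathname), !file.exists(newPathname),
            !file.exists(srcTmp), !file.exists(dstTmp))

  # Step 1: copy the source to a temporary destination file.
  if (!file.copy(pathname, dstTmp, overwrite = FALSE)) {
    if (file.exists(dstTmp)) file.remove(dstTmp)
    stop("Step 1 failed: could not copy ", pathname, " -> ", dstTmp)
  }

  # Step 2: rename the source file; on failure, remove the temporary copy.
  if (!suppressWarnings(file.rename(pathname, srcTmp))) {
    file.remove(dstTmp)
    stop("Step 2 failed: could not rename ", pathname, " -> ", srcTmp)
  }

  # Step 3: rename the temporary copy to its final name; on failure,
  # restore the source file and remove the temporary copy.
  if (!suppressWarnings(file.rename(dstTmp, newPathname))) {
    file.rename(srcTmp, pathname)
    file.remove(dstTmp)
    stop("Step 3 failed: could not rename ", dstTmp, " -> ", newPathname)
  }

  # Step 4: remove the renamed source file; on failure, report it.
  if (!suppressWarnings(file.remove(srcTmp))) {
    stop("Step 4 failed: file was moved to ", newPathname,
         ", but the old copy could not be removed: ", srcTmp)
  }
  invisible(newPathname)
}
```

Note that between Step 2 and Step 3 neither pathname nor newPathname exists, so this only approximates an atomic rename; other processes can observe that window.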

lawremi commented 8 years ago

It would be nice to have the copying logic as a patch to file.rename(). One obvious error to handle from rename() is EXDEV, where the hard link fails across devices.
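
Until something like that exists, R code does not see errno directly; file.rename() only returns FALSE and signals a warning whose text contains the reason. A rough sketch of capturing that reason is below. The name try_rename() is hypothetical, and matching on "cross-device link" assumes an English message like the one in the log above, so it is locale- and platform-dependent.

```r
# Hypothetical helper; NOT part of base R or R.utils.
try_rename <- function(from, to) {
  msg <- NULL
  ok <- withCallingHandlers(
    file.rename(from, to),
    warning = function(w) {
      # Keep the warning text (it includes the reason) and silence it.
      msg <<- conditionMessage(w)
      invokeRestart("muffleWarning")
    }
  )
  list(
    ok = ok,
    reason = msg,
    # Assumption: English strerror() text for EXDEV, as seen in the log.
    cross_device = !is.null(msg) &&
      grepl("cross-device link", msg, fixed = TRUE)
  )
}
```

A caller could then fall back to copy'n'delete only when cross_device is TRUE and report any other failure as before.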

HenrikBengtsson commented 8 years ago

Thanks for the feedback / comment. Having this in base R would be ideal. I've moved this over to https://github.com/HenrikBengtsson/Wishlist-for-R/issues/33 to keep it independent of the R.utils package.