rpodgorny / unionfs-fuse

union filesystem using fuse
Other
309 stars 76 forks source link

excessive directory copies #25

Open rpodgorny opened 9 years ago

rpodgorny commented 9 years ago

from ancient email (maybe already fixed):

Hi,

Unionfs-fuse recursively copies the contents of a directory in an RO branch on a rename, which makes sense.  It also does the recursive copy on chmod, chown, or utimens, which seems excessive.  All that's needed for those should be a copy of the directory entry in the RW branch, not all of its children.  The fix below (maybe not the most elegant, but minimal code changes) avoids the excessive copies by only initiating a recursive copy on rename.

Cheers,
B Howe

diff --git a/src/cow.c b/src/cow.c
index 293cb65..b401696 100644
--- a/src/cow.c
+++ b/src/cow.c
@@ -130,7 +130,7 @@ int path_create_cutlast(const char *path, int nbranch_ro, int nbranch_rw) {
 /**
  * initiate the cow-copy action
  */
-int cow_cp(const char *path, int branch_ro, int branch_rw) {
+int cow_cp(const char *path, int branch_ro, int branch_rw, bool copy_dir) {
        DBG_IN();

        // create the path to the file
@@ -163,7 +163,11 @@ int cow_cp(const char *path, int branch_ro, int branch_rw) {
                        res = copy_link(&cow);
                        break;
                case S_IFDIR:
-                       res = copy_directory(path, branch_ro, branch_rw);
+                       if (copy_dir) {
+                               res = copy_directory(path, branch_ro, branch_rw);
+                       } else {
+                               res = path_create(path, branch_ro, branch_rw);
+                       }
                        break;
                case S_IFBLK:
                case S_IFCHR:
@@ -210,7 +214,7 @@ int copy_directory(const char *path, int branch_ro, int branch_rw) {
                        res = 1;
                        break;
                }
-               res = cow_cp(member, branch_ro, branch_rw);
+               res = cow_cp(member, branch_ro, branch_rw, true);
                if (res != 0) break;
        }

diff --git a/src/cow.h b/src/cow.h
index 7410ff2..04d6dfb 100644
--- a/src/cow.h
+++ b/src/cow.h
@@ -9,7 +9,7 @@

 #include <sys/stat.h>

-int cow_cp(const char *path, int branch_ro, int branch_rw);
+int cow_cp(const char *path, int branch_ro, int branch_rw, bool copy_dir);
 int path_create(const char *path, int nbranch_ro, int nbranch_rw);
 int path_create_cutlast(const char *path, int nbranch_ro, int nbranch_rw);
 int copy_directory(const char *path, int branch_ro, int branch_rw);
diff --git a/src/findbranch.c b/src/findbranch.c
index 16943c4..9011abc 100644
--- a/src/findbranch.c
+++ b/src/findbranch.c
@@ -160,7 +160,10 @@ out:
  */
 int find_rw_branch_cow(const char *path) {
        DBG_IN();
+       return find_rw_branch_cow_common(path, false);
+}

+int find_rw_branch_cow_common(const char* path, bool copy_dir) {
        int branch_rorw = find_rorw_branch(path);

        // not found anywhere
@@ -182,7 +185,7 @@ int find_rw_branch_cow(const char *path) {
                return -1;
        }

-       if (cow_cp(path, branch_rorw, branch_rw)) return -1;
+       if (cow_cp(path, branch_rorw, branch_rw, copy_dir)) return -1;

        // remove a file that might hide the copied file
        remove_hidden(path, branch_rw);
diff --git a/src/findbranch.h b/src/findbranch.h
index 6378d86..f884b9c 100644
--- a/src/findbranch.h
+++ b/src/findbranch.h
@@ -16,5 +16,6 @@ int find_rorw_branch(const char *path);
 int find_lowest_rw_branch(int branch_ro);
 int find_rw_branch_cutlast(const char *path);
 int find_rw_branch_cow(const char *path);
+int find_rw_branch_cow_common(const char *path, bool copy_dir);

 #endif
diff --git a/src/unionfs.c b/src/unionfs.c
index 0d08721..b175a22 100644
--- a/src/unionfs.c
+++ b/src/unionfs.c
@@ -430,7 +430,7 @@ static int unionfs_rename(const char *from, const char *to) {
        if (i == -1) return -errno;

        if (!uopt.branches[i].rw) {
-               i = find_rw_branch_cow(from);
+               i = find_rw_branch_cow_common(from, true);
                if (i == -1) return -errno;
        }

and


D'oh, definitely a bug from my side. A recursive dir-copy was never
intended for anything except of rename. Thanks for reporting it and
writing the patch!

It looks good on a first glance. Unless Radek already adds it to his
tree, I will do during the weekend (I really I hope I find some time).

Cheers,
Bernd
carter-thompson commented 9 years ago

This bug is still present in the current release (v1.0). Are there any plans to implement the fix in the near future? If not, could you tell me when the diff above is dated so I can match it up with the correct version?

rpodgorny commented 9 years ago

hmmm, i have to admit i've dumped the ancient mail and completely forgot about it. unfortunately, i'll be quite busy during the next month so i can't make any promises.

if you're considering taking a look at this meanwhile, i'll be more than happy. just be sure to start with adding a test to the test suite (test.py) or at least make sure the rename test is already there.

anyway, i've deleted the mail already so i can't date it but looking at the patch, it shouldn't be hard to work it in into the current source (maybe will even be conflict-free).

evnu commented 9 years ago

Does not apply cleanly anymore with the current master (0b35c647a60d22283a04583ac1a29c7d250d9184):

$ git apply fix-excessive-copies.patch
error: patch failed: src/cow.c:130
error: src/cow.c: patch does not apply
error: patch failed: src/findbranch.c:160
error: src/findbranch.c: patch does not apply
error: patch failed: src/findbranch.h:16
error: src/findbranch.h: patch does not apply
error: patch failed: src/unionfs.c:430
error: src/unionfs.c: patch does not apply
rpodgorny commented 9 years ago

so, i've ported to the patch to current version in https://github.com/rpodgorny/unionfs-fuse/tree/issue-25

still, i'm not going to merge it until there's a test proving this works (i've created some basic rename tests in master - feel free to extend).