hypercore-protocol / hyperdrive-daemon

Hyperdrive, batteries included.
MIT License

Rsync into mount breaks drive #44

Open sammacbeth opened 4 years ago

sammacbeth commented 4 years ago

I have a Hyperdrive with two drives mounted in the root, i.e.

hyperdrive create ~/Hyperdrive/a
hyperdrive create ~/Hyperdrive/b

I rsynced data into one of the drives from a remote machine:

rsync -azv -e ssh path/to/folder sam@server:~/Hyperdrive/b

This fails at the end when renaming files.

After this, I noticed that folder a seemed to have disappeared, though it is still being seeded.

$ ls -la ~/Hyperdrive/
total 0
drwxr-xr-x 1 sam sam 0 Apr 21 15:18 Network
drwxr-xr-x 1 sam sam 0 Apr 21 15:30 b

If you want to try to reproduce the state of b, I'm seeding it at 42862b84df038a057c77b391a3807da456b547a81bb9693b9d260191c7a23308

andrewosh commented 4 years ago

Sorry for not getting back to you on this one @sammacbeth. Gonna do another FUSE debugging push next week and this will be a priority.

Pretty sure it has to do with either renaming (which we don't have much support for yet) or file metadata, but I'll update you when I've figured it out.

da2x commented 4 years ago

rsync: rename "/home/da/Hyperdrive/my-test-repo/.index.html.k1KGaS" -> "index.html": Function not implemented (38)

rsync stores files in a temporary location before moving them into place. I tried using rsync --inplace but got this error instead:

rsync: ftruncate failed on "/home/da/Hyperdrive/my-test-repo/index.html": Invalid argument (22)
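The two failures above correspond to two different update strategies: write a temp file and rename it over the target (the rsync default), or truncate and rewrite the target directly (--inplace). A minimal sketch for probing both without rsync, assuming bash and GNU coreutils (`truncate`); the function name `probe_writes` and the temp file name are hypothetical:

```shell
#!/usr/bin/env bash
# probe_writes DIR
# Tries the two update strategies against DIR and reports which ones work.
# DIR can be any writable directory, e.g. a directory inside a FUSE mount.
probe_writes() {
  local dir=$1
  local target="$dir/index.html"
  local tmp="$dir/.index.html.tmp"   # hypothetical temp name, not rsync's

  # Strategy 1 (rsync default): write a temp file, then rename it into place.
  printf 'new contents\n' > "$tmp" || return 1
  if mv -f "$tmp" "$target"; then
    echo "rename: ok"
  else
    echo "rename: failed"
  fi

  # Strategy 2 (rsync --inplace): resize the target file directly.
  # Note: truncate creates the file if the rename above did not.
  if truncate -s 3 "$target"; then
    echo "truncate: ok"
  else
    echo "truncate: failed"
  fi
}
```

Calling `probe_writes ~/Hyperdrive/my-test-repo` (the path from the errors above) would show which of the two operations the mount rejects, independent of rsync itself.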

da2x commented 4 years ago

The temporary files that rsync creates in the destination directory will be added to/duplicated in the append-only file system after the rename operation.

The method that seems to create the fewest excess working copies of the data in the hyperdrive is rsync --inplace --whole-file. hyperfuse would need to support ftruncate, though.

@andrewosh is there any Hyperdrive documentation about this topic anywhere?

da2x commented 4 years ago

@sammacbeth here’s a work-around using a temp staging directory before applying the changes to the hyperdrive:

rsync --whole-file -T `mktemp --directory`

(--temp-dir= appears to be broken in 3.1.x so use -T.)

Unfortunately, hyperdrive fuse stops responding after rewriting a couple of files.
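The staging workaround above can be spelled out as a full command. This is only a sketch: the function name `stage_sync` and its source and destination arguments are hypothetical, and it assumes bash with rsync installed. According to the rsync man page, when the temp dir is on a different partition than the destination, rsync copies each file into place instead of renaming it, which is what keeps the temp files out of the append-only drive.

```shell
#!/usr/bin/env bash
# stage_sync SRC DEST
# Syncs SRC into DEST while keeping rsync's temp files in a scratch
# directory outside DEST, so they are never written into the drive.
stage_sync() {
  local src=$1 dest=$2 staging rc
  staging=$(mktemp -d) || return 1

  # -T is the short form of --temp-dir; --whole-file disables delta transfer.
  rsync -a --whole-file -T "$staging" "$src"/ "$dest"/
  rc=$?

  rm -rf "$staging"
  return "$rc"
}
```

A call might look like `stage_sync path/to/folder ~/Hyperdrive/b`. This does not address the hang described above; it only changes where the temp files live.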

andrewosh commented 4 years ago

Thanks for digging into this @da2x. Hyperdrive and hyperdrive-fuse should both support ftruncate, so this must just be a bug. Can you give any more info about what you mean by "rewriting a couple of files"? Do you mean rerunning the rsync command multiple times?

hyperdrive-fuse not responding after multiple runs would indicate that one of the handlers isn't terminating correctly, which will exhaust all the available FUSE threads.

da2x commented 4 years ago

I made a directory with 2000 files, each containing 20 KB of data from /dev/rand. I then copied these into a hyperdrive. Next, I changed the contents of 200 of those files in the original directory and used rsync to sync the changes into the hyperdrive. It works for maybe fifty files, then rsync prints rsync: ftruncate failed for each remaining file.
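For anyone trying to reproduce this, the steps could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands used: the function names and file naming are hypothetical, /dev/urandom stands in for the random source, and the rsync flags are the --inplace --whole-file combination mentioned earlier in the thread.

```shell
#!/usr/bin/env bash
# make_files DIR COUNT SIZE
# Creates COUNT files of SIZE random bytes each in DIR.
make_files() {
  local dir=$1 count=$2 size=$3 i
  mkdir -p "$dir" || return 1
  for ((i = 1; i <= count; i++)); do
    head -c "$size" /dev/urandom > "$dir/file-$i.bin"
  done
}

# modify_files DIR COUNT SIZE
# Overwrites the first COUNT files in DIR with fresh random bytes.
modify_files() {
  local dir=$1 count=$2 size=$3 i
  for ((i = 1; i <= count; i++)); do
    head -c "$size" /dev/urandom > "$dir/file-$i.bin"
  done
}

# sync_inplace SRC DEST
# Syncs SRC into DEST, updating existing files in place (assumed flags).
sync_inplace() {
  local src=$1 dest=$2
  rsync -a --inplace --whole-file "$src"/ "$dest"/
}
```

With these, a run would be `make_files src 2000 20480`, a first `sync_inplace src <drive>`, then `modify_files src 200 20480` and a second `sync_inplace src <drive>`, where `<drive>` is the mounted hyperdrive directory.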