redbo / cloudfuse

Filesystem (fuse) implemented on Mosso's Cloud Files
http://redbo.github.com/cloudfuse/
MIT License

rsync issues #18

Open halburgiss opened 13 years ago

halburgiss commented 13 years ago

I am trying to implement a backup solution from a local Linux server to Cloud Files. The local system has 300 GB in 600,000+ files. A very small percentage of files changes daily, which would be an ideal situation for rsync. Cloudfuse is used on the system being backed up, so Cloud Files appears as a local mount point. rsync has been very problematic, and I have yet to get the filesystem completely synced for the first time (after trying for 2 weeks). Using rsync to sync to and from cloud servers (Linux to Linux), on the other hand, is very reliable, so I don't think there are any network-type issues.

rsync will always fail at some point. Sometimes it fails in seconds, sometimes minutes, and sometimes hours. Typically, the process just hangs with no error message. But sometimes it dies outright, with error messages. Typically, the mountpoint is fubar and cloudfuse has to be killed manually, to get back to a sane starting point.

Below is a list of various error messages.

If anyone is using rsync successfully in any similar situation, I'd appreciate knowing what options are being invoked with rsync as I cannot find a successful combination. I wonder if this is even feasible and whether I am wasting my life trying :/

Error potpourri:

1. rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe (32)
   rsync: close failed on "/mnt/rackspace/backups/rsync/titan/raid/profiles/": No such file or directory (2)
   rsync: connection unexpectedly closed (1283 bytes received so far) [sender]
   rsync error: error in rsync protocol data stream (code 12) at io.c(632) [sender=3.0.4]

2. rsync: recv_generator: mkdir "/mnt/rackspace/backups/rsync/titan/raid/profiles/hanna" failed: Transport endpoint is not connected (107)
   * Skipping any contents from this failed directory *
   (NOTE: This is bogus, the directory in question does indeed exist and nothing unusual about it).

3. io timeout after 900 seconds -- exiting
   rsync error: timeout in data send/receive (code 30) at io.c(237) [sender=3.0.4]

4. rsync: connection unexpectedly closed (4873 bytes received so far) [sender]
   rsync error: error in rsync protocol data stream (code 12) at io.c(632) [sender=3.0.4]

5. fuse: bad mount point `/mnt/rackspace': Transport endpoint is not connected
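
Several of the errors above carry errno 107 ("Transport endpoint is not connected"), which is what a FUSE mount point typically reports once its backing process has gone away. As a minimal sketch, and not something taken from cloudfuse itself, a backup script could probe the mount point before each rsync run and skip the run (or remount) when the probe fails. The function name `mount_is_alive` is hypothetical, and the path in the usage note is just the one from the error messages.

```python
import errno
import os


def mount_is_alive(mount_point):
    """Return True if stat() on the mount point succeeds.

    Return False if the kernel reports ENOTCONN ("Transport endpoint is
    not connected"). Any other error (missing path, permissions, ...) is
    re-raised so it is not mistaken for a dead mount.
    """
    try:
        os.stat(mount_point)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return False
        raise
    return True
```

A wrapper script might call `mount_is_alive("/mnt/rackspace")` first and only start rsync when it returns True. This only detects a mount that is already dead; it does not prevent the hang or crash in the first place.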

intel352 commented 13 years ago

As mentioned in a similar-looking issue, I've seen "Transport endpoint is not connected" errors myself, and in my case they turned out to be caused by attempting to transfer files larger than 5 GB.

See here for how to segment files larger than 5 GB for proper storage in Cloud Files: http://www.rackspace.com/knowledge_center/index.php/Does_Cloud_Files_support_large_file_transfer
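
To illustrate the segmenting idea, here is a rough sketch that splits a local file into fixed-size pieces on disk. It covers only the local splitting step; uploading the pieces and telling Cloud Files how to reassemble them is covered in the linked article and is not shown. The function name `split_into_segments`, the `name.00000000` naming scheme, and the default buffer size are all assumptions made for this example.

```python
import os


def split_into_segments(src_path, out_dir, segment_bytes, buffer_bytes=1024 * 1024):
    """Split src_path into files of at most segment_bytes each.

    Data is streamed through a small buffer so memory use stays bounded
    even for very large inputs. Returns the segment paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(src_path)
    segments = []
    index = 0
    with open(src_path, "rb") as src:
        while True:
            # Read the first chunk before creating the file, so that no
            # empty trailing segment is written at end of input.
            chunk = src.read(min(buffer_bytes, segment_bytes))
            if not chunk:
                break
            seg_path = os.path.join(out_dir, "%s.%08d" % (base, index))
            with open(seg_path, "wb") as seg:
                written = 0
                while chunk:
                    seg.write(chunk)
                    written += len(chunk)
                    if written >= segment_bytes:
                        break
                    chunk = src.read(min(buffer_bytes, segment_bytes - written))
            segments.append(seg_path)
            index += 1
    return segments
```

For example, `split_into_segments("backup.tar", "/tmp/segments", 4 * 1024**3)` would produce pieces of at most 4 GiB, each below the 5 GB limit mentioned above. The file name and output directory here are placeholders.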

intel352 commented 13 years ago

Due to cloudfuse instability and lack of support for large files, I've switched to a tool called "st" for transferring. Uploading via snet goes extremely fast, I believe it was around 40MB/s (megabyte). So, regarding cloud to cloud file transfer, no complaints here.

Jonathan Langevin

On Wed, Aug 17, 2011 at 2:02 PM, danielb2 <reply@reply.github.com> wrote:

What are your transfer speeds on this? I'm using rsync on a local Rackspace machine to transfer files to a cloudfuse mount. It took me 1.5 minutes to transfer 200K of files. That makes it unusable for me. I didn't see any errors, so this is a bit OT, but I'm curious what you've seen.


danielb2 commented 13 years ago

I've just started looking at this, so I'm unfamiliar with snet. Do you have a link to this "st" tool?

thanks! :)

-d

elescondite commented 13 years ago

I'll second that. I have had better luck with duplicity, but still the instability is maddening.

One of the shortcomings of Rackspace IMHO is how server size is tied to disk space. For the project I am working on at the moment, only about 5-10% of the data needs to be on block storage and the other 90% (ideally) would be stashed on cloud files.

I haven't heard of 'st' either... link?

danielb2 commented 13 years ago

looks like S3 is cheaper anyway, so I'm looking at that now.

elescondite commented 13 years ago

Not so sure about that, and I am no Amazon fan (after years of beating my head against their wall of unnecessary complications and hideous documentation) ;-)

In my case, the bandwidth cost between my cloud servers and AWS would be prohibitive as the servers will definitely be staying with Rackspace.

BTW, the term "snet" refers to Rackspace's "service net", in other words the 10.x.x.x network between servers and Cloud Files. The beauty is that they don't charge for bandwidth on the service net, but there is a speed cap based on the size of the servers involved.

intel352 commented 13 years ago

On a 1 GB instance, snet was faster for me than the public network.

Driving now, I'll send info on st tool shortly.

intel352 commented 13 years ago

Sorry, meant to post about the 'st' tool.

How-to is here: http://www.rackspace.com/knowledge_center/index.php/Does_Cloud_Files_support_large_file_transfer

Read carefully, as they have links within the text (like "download here") for you to fetch the st tool, which is actually a script from a larger library.

Works great for me.

amitn322 commented 12 years ago

I used cloudfuse to mount Cloud Files and I have issues with rsync. rsync does not work at all, while a plain cp works very well. Any thoughts on that?

rsync just hangs and eventually errors out.