rclone / rclone

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
https://rclone.org
MIT License

Rate-limiting FUSE requests when VFS cache is full #6005

Open vmsh0 opened 2 years ago

vmsh0 commented 2 years ago

The associated forum post URL from https://forum.rclone.org

https://forum.rclone.org/t/rate-limiting-the-vfs-cache-speed-to-prevent-the-local-disk-from-filling-up/23319

What is your current rclone version (output from rclone version)?

➜ rclone --version
rclone v1.57.0

What problem are you trying to solve?

I'm using a cloud drive mounted with rclone as the target for a backup software (namely Borg). While doing the first full backup, Borg generates an amount of data which is much greater than the disk space I have. Thus, the backup cannot complete, because I run out of space.

How do you think rclone should be changed to solve that?

I think rclone should implement the feature that was described on the forum last year, i.e. a VFS cache mode flag that throttles requests to the FUSE filesystem when the VFS cache is full.


Animosity022 commented 2 years ago

You can already do that with ionice commands to really throttle it down.

It feels like an awful solution though; the simpler fix is to just get the disk space you need to perform the backup.

ncw commented 2 years ago

I think @leoluan1 had a prototype for this?

As was said in the thread, it would be really easy to deadlock the filesystem doing this, so it is going to need careful surgery!

Inverness commented 1 year ago

I'm also interested in this as a Windows user.

I recently started researching cloud backup solutions for my home PC. That research has led me to using Rclone as a way to allow my existing backup software (Macrium Reflect) to utilize cloud storage. I want to be able to write disk images directly into the cloud storage.

The difference between VFS writes and upload speeds is the single big stumbling block I've run into.

Searching earlier today did lead me to that same discussion thread where @leoluan1 mentioned adding exponential backoff code to item.WriteAt. I went to look at the code to see for myself what it was doing, but I was disappointed to find that this was never actually introduced?

Animosity022 commented 1 year ago

@Inverness - that's why it has the help wanted tag as no one has helped to pick it up :)

rafa-dot-el commented 2 days ago

I'm also facing the same situation and am considering alternatives for the problem. Given proper guidance, I'm willing to make the changes myself and support this feature going forward. I really appreciate all the work put into rclone; it is an amazing piece of software. I would like to start a discussion on possible solutions. Since I'm not familiar with rclone's internal architecture and code base, I have probably got some things wrong; please correct me if so. Please give me your thoughts on the comments below:

As a user, I would like the cache to work as a dedicated local caching space (similar to a cache in a RAID), never exceeding its size and, in the worst case, falling back to relying on the remote for writes. The system should use free cache space when available to guarantee better performance, but it must neither fail (with an I/O error in user space) when the cache is completely full nor proceed to fill up the whole disk.

For the first part, I can get close by tuning parameters like --vfs-cache-max-age to a very large value so the cache stays filled, but as I describe below, this creates some inconsistencies with the idea and with how files are invalidated.

The second part is harder, given that massive writes fill the filesystem beyond the cache limit. An alternative is to copy the files directly to the remote, but I see that as a workaround for the actual issue rather than meeting the expectation.

One idea would be to add a parameter or cache mode that strictly enforces the cache size limit and throttles operations, something along the lines of --vfs-cache-mode full-strict.

Another idea is to add a pair of parameters as a threshold to initiate throttling, something along the lines of --vfs-cache-throttle-threshold and --vfs-cache-throttle-bwlimit, where the first sets the trigger at which the limit is applied and the second sets the throttled speed. This would ensure that rclone is not considered hung or frozen on macOS and Windows, since writes are still being performed, at a much reduced speed but enough to satisfy the operating system. The throttled speed should be set well below the actual transfer rate to the remote, otherwise the disk will fill up anyway.

As one example of the above:

rclone mount s3:/test /mnt \
  --vfs-cache-throttle-threshold '90%' \
  --vfs-cache-throttle-bwlimit 100kbps \
  --vfs-cache-max-size 10G \
  --vfs-cache-mode full-strict \
  --cache-dir /var/cache/rclone

Explaining the intention of the command above: once cache usage reaches 9 GB (90% of the 10 GB maximum), the write speed is throttled down to 100 kbps.

Some caveats:

I'm not familiar with rclone codebase, but digging around I found these two methods:

https://github.com/rclone/rclone/blob/874d66658ed3e614b8073b29299321be200a4d21/vfs/vfscache/cache.go#L739-L746

Adding a third method that checks whether a throttle threshold is configured (haveQuotaThreshold), plus a method that checks whether usage is still within it (thresholdOK), could serve as the base for the required validations.

I traced the writes to the method below (maybe I missed something?):

https://github.com/rclone/rclone/blob/874d66658ed3e614b8073b29299321be200a4d21/vfs/read_write.go#L329-L362

As pseudocode, before executing the write:

if !haveQuotaThreshold() || thresholdOK() {
  proceed... // No threshold configured, or usage is still under it
} else {
  tryToFreeSomeCache() // Check problem [1]
  if thresholdOK() { // If some cache could be freed, proceed normally
    proceed... // See problem [2]
  } else {
    rateLimit.Lock() // Some shared object to guarantee consistency between threads
    writeSize := ...
    delay := writeSize / thresholdLimit // How many seconds should this write take?
    ... proceed with normal write ...
    sleep(delay)
    rateLimit.Unlock()
  }
}

Problems: