nextcloud / server

☁️ Nextcloud server, a safe home for all your data
https://nextcloud.com
GNU Affero General Public License v3.0

Delta-sync, or sync only changed bytes in a file #417

Open exokkk opened 8 years ago

exokkk commented 8 years ago

Hi all,

Delta sync would be great for my huge TrueCrypt/VeraCrypt files (~30-100 GB). Without delta sync I have to stay away from this product.

Delta sync would also give you a feature that distinguishes you from ownCloud.

Couldn't there be an optional (maybe extension-, folder-, or file-based) mechanism to perform delta sync? I say "optional" because I agree that delta sync does not make sense for all kinds of files and folders. Maybe even using something existing like rsync?
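To make the idea concrete, here is a minimal sketch of block-based delta sync in Python. It is an illustration only, not how Nextcloud or rsync is implemented: it compares fixed-size blocks at the same offset, which suits files that are modified in place (such as encrypted containers), whereas rsync uses a rolling checksum so it can also cope with inserted bytes. The names `BLOCK_SIZE`, `changed_blocks`, `make_delta` and `apply_delta` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return one SHA-256 digest per fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indices of blocks in `new` that differ from `old` at the same offset."""
    old_hashes = block_hashes(old, block_size)
    return [
        i for i, digest in enumerate(block_hashes(new, block_size))
        if i >= len(old_hashes) or digest != old_hashes[i]
    ]


def make_delta(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Map each changed block index to its new content (the data to transfer)."""
    return {
        i: new[i * block_size:(i + 1) * block_size]
        for i in changed_blocks(old, new, block_size)
    }


def apply_delta(old: bytes, delta: dict[int, bytes], new_len: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file from the old copy plus the transferred blocks."""
    out = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for i, block in delta.items():
        out[i * block_size:i * block_size + len(block)] = block
    return bytes(out)


# A 10-block "container" with a single byte modified in place.
old = bytes(range(256)) * 160
new = bytearray(old)
new[5000] ^= 0xFF
new = bytes(new)

delta = make_delta(old, new)
rebuilt = apply_delta(old, delta, len(new))
```

In this example only the block containing the modified byte ends up in `delta`, so a single block would be transferred instead of the whole file. If bytes are inserted rather than overwritten, every later block shifts and this simple scheme degrades to a full upload.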

RedKage commented 4 years ago

> the only scenarios I can think of are VMs and encrypted filesystems, both of which are never used by the vast majority of computer users.

> The scenario I deal with is big Outlook files. I would think that is used by more users.

Yes, this, or Thunderbird emails as well, or large zip files which you update, or large picture compositions like PSDs which you update, etc. Anything large which receives data updates. PDFs also, docx, or MP3/FLAC files for which you update the ID3 tags... VeraCrypt containers, ISOs...

There are endless cases.

More broadly, any update to any file is a delta. Only when the updated file has all of its bits changed can you consider the update an 'overwrite'. In effect, delta sync would be used for almost 99% of the updates made to files, I think.

tehXor commented 4 years ago

> There are endless cases.

Read the Nextcloud Case Studies. They have tens of thousands of users from universities and the like, where 99% of the users hardly have more than one file besides the intro file in their account and use it accordingly.

So even if you argue that delta sync could cut traffic severalfold at scale, even with only small to medium files, this is just not a relevant use case for Nextcloud, since they don't target such an active user base.

Therefore your point is invalid and @jospoortvliet's reasoning should get more credit. In fact, Nextcloud should probably drop this item completely from their roadmap to focus more on what is important for their users and strengthen their USP. (After all, there are other solutions which have delta sync, even with block-based approaches, which can be used if you have a use case that requires it.)

jospoortvliet commented 4 years ago

I think it was explained before but:

So almost all common file types, including office documents (yes, they are compressed), images, music, large PSD files, etc., do not benefit from it. A metadata change to a large movie might (not always, it depends on the file format), and sometimes a change to a large image might, too. But how often do you do that? Once a month? It is really almost exclusively nice for VM images and encrypted container formats. And yes, they matter, but they aren't the most important thing in the world for most of our users, sorry.

Look, customers use Nextcloud in many ways. SIEMENS, for example, uses it only with HUGE files (minimum 30 GB, typically 50-100 GB). Some media companies use it with PSD files of hundreds of MB. If we could make those cases much more efficient with delta sync, we would look into it, but it wouldn't make a difference, so we don't.
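One way to check the compression argument for a given file type is to measure it. The sketch below is only an illustration under assumptions: `changed_fraction` is a hypothetical helper, the sample data is synthetic, and zlib stands in for whatever compression a real format uses. It reports what fraction of fixed-size blocks differ between two versions, once for the raw bytes and once for their compressed form.

```python
import zlib


def changed_fraction(old: bytes, new: bytes, block_size: int = 4096) -> float:
    """Fraction of fixed-size blocks of `new` that differ from `old` at the same offset."""
    n_blocks = max(1, -(-len(new) // block_size))  # ceiling division
    changed = sum(
        old[i:i + block_size] != new[i:i + block_size]
        for i in range(0, len(new), block_size)
    )
    return changed / n_blocks


# Synthetic "document": the second version differs by one substituted byte.
raw_old = b"".join(b"line %d of a sample document\n" % i for i in range(50_000))
raw_new = b"X" + raw_old[1:]

print("raw:       ", changed_fraction(raw_old, raw_new))
print("compressed:", changed_fraction(zlib.compress(raw_old), zlib.compress(raw_new)))
```

For the raw bytes exactly one block differs. The compressed figure depends on the compressor and on where and how the data changed, so no particular number is claimed here; running this against real before/after copies of a format is the only way to know whether block-level sync would help for it.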

There is little point in discussing this further. We have a lot of work to do and until we have a larger team and have finished other tasks, we won't get to this. If somebody else wants to do it - please, go ahead, pull requests are welcome. If somebody wants to pay for it, get in contact with sales.

jggc commented 1 year ago

Since there is no mention of this use case yet in this thread: CAD files.

We are creating a lot of .rvt files that are 99% block duplicates of previous versions.

With the current client it takes a few minutes to sync; with Syncthing it takes 2-3 seconds.

I opened a forum thread about our specific setup, but I am posting here to keep this alive and maybe bring a business use case with it.

I would be interested in backporting the fixes from ownCloud if some people are ready to sponsor this.
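A figure like "99% block duplicates" can be estimated with a few lines of Python. This is a rough sketch, not the method any particular sync client uses: `reuse_ratio` and its 128 KiB default block size are assumptions for illustration. It counts how many blocks of the new version already exist, by content hash, anywhere in the old version.

```python
import hashlib


def reuse_ratio(old: bytes, new: bytes, block_size: int = 128 * 1024) -> float:
    """Fraction of blocks in `new` whose content already exists somewhere in `old`."""
    def blocks(data: bytes) -> list[bytes]:
        return [data[i:i + block_size] for i in range(0, len(data), block_size)]

    known = {hashlib.sha256(b).digest() for b in blocks(old)}
    new_blocks = blocks(new)
    if not new_blocks:
        return 1.0  # nothing to transfer for an empty file
    reused = sum(hashlib.sha256(b).digest() in known for b in new_blocks)
    return reused / len(new_blocks)


# Synthetic example: 100 distinct 1 KiB blocks, one of them replaced.
old = b"".join(i.to_bytes(4, "big") * 256 for i in range(100))
new = old[:50 * 1024] + b"\xff" * 1024 + old[51 * 1024:]
ratio = reuse_ratio(old, new, block_size=1024)
```

Here `ratio` comes out at 0.99, meaning a block-aware client would only need to send 1% of the file. Running the same function over two real revisions of a file gives a quick estimate of how much block-level sync could save for that workload.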

rrauenza commented 1 year ago

> Since there is no mention of this use case yet in this thread: CAD files.

The Adobe Lightroom database is another one. It's an SQLite database that mostly just gets appends.