bcpierce00 / unison

Unison file synchronizer
GNU General Public License v3.0

Best-effort synchronization: Damaged drive issue #935

Closed stdedos closed 1 year ago

stdedos commented 1 year ago

Hello there,

I'd like to propose an enhancement to the syncing algorithm. Feel free to ignore this if it is too much to ask.

When synchronizing (via terminal), I encountered the following issue:

$ unison /mnt/root-a /mnt/root-b -perms 0 -auto
Unison 2.53.3 (ocaml 4.14.1): Contacting server...                                                    
Looking for changes                                                                                   
Reconciling changes                                                                                   

root-a      root-b                                                                              
       error            a/b/c/d/e.jpg                
Error in digesting /mnt/root-a/a/b/c/d/e.jpg:       
Value too large for data type                                                                         
new file ---->            a/b/c/d/e*.jpg               

1 items will be synced, 1 skipped                  
2.3 MiB to be synced from root-a to root-b                                                      
0 B to be synced from root-b to root-a                                                          

Proceed with propagating updates? [] y             
Propagating updates                                                                                                                                                                                         

Unison 2.53.3 (ocaml 4.14.1) started propagating changes at 11:02:20.80 on 09 Jun 2023
[ERROR] Skipping a/b/c/d/e.jpg                         
  Error in digesting /mnt/root-a/a/b/c/d/e.jpg:     
Value too large for data type                                                                         
[BGN] Copying a/b/c/d/e*.jpg from /mnt/root-a to /mnt/root-b
Failed [a/b/c/d/e*.jpg]: Error in copying locally:     
Invalid argument [open(/mnt/root-b/a/b/c/d/.unison.e*.jpg.6b5b6fbc8e15ffae793d6a68adebe9c0.unison.tmp)]
Unison 2.53.3 (ocaml 4.14.1) finished propagating changes at 11:02:20.80 on 09 Jun 2023, 0.002 s      

Saving synchronizer state                                                                             
Synchronization incomplete at 11:02:20  (0 items transferred, 1 skipped, 1 failed)                    
  skipped: a/b/c/d/e.jpg (Error in digesting /mnt/root-a/a/b/c/d/e.jpg:
Value too large for data type)                                                                        
  failed: a/b/c/d/e*.jpg                               

In this case, there is some odd problem with the file /mnt/root-a/a/b/c/d/e.jpg. Because of it, however, everything after that file (at some level; I'd say within the same top-level folder /a/) fails to get copied. Instead of failing in that context, would it be very complicated for Unison to skip that file altogether and carry on with the rest of its operation?

I had to invoke Unison many times (5+) and manually ignore that file, and it still would not fully bring the replicas into sync.

I don't have the output of all of the Unison invocations; this one looked like the most interesting one to include in this issue.

(YMMV: at some point, for reasons unknown, even a simple touch a failed on the new drive/replica, although operations worked both before and after that. As this was a one-off, I do not have a stable environment in which I could try to reproduce the issue.)

gdt commented 1 year ago

Please bring this up on the mailing list, as the wiki documents; I expect others will chime in. It's possible there is a valid enhancement here, in trying other operations after what currently appears to be a fatal error. But I am not inclined to spend effort to try to improve operations on broken filesystems -- it seems that what you need to do is fix the filesystem, and not just for Unison's sake.

tleedjarv commented 1 year ago

There is no enhancement here because Unison is already supposed to work exactly as you expected: errors with single files are just that, errors with single files. Single-file errors are not supposed to prevent other files from being synced. If you are seeing errors reported for each file, it means that syncing each of those files resulted in an error (independently of the others, from Unison's point of view). If Unison simply crashed, did not even try to sync other files, or did not report the correct number of files in the final status ("0 items transferred, 1 skipped, 1 failed"), then that would be a bug.
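
To illustrate the behaviour described above, here is a minimal Python sketch of per-file error isolation. It is not Unison's code (Unison is written in OCaml and does far more); the function name copy_best_effort and its arguments are hypothetical. It only shows the idea that each file is attempted independently and failures are tallied rather than aborting the run.

```python
import shutil
from pathlib import Path


def copy_best_effort(src_root, dst_root, rel_paths):
    """Copy each relative path independently; one failure never stops the rest.

    Hypothetical helper for illustration only, not part of Unison.
    Returns (transferred, failed), where failed holds (path, message) pairs.
    """
    transferred, failed = [], []
    for rel in rel_paths:
        src = Path(src_root) / rel
        dst = Path(dst_root) / rel
        try:
            # Create the destination directory, then copy data and metadata.
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            transferred.append(rel)
        except OSError as exc:
            # Record the error for this file only and move on to the next one.
            failed.append((rel, str(exc)))
    return transferred, failed
```

On a filesystem where every operation fails, such a loop would report one failure per file, which looks like "everything after the first error failed" even though each file was tried separately.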

It is very likely that your filesystem is broken, as @gdt suggested. The most obvious evidence for this is that even touch failed. But I would like to make sure there really is no bug here before closing the ticket. Please provide the information requested below so that we can definitely rule out bugs in Unison.

My guess is that this is a FAT or exFAT filesystem and that some filenames contain characters that are not permitted by Windows (and by extension may not be permitted by the filesystem driver). For example, e*.jpg is a name that is allowed on Unix but strictly forbidden on Windows. Are you perhaps using WSL or Cygwin?
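
One way to check this guess is to scan the source replica for names containing characters that Windows reserves. The Python sketch below is an assumption-laden helper, not a Unison feature: the function names are hypothetical, and it checks only the reserved characters < > : " / \ | ? * and ASCII control characters, not reserved device names or length limits.

```python
import os

# Characters reserved by Windows file naming rules.
FORBIDDEN = set('<>:"/\\|?*')


def is_windows_unsafe(name):
    """Return True if a single path component contains a reserved character."""
    return any(ch in FORBIDDEN or ord(ch) < 32 for ch in name)


def find_unsafe_names(root):
    """Walk root and list relative paths whose last component is unsafe."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if is_windows_unsafe(name):
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, root))
    return sorted(hits)
```

Calling find_unsafe_names("/mnt/root-a") would list candidates such as a/b/c/d/e*.jpg; whether the destination driver actually rejects them depends on the filesystem and its mount options.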

The "Value too large for data type" error seems to be an EOVERFLOW reported by open(2) (or possibly by read(2)) which most likely means that your filesystem is corrupted and thinks the file is too large. What do ls and stat output as size for files that produce this error?
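
Besides ls and stat, a small probe can show which errno the kernel returns for a suspect file. This Python sketch is only a diagnostic aid under stated assumptions: probe_file is a hypothetical name, and because Python and Unison may use different system calls, the error seen here need not match the one Unison reports.

```python
import errno
import os


def probe_file(path):
    """Stat the file and read one byte; return (size, errno_name).

    errno_name is None on success, otherwise a symbolic name such as
    "EOVERFLOW" or "ENOENT" when the platform defines one.
    """
    try:
        size = os.stat(path).st_size
        with open(path, "rb") as fh:
            fh.read(1)
        return size, None
    except OSError as exc:
        # Map the numeric errno to its symbolic name where possible.
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
```

For example, probe_file("/mnt/root-a/a/b/c/d/e.jpg") returning (None, "EOVERFLOW") would be consistent with the explanation above, while an implausibly large size with no error would point at corrupted metadata.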