bcpierce00 / unison

Unison file synchronizer
GNU General Public License v3.0

Segfault when syncing #62

Closed: whirm closed this issue 7 years ago

whirm commented 7 years ago

This is with current HEAD (d860a697fcd507cabae25dfded22dd03f5a6d920):

─> gdb --args ./src/unison data-0k -servercmd ~/.bin/unison    
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./src/unison...done.
(gdb) r
Starting program: /tmp/unison/src/unison data-0k -servercmd /home/whirm/.bin/unison
Unison 2.50.0 (ocaml 4.04.0): Contacting server...
Connected [//0k//var/data -> //1k//var/data]
Looking for changes

Program received signal SIGSEGV, Segmentation fault.
0x00005555557124c0 in intern_rec (dest=0x7ffff111f6e8, dest@entry=0x7fffffffd1f8) at intern.c:430
430 intern.c: No such file or directory.
(gdb) bt
#0  0x00005555557124c0 in intern_rec (dest=0x7ffff111f6e8, dest@entry=0x7fffffffd1f8) at intern.c:430
#1  0x0000555555712ccf in caml_input_val_core (chan=chan@entry=0x555555af21c0, outside_heap=outside_heap@entry=0)
    at intern.c:734
#2  0x0000555555712e67 in caml_input_val (chan=0x555555af21c0) at intern.c:749
#3  caml_input_value (vchan=<optimized out>) at intern.c:759
#4  0x0000555555659636 in camlUpdate__fun_4498 () at /tmp/unison/src/update.ml:335
#5  0x000055555569a251 in camlUtil__convertUnixErrorsToExn_1955 () at /tmp/unison/src/ubase/util.ml:170
#6  0x000055555565b48e in camlUpdate__fun_4710 () at /tmp/unison/src/update.ml:722
#7  0x0000555555668917 in camlGlobals__fun_2050 () at /tmp/unison/src/globals.ml:121
#8  0x000055555569248d in camlLwt__apply_1225 () at /tmp/unison/src/lwt/lwt.ml:75
#9  0x000055555569274e in camlLwt__fun_1451 () at /tmp/unison/src/lwt/lwt.ml:94
#10 0x00005555556ba741 in camlList__iter_1252 () at list.ml:77
#11 0x000055555569216e in camlLwt__restart_1211 () at /tmp/unison/src/lwt/lwt.ml:31
#12 0x000055555568e5d2 in camlLwt_unix_impl__restart_threads_1278 () at /tmp/unison/src/lwt/lwt.ml:83
#13 0x000055555568eca1 in camlLwt_unix_impl__run_1579 () at /tmp/unison/src/lwt/generic/lwt_unix_impl.ml:147
#14 0x000055555563d5b8 in camlUitext__synchronizeOnce_1968 () at /tmp/unison/src/update.ml:2096
#15 0x000055555563df8a in camlUitext__loop_2237 () at /tmp/unison/src/uitext.ml:788
#16 0x000055555563e18d in camlUitext__synchronizeUntilDone_2242 () at /tmp/unison/src/uitext.ml:810
#17 0x000055555563e437 in camlUitext__start_2249 () at /tmp/unison/src/uitext.ml:870
#18 0x0000555555635c3a in camlMain__Body_1550 () at /tmp/unison/src/main.ml:241
#19 0x00005555556350d3 in camlLinktext__entry () at /tmp/unison/src/linktext.ml:19
#20 0x00005555556319a9 in caml_program ()
#21 0x000055555571cc94 in caml_start_program ()
#22 0x000055555571d015 in caml_main (argv=0x7fffffffd6d8) at startup.c:145
#23 0x000055555563126c in main (argc=<optimized out>, argv=<optimized out>) at main.c:37
(gdb) 

The unison binary was scp'ed to the remote host.

Note that Debian's packaged version (2.48.3) segfaults too, but I don't have debugging symbols for it, so I can't check whether the stack trace is the same.

Please let me know if I can provide any more info that may help.

Thanks!

brabalan commented 7 years ago

Is it possible to build unison on the server? (I often scp binaries, so it should be fine, but I want to make sure.)

whirm commented 7 years ago

I'll try that when I get home.

whirm commented 7 years ago

I built it from git on both sides and it crashes too. (This is with current master HEAD d860a697fcd507cabae25dfded22dd03f5a6d920.)

└──> gdb --args ./src/unison data-0k -servercmd /tmp/unison/src/unison
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git                    
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./src/unison...done.
(gdb) r
Starting program: /tmp/unison/src/unison data-0k -servercmd /tmp/unison/src/unison
Unison 2.50.0 (ocaml 4.04.0): Contacting server...
Connected [//0k//var/data -> //1k//var/data]
Looking for changes

Program received signal SIGSEGV, Segmentation fault.
0x00005555557124c0 in intern_rec (dest=0x7ffff111f6e8, dest@entry=0x7fffffffcd48) at intern.c:430
430 intern.c: No such file or directory.
(gdb) bt
#0  0x00005555557124c0 in intern_rec (dest=0x7ffff111f6e8, dest@entry=0x7fffffffcd48) at intern.c:430
#1  0x0000555555712ccf in caml_input_val_core (chan=chan@entry=0x555555af21c0, outside_heap=outside_heap@entry=0)
    at intern.c:734
#2  0x0000555555712e67 in caml_input_val (chan=0x555555af21c0) at intern.c:749
#3  caml_input_value (vchan=<optimized out>) at intern.c:759
#4  0x0000555555659636 in camlUpdate__fun_4498 () at /tmp/unison/src/update.ml:335
#5  0x000055555569a251 in camlUtil__convertUnixErrorsToExn_1955 () at /tmp/unison/src/ubase/util.ml:170
#6  0x000055555565b48e in camlUpdate__fun_4710 () at /tmp/unison/src/update.ml:722
#7  0x0000555555668917 in camlGlobals__fun_2050 () at /tmp/unison/src/globals.ml:121
#8  0x000055555569248d in camlLwt__apply_1225 () at /tmp/unison/src/lwt/lwt.ml:75
#9  0x000055555569274e in camlLwt__fun_1451 () at /tmp/unison/src/lwt/lwt.ml:94
#10 0x00005555556ba741 in camlList__iter_1252 () at list.ml:77
#11 0x000055555569216e in camlLwt__restart_1211 () at /tmp/unison/src/lwt/lwt.ml:31
#12 0x000055555568e5d2 in camlLwt_unix_impl__restart_threads_1278 () at /tmp/unison/src/lwt/lwt.ml:83
#13 0x000055555568eca1 in camlLwt_unix_impl__run_1579 () at /tmp/unison/src/lwt/generic/lwt_unix_impl.ml:147
#14 0x000055555563d5b8 in camlUitext__synchronizeOnce_1968 () at /tmp/unison/src/update.ml:2096
#15 0x000055555563df8a in camlUitext__loop_2237 () at /tmp/unison/src/uitext.ml:788
#16 0x000055555563e18d in camlUitext__synchronizeUntilDone_2242 () at /tmp/unison/src/uitext.ml:810
#17 0x000055555563e437 in camlUitext__start_2249 () at /tmp/unison/src/uitext.ml:870
#18 0x0000555555635c3a in camlMain__Body_1550 () at /tmp/unison/src/main.ml:241
#19 0x00005555556350d3 in camlLinktext__entry () at /tmp/unison/src/linktext.ml:19
#20 0x00005555556319a9 in caml_program ()
#21 0x000055555571cc94 in caml_start_program ()
#22 0x000055555571d015 in caml_main (argv=0x7fffffffd228) at startup.c:145
#23 0x000055555563126c in main (argc=<optimized out>, argv=<optimized out>) at main.c:37
(gdb)

brabalan commented 7 years ago

This is very surprising, I'm not seeing anything like this on my installation.

Could you try to narrow the bug down to a particular file? Or does it happen regardless of which files are synchronized?

whirm commented 7 years ago

I ran it with strace and it segfaults while reading the local archive file.
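
For anyone wanting to repeat this kind of check, here is a rough sketch. It assumes the run was captured to a log with `strace -f -o` (both are standard strace options) and uses a hypothetical helper, `last_opened_file`, to print the path from the last `open`/`openat` call in that log. The last file opened is only a hint about what was being read when the crash hit, not proof.

```shell
# Capture a trace first (log file name is an arbitrary choice):
#   strace -f -o unison.strace ./src/unison data-0k -servercmd /tmp/unison/src/unison

# Hypothetical helper: print the path argument of the last open()/openat()
# call recorded in the strace log given as $1.
last_opened_file() {
    grep -E '(^|[^a-z_])open(at)?\(' "$1" |
        tail -n 1 |
        sed -E 's/^[^"]*"([^"]*)".*$/\1/'
}

# Usage:
#   last_opened_file unison.strace
```

If the printed path is one of Unison's archive files, that points at the archive as the input being unmarshalled when the segfault occurred, which matches frames #0 to #3 of the backtrace above.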

whirm commented 7 years ago

I renamed the archive files on both sides, and after a couple of segfaults (sorry, I wasn't running under gdb) the scan succeeded and I could sync some files. I guess the archive file got so badly corrupted by a previous segfault that Unison crashes when it tries to parse it?

brabalan commented 7 years ago

You could try removing the archive. Unison will then treat every file as new, so you might have to resolve some conflicts manually.
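
A cautious variant is to move the archives aside rather than delete them, so they remain available for later inspection. The sketch below is only that, a sketch: `set_aside_archives` is a hypothetical helper, and it assumes the archive files sit in the directory you pass in (commonly `~/.unison`, though this can differ per platform or be overridden) and that their names start with `ar`. Check your own setup before running it, and repeat it on both hosts.

```shell
# Hypothetical helper: move files named ar* from the Unison directory ($1)
# into a timestamped subdirectory, and print that subdirectory's path.
set_aside_archives() {
    dir=$1
    backup="$dir/set-aside-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$backup" || return 1
    for f in "$dir"/ar*; do
        # Skip when the glob matched nothing, and skip directories.
        [ -f "$f" ] || continue
        mv -- "$f" "$backup"/ || return 1
    done
    printf '%s\n' "$backup"
}

# Usage (assumed default location):
#   set_aside_archives "$HOME/.unison"
```

Profiles and other files in the directory are left untouched; only regular files matching `ar*` are moved.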

whirm commented 7 years ago

Maybe I didn't explain myself clearly. That's what I did: I renamed the archive files so Unison would generate new ones. It crashed a couple of times while reindexing the whole thing (dozens of GB of files of all sizes). Then I marked most of the pending changes as skip (I had no time to go through the list at the time) and let it sync just a few of the changes. That worked, and I could restart Unison afterwards without any crashes. I've been using Unison for many years on several machines with zero crashes until recently. (Thanks a lot for it, BTW; it's one of my cornerstone tools.)

brabalan commented 7 years ago

Sorry, I misread. So if I understand correctly, the crash was because of archive corruption and now all is fine. Is this correct?

whirm commented 7 years ago

Well, I got a couple of crashes after that, but I wasn't running Unison under GDB, so I couldn't obtain debugging info.

It's a pity I couldn't collect debug info when Unison crashed and corrupted the archive.

I guess I will close this issue and enable core dumping to see if I can catch it crashing again.

Thanks!
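
For reference, one way to enable core dumps for a single run is sketched below. `run_with_core` is a hypothetical wrapper name; `ulimit -c unlimited` is the standard shell builtin for lifting the core file size limit. Where the core file ends up is system-dependent (on Linux it is governed by `kernel.core_pattern`, and some distributions hand cores to a collector service), so treat the paths in the usage comments as assumptions.

```shell
# Hypothetical wrapper: run a command with the core file size limit lifted.
# The subshell keeps the changed limit from leaking into the calling shell.
run_with_core() {
    (
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core file size limit" >&2
        "$@"
    )
}

# Usage:
#   run_with_core ./src/unison data-0k -servercmd /tmp/unison/src/unison
# After a crash, load the core file (name and location vary by system):
#   gdb ./src/unison core
#   (gdb) bt
```

The wrapper returns the command's own exit status, so a segfault still shows up to the caller as a failed run.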