jcnelson / vdev

A device-file manager for *nix
GNU General Public License v3.0
101 stars 13 forks source link

Segfault with "warning 139" #86

Closed suedi closed 8 years ago

suedi commented 8 years ago

regarding commit 1f17250f5185de9bb24f5b2367f8fdf10c723f51

I am closing out #84 since I think maybe this is a different issue???? With this new commit I am able to boot to desktop with no apparent problems. I have a vdev running that except plugging in devices when I get there however in my boot.log I find

/etc/init.d/vdev: line 88:  2001 Segmentation fault      "$VDEV_BIN" -c "$VDEV_CONFIG" $@ "$VDEV_MOUNTPOINT"
139 ... (warning).
Waiting 15 seconds and trying to continue anyway ... (warning).

and in system log

vdevd[2001]: segfault at 2d0 ip 00007f7d2a216240 sp 00007ffcbb4f80f8 error 4 in libpthread-2.22.so[7f7d2a208000+18000]

gathered logs here https://www.dropbox.com/s/jnwzloioiofl4px/vdev-20160105-logs.tar.gz?dl=1

vdevd.log contain my usual errors i.e. "no pvs in path", "blkid not working on sdc2" and "no cdrom group". (The last one I have fixed since by editing optical.act and change to VAR_OPTICAL_GROUP=optical)

suedi commented 8 years ago

additional info.

  1. Does not affect coldplug-only
jcnelson commented 8 years ago

Thanks for letting me know, and thank you for the logs. I'll try to duplicate it locally.

suedi commented 8 years ago

That sounds great man!

onimatrix81 commented 8 years ago

I closed #87 as it seems to be the same as this one. My tails of vdevd.log and dmesg in the txt file.

vdevdsegfault.txt

suedi commented 8 years ago

part of strace of vdevd -1 -c /etc/vdev/vdevd,conf /tmp/dev The rest in file strace-vdevd.txt

lstat("/mnt/live/tmp/dev/snd/timer", 0x7ffec26551b0) = -1 ENOENT (No such file or directory)
write(2, "05245:00007FE8038E2700: [       "..., 13305245:00007FE8038E2700: [          vdev.c:0774] vdev_load_device_requests: DEBUG: Preseed device: 'c snd/seq 116 1  SUBSYSTEM=sound'
) = 133
lstat("/mnt/live/tmp/dev/snd/seq", 0x7ffec26551b0) = -1 ENOENT (No such file or directory)
munmap(0x7fe8037e0000, 1052672)         = 0
open("/tmp/vdevd.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END)                   = 0
close(1)                                = 0
close(2)                                = 0
dup2(3, 1)                              = 1
dup2(3, 2)                              = 2
close(3)                                = 0
pipe([3, 4])                            = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe8038e29d0) = 5396
close(4)                                = 0
read(3, 0x7ffec265a3ac, 4)              = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5396, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
read(3

Here it stops and wait for approximately 15 seconds then comes.

           , "\0\0\0\0", 4)                  = 4
fstat(1, {st_mode=S_IFREG|0644, st_size=7117802, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8038fc000
write(2, "05245:00007FE8038E2700: [       "..., 84) = 84
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2d0} ---
+++ killed by SIGSEGV +++
Segmentation fault

Looks like vdev_preseed_run finish OK and I get everything in /tmp/dev which tells me the child has run coldplug-only OK.

So its a problem with pipe between parent and child??

Digged down further and the problem arises in "parent vdev instance"-->"vdev.c/vdev_shutdown()"->"vdev.c/vdev_error_thread_stop()"

This makes sense since that was part of what you added right and the seg fault is reported in libpthread.

suedi commented 8 years ago

Okay in vdev.c/vdev_error_thread_stop()

segfaulting on rc = pthread_cancel( vdev->error_thread );

if I comment out pthread_cancel and pthread_join code segments vdev finishes without a hitch

Why pthread_cancel is segfaulting I dunnow.

suedi commented 8 years ago

For some reason vdev->error_thread is NULL(or atleast 0)???

calling pthread_cancel on this cannot be good?

Probably cause vdev.c/vdev_start()-->vdev_error_thread_start() is only called in if( !is_parent || vdev.config->foreground || vdev.config->coldplug_only ) {

And here is called rc = pthread_create( &vdev->error_thread, &attrs, vdev_error_thread_main, &vdev->error_fd ); That fills in the value in vdev->error_thread in struct vdev_state

Here the fork is already done and as I understand it parent and child has each a copy of struct vdev_state so vdev->error_thread only gets a value in child but the parent is the one calling pthread_cancel()

jcnelson commented 8 years ago

Hi @suedi,

Thank you for taking the time to run strace on vdev and help figure this out (since I haven't been able to reproduce it on my machine). I have an idea of what might be happening.

A while back, I modified vdev to gather error messages from helpers and send them to vdev's logging facility (i.e. so they can be sent to syslog if desired). It does this by creating a FIFO at /dev/metadata/err.pipe, redirecting its helpers' stderr streams to it, and running a thread (error_thread) to continuously read error messages from the FIFO and write them to either vdev's stderr or syslog.

What I think is happening is that when vdev starts as a daemon, it fork()s and the child starts up the error thread while the parent does not. But when the parent exits, it tries to stop its (nonexistant) error thread, and segfaults. I just pushed a patch that checks to see if the error thread was previously started before trying to stop it. Can you let me know if that helps?

Thanks again!

suedi commented 8 years ago

checking...

suedi commented 8 years ago

Solves it :+1:

Easy peasy eeeyh?

You really should have been able to reproduce it on your machin if running vdev in daemon mode (not coldplug-only)

jcnelson commented 8 years ago

Heh, fixing it is easy; finding the source of the problem not so much ;)