Closed suedi closed 8 years ago
additional info.
Thanks for letting me know, and thank you for the logs. I'll try to duplicate it locally.
That sounds great man!
I closed #87 as it seems to be the same as this one. My tails of vdevd.log and dmesg in the txt file.
part of strace of vdevd -1 -c /etc/vdev/vdevd,conf /tmp/dev
The rest in file strace-vdevd.txt
lstat("/mnt/live/tmp/dev/snd/timer", 0x7ffec26551b0) = -1 ENOENT (No such file or directory)
write(2, "05245:00007FE8038E2700: [ "..., 13305245:00007FE8038E2700: [ vdev.c:0774] vdev_load_device_requests: DEBUG: Preseed device: 'c snd/seq 116 1 SUBSYSTEM=sound'
) = 133
lstat("/mnt/live/tmp/dev/snd/seq", 0x7ffec26551b0) = -1 ENOENT (No such file or directory)
munmap(0x7fe8037e0000, 1052672) = 0
open("/tmp/vdevd.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END) = 0
close(1) = 0
close(2) = 0
dup2(3, 1) = 1
dup2(3, 2) = 2
close(3) = 0
pipe([3, 4]) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe8038e29d0) = 5396
close(4) = 0
read(3, 0x7ffec265a3ac, 4) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=5396, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
read(3
Here it stops and wait for approximately 15 seconds then comes.
, "\0\0\0\0", 4) = 4
fstat(1, {st_mode=S_IFREG|0644, st_size=7117802, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe8038fc000
write(2, "05245:00007FE8038E2700: [ "..., 84) = 84
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2d0} ---
+++ killed by SIGSEGV +++
Segmentation fault
Looks like vdev_preseed_run finish OK and I get everything in /tmp/dev which tells me the child has run coldplug-only OK.
So its a problem with pipe between parent and child??
Digged down further and the problem arises in "parent vdev instance"-->"vdev.c/vdev_shutdown()"->"vdev.c/vdev_error_thread_stop()"
This makes sense since that was part of what you added right and the seg fault is reported in libpthread.
Okay in vdev.c/vdev_error_thread_stop()
segfaulting on rc = pthread_cancel( vdev->error_thread );
if I comment out pthread_cancel and pthread_join code segments vdev finishes without a hitch
Why pthread_cancel is segfaulting I dunnow.
For some reason vdev->error_thread is NULL(or atleast 0)???
calling pthread_cancel on this cannot be good?
Probably cause vdev.c/vdev_start()-->vdev_error_thread_start() is only called
in if( !is_parent || vdev.config->foreground || vdev.config->coldplug_only ) {
And here is called
rc = pthread_create( &vdev->error_thread, &attrs, vdev_error_thread_main, &vdev->error_fd );
That fills in the value in vdev->error_thread in struct vdev_state
Here the fork is already done and as I understand it parent and child has each a copy of
struct vdev_state
so vdev->error_thread only gets a value in child but the parent
is the one calling pthread_cancel()
Hi @suedi,
Thank you for taking the time to run strace
on vdev and help figure this out (since I haven't been able to reproduce it on my machine). I have an idea of what might be happening.
A while back, I modified vdev to gather error messages from helpers and send them to vdev's logging facility (i.e. so they can be sent to syslog if desired). It does this by creating a FIFO at /dev/metadata/err.pipe
, redirecting its helpers' stderr streams to it, and running a thread (error_thread) to continuously read error messages from the FIFO and write them to either vdev's stderr or syslog.
What I think is happening is that when vdev starts as a daemon, it fork()s and the child starts up the error thread while the parent does not. But when the parent exits, it tries to stop its (nonexistant) error thread, and segfaults. I just pushed a patch that checks to see if the error thread was previously started before trying to stop it. Can you let me know if that helps?
Thanks again!
checking...
Solves it :+1:
Easy peasy eeeyh?
You really should have been able to reproduce it on your machin if running vdev in daemon mode (not coldplug-only)
Heh, fixing it is easy; finding the source of the problem not so much ;)
regarding commit 1f17250f5185de9bb24f5b2367f8fdf10c723f51
I am closing out #84 since I think maybe this is a different issue???? With this new commit I am able to boot to desktop with no apparent problems. I have a vdev running that except plugging in devices when I get there however in my boot.log I find
and in system log
gathered logs here https://www.dropbox.com/s/jnwzloioiofl4px/vdev-20160105-logs.tar.gz?dl=1
vdevd.log contain my usual errors i.e. "no pvs in path", "blkid not working on sdc2" and "no cdrom group". (The last one I have fixed since by editing optical.act and change to
VAR_OPTICAL_GROUP=optical
)