jcnelson / vdev

A device-file manager for *nix
GNU General Public License v3.0
101 stars 13 forks source link

[warn] 6 ... (warning). #84

Closed suedi closed 8 years ago

suedi commented 8 years ago

I pulled the latest vdev and ran into some problems.

vdevd dies during boot and I am left at console with no /dev/dri After trying to re-start vdevd a couple of more times /dev/dri appears and I can start desktop but vdevd dies with error every time I try to start it (vdevd that is).

I get the somewhat cryptic message of the title

[warn] 6 ... (warning).

I think the error it dies on is reported in vdevd.log as

01993:00007F1B4B470700: [          main.c:0228] main: ERROR: device quiesce failure, child rc = 1

I gathered logs, there are vdevd.log's both at the warn and the debug level.

https://www.dropbox.com/s/sxldnq5qx5kkbel/vdev-daemonlet-fix-error.tar.gz?dl=1

archive containing

vdev-daemonlet-fix-error/session-0.log
vdev-daemonlet-fix-error/udev-compat.log
vdev-daemonlet-fix-error/boot.log
vdev-daemonlet-fix-error/vdevd_debug.log
vdev-daemonlet-fix-error/vdevd.log

EDIT

I have in /dev/metadata

prw------- 1 root root 0 Nov 30 17:46 err.pipe|
Jamessall commented 8 years ago

I also tried after last night's commits...got a kernel panic, no log. Haven't had time to look into it further.

Jim

jcnelson commented 8 years ago

@suedi, thank you for trying the latest build.

[warn] 6 ... (warning). That's the exit code you'd get when the parent process fails to read from the child it forked to handle coldplug processing.

Do you get a non-zero exit code from vdevd (i.e. what are the contents of $?)? Also, is vdevd failing with a segmentation fault? Can you try running it with --coldplug-only to see if this is specific to the daemon-ization code path? I ask because I can't reproduce it locally (but I haven't tried booting with this build yet).

Some of the data structure layouts changed in the last patch set. Please confirm that you are rebuilding and re-installing everything from scratch with this patch set.

I have in /dev/metadata... This is intentional. Vdevd now redirects its subprocess' stderr streams to this pipe, so it can consume them and forward them to its log file (or syslog). This lets the helper programs send error messages to syslog without having to alter them.

I see a lot of ERROR: /bin/chown: invalid group: 'root:input' strings in your logs. Is there an input group on your system? If not, you should edit your new /etc/vdev/actions/input.act file and change the VAR_INPUT_OWNER and VAR_INPUT_GROUP values accordintly (this was the fix I alluded to in #82).

Also, if you get a chance, can you send me the output of /sbin/blkid /dev/sdc2 on your system? It looks like disk.sh is choking on it, given these lines:

02131:00007F1B4AAA5700: [          vdev.c:0395] vdev_error_thread_main: ERROR: /usr/lib/vdev/disk.sh: eval: line 530: syntax error near unexpected token `('
/usr/lib/vdev/disk.sh: eval: line 530: `APPLICATION_ID=GENISOIMAGE\ ISO\ 9660/HFS\ FILESYSTEM\ CREATOR\ (C)\ 1993\ E.YOUNGDALE\ (C)\ 1997-2006\ J.PEARSON/J.SCHILLING\ (C)\ 2006-2007\ CDRKIT\ TEAM'
suedi commented 8 years ago

I get no indication of exit code of vdevd in any logs. I start it during boot with /etc/init.d/vdev start I will try run it from cmd-line both as coldplug-only and as a daemon and report exit codes or segfaults.

I rebuilt from scratch everything except hwdb which I have the version that obarun made a pull request on.

I had no input group at all (now I have) but I got your allusion and tried it with replacing cdrom with optical. This seems to work.

I think we can ignore the error on sdc2. It is a special entry in partition table used by easy2boot. With grub4dos partnew command an iso is mapped to a partition entry and can thus be booted from directly.

I will of course share info about it when I come home. I will also check how udev handles sdc2

EDIT

> /sbin/blkid /dev/sdc2
/dev/sdc2: PARTUUID="f98bb243-02"

I booted with udev and could not mount it here either. in dmessg I have

> dmesg | grep sdc2
[    4.367307]  sdc: sdc1 sdc2
[   32.226549] REISERFS warning (device sdc2): sh-2006 read_super_block: bread failed (dev sdc2, block 128, size 512)
[   32.234179] REISERFS warning (device sdc2): sh-2006 read_super_block: bread failed (dev sdc2, block 128, size 512)
[   32.240677] isofs_fill_super: bread failed, dev=sdc2, iso_blknum=16, block=32
[   64.004738] REISERFS warning (device sdc2): sh-2006 read_super_block: bread failed (dev sdc2, block 128, size 512)
[   64.011845] REISERFS warning (device sdc2): sh-2006 read_super_block: bread failed (dev sdc2, block 128, size 512)
[   64.018612] isofs_fill_super: bread failed, dev=sdc2, iso_blknum=16, block=32

I think this error can be ignored, maybe the iso mapped with partnew command is deleted I don't remember.

suedi commented 8 years ago

I tried running as --coldplug-only. This did work OK and I get exit code 0.

looking at log I only get two errors at warn loglevel. The above one on sdc2 that is known and one on loop35 I have not seen before. loop35 is not used so this is somewhat strange

 > losetup -f
/dev/loop35
03725:00007F6161110700: [          vdev.c:0395] vdev_error_thread_main: ERROR: [helpers] [WARN]: /sbin/blkid failed on /dev/loop35, exit status 2

This loop error could be a secondary failure due to me having to run vdev several times to get a starting system?

When I thought more about your question about exit code, it seemed strange. Value of vdevd exit code as a daemon is clear since after the line of code outputting ERROR: device quiesce failure there is a line with exit(6); so of course value of $? is 6, right?????

As I read the code 6 is the parents exit code and then I guess the exit code value of the child is rc = 1

03708:00007F6161ADB700: [          main.c:0228] main: ERROR: device quiesce failure, child rc = 1

I made a few test runs and it always bombs at exactly the same place. Every log ends with

22872:00007FF11EBEA700: [        action.c:1733] vdev_action_run_commands: DEBUG: Benchmark: action /etc/vdev/actions/zzz-event-push.act succeeded in 11 millis
22872:00007FF11EBEA700: [        action.c:0651] vdev_action_run_sync: DEBUG: run command: '/usr/lib/vdev/udev-compat.sh'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_MOUNTPOINT=/mnt/live/tmp/dev'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_ACTION=remove'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_PATH=UNKNOWN'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_METADATA=/mnt/live/tmp/dev/metadata//dev/UNKNOWN'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_GLOBAL_METADATA=/mnt/live/tmp/dev/metadata/'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_CONFIG_FILE=/etc/vdev/vdevd.conf'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_HELPERS=/usr/lib/vdev'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_LOGFILE=/root/vdevd2.log'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_INSTANCE=1NMLL971KLN9094150LMM82P73L32MM6O02L0773M8766M570MLL995NLK2N108P'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_DAEMONLET=0'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_LOGLEVEL=debug'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_OS_DEVPATH=/module/pcc_cpufreq'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_OS_SEQNUM=1913'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_OS_SUBSYSTEM=module'
22872:00007FF11EBEA700: [        action.c:0654] vdev_action_run_sync: DEBUG: command env: 'VDEV_OS_SYSFS_MOUNTPOINT=/sys'
22872:00007FF11EBEA700: [        action.c:0659] vdev_action_run_sync: DEBUG: exit status 0
22872:00007FF11EBEA700: [        action.c:1733] vdev_action_run_commands: DEBUG: Benchmark: action /etc/vdev/actions/zzz-udev-compat.act succeeded in 32 millis
22720:00007FF11FDB6700: [          main.c:0228] main: ERROR: device quiesce failure, child rc = 1

Just checked dmesg and found , probably from several runs

[ 3282.488331] vdevd[17520]: segfault at 128 ip 00007f756aa980e3 sp 00007f7569cdee28 error 4 in libpthread-2.22.so[7f756aa8d000+18000]
[ 3558.656379] vdevd[22684]: segfault at 2d0 ip 00007f819ced4240 sp 00007fffb0d85778 error 4 in libpthread-2.22.so[7f819cec6000+18000]
[ 3611.036493] vdevd[22874]: segfault at 128 ip 00007ff11f9a30e3 sp 00007ff11ebe9e28 error 4 in libpthread-2.22.so[7ff11f998000+18000]

What to do?

Obarun commented 8 years ago

compilation error for me with the last build

In file included from /usr/include/limits.h:25:0, from /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include-fixed/limits.h:168, from /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include-fixed/syslimits.h:7, from /usr/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include-fixed/limits.h:34, from /home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/libvdev/util.h:37, from acl.h:25, from acl.c:22: /usr/include/features.h:148:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]

warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"

^ acl.c: In function ‘vdev_acl_run_predicate’: acl.c:822:59: warning: passing argument 5 of ‘vdev_subprocess’ makes integer from pointer without a cast [-Wint-conversion] rc = vdev_subprocess( cmd_buf, predicate_env, NULL, 0, &exitstatus, true ); ^ In file included from acl.h:25:0, from acl.c:22: /home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/libvdev/util.h:187:5: note: expected ‘int’ but argument is of type ‘int ’ int vdevsubprocess( char const cmd, char* const env[], char* output, size_t max_output, int stderr_fd, int* exit_status, bool use_shell ); ^ acl.c:822:73: warning: passing argument 6 of ‘vdev_subprocess’ makes pointer from integer without a cast [-Wint-conversion] rc = vdev_subprocess( cmd_buf, predicate_env, NULL, 0, &exitstatus, true ); ^ In file included from acl.h:25:0, from acl.c:22: /home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/libvdev/util.h:187:5: note: expected ‘int ’ but argument is of type ‘int’ int vdevsubprocess( char const cmd, char* const env[], char* output, size_t max_output, int stderr_fd, int* exit_status, bool use_shell ); ^ acl.c:822:9: error: too few arguments to function ‘vdev_subprocess’ rc = vdev_subprocess( cmd_buf, predicate_env, NULL, 0, &exit_status, true ); ^ In file included from acl.h:25:0, from acl.c:22: /home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/libvdev/util.h:187:5: note: declared here int vdev_subprocess( char const* cmd, char* const env[], char* output, size_t max_output, int stderr_fd, int\ exit_status, bool use_shell ); ^ Makefile:20: recipe for target '/home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/build/usr/sbin/acl.o' failed make: *\ [/home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/build/usr/sbin/acl.o] Error 1 make: Leaving directory '/home/obarun/Archives/pkgbuild/vdev-obarun/src/vdev-git/fs'

fbt commented 8 years ago

^

Yep, same here.

suedi commented 8 years ago

I guess you get that error in vdev/fs

I don't use vdevfs so I don't compile it either hence I'm not affected by that error

jcnelson commented 8 years ago

Sorry for the late reply--life got in the way :/

I forgot to plumb through the new subprocess code through vdevfs. If you're not already using vdevfs, then you can probably ignore this.

I'll try to fix it this weekend, time permitting.

suedi commented 8 years ago

That sounds good.

I hope you also get some time to test vdev in daemon mode to see if you can reproduce this thread's original conundrum.

suedi commented 8 years ago

I thought I saw something new and informative but I must erase this comment since I now think it was due to an error on my side

Excuses extended...

suedi commented 8 years ago

I am not a fan of intertwining issues in one thread however since that is the case right here. I will post here.

For the second issue of this thread: compilation error for me with the last build it is solved now with the Jude's latest commits, it works for vdev/fs

Thanks for the update @jude

I cannot close this issue though until the first issue of this thread is solved.

suedi commented 8 years ago

@jude

Are you able to reproduce this error with vdev in daemon mode i.e. during boot?

Obarun commented 8 years ago

builded successfully, thank @jcnelson. So tests can be continued :)

suedi commented 8 years ago

as of commit 1f17250f5185de9bb24f5b2367f8fdf10c723f51

This commit seems to be taking a big step forward, Good Job!

Now vdev works for me. :+1:

I found another issue #86 but it is to dissimilar to this one so closing