cmusatyalab / coda

Coda is an advanced networked filesystem. It has been developed at CMU since 1987 by the systems group of M. Satyanarayanan in the SCS department.
http://coda.cs.cmu.edu/
GNU General Public License v2.0

venus fails due to "CHILD: mount system call failed. Killing parent." #11

Open krichter722 opened 8 years ago

krichter722 commented 8 years ago
$ sudo /usr/local/sbin/venus -d 5

Date: Fri 06/10/2016

02:41:57 Coda Venus, version 6.9.6
02:41:57 /var/lib/coda/LOG size is 46821888 bytes
02:41:57 /var/lib/coda/DATA size is 187277488 bytes
02:41:57 Loading RVM data
02:41:57 Last init was Wed Jun  1 03:57:26 2016
02:41:57 Last shutdown was dirty
02:41:57 Starting RealmDB scan
02:41:57        Found 2 realms
02:41:57 starting VDB scan
02:41:57        5 volume replicas
02:41:57        3 replicated volumes
02:41:57        5 CML entries allocated
02:41:57        0 CML entries on free-list
02:41:58 starting FSDB scan (41666, 1000000) (25, 75, 4)
02:41:58        13 cache files in table (987136 blocks)
02:41:58        41653 cache files on free-list
02:41:58 starting HDB scan
02:41:58        0 hdb entries in table
02:41:58        0 hdb entries on free-list
02:41:58 Kernel version ioctl failed.
02:41:58 Mounting root volume...
02:41:58 CHILD: mount system call failed. Killing parent.
Killed

Adding -init doesn't help:

$ sudo /usr/local/sbin/venus -d 5 -init

Date: Fri 06/10/2016

02:49:53 Coda Venus, version 6.9.6
02:49:53 /var/lib/coda/LOG size is 46819372 bytes
02:49:54 /var/lib/coda/DATA size is 187277488 bytes
02:49:54 Initializing RVM data...
02:49:55 ...done
02:49:55 Loading RVM data
02:49:55 Starting RealmDB scan
02:49:55        Found 1 realms
02:49:55 starting VDB scan
02:49:55        0 volume replicas
02:49:55        0 replicated volumes
02:49:55        0 CML entries allocated
02:49:55        0 CML entries on free-list
02:49:56 starting FSDB scan (41666, 1000000) (25, 75, 4)
02:49:56        0 cache files in table (0 blocks)
02:49:56        41666 cache files on free-list
02:50:03 starting HDB scan
02:50:03        0 hdb entries in table
02:50:03        0 hdb entries on free-list
02:50:04 Kernel version ioctl failed.
02:50:04 Mounting root volume...
02:50:04 CHILD: mount system call failed. Killing parent.
Killed

Experienced with rvm-1.18-20-g03750fd (according to 'git describe').

jaharkes commented 8 years ago

Most likely the Coda kernel module has not been loaded. The udev people decided long ago that having a virtual device that triggers loading of a kernel module was racy, so when venus tries to open /dev/cfs0 the kernel module is not autoloaded, and the following 'mount /coda' fails. The mount is run from a child process after the main process has started. The kernel needs to call back up to the main process to resolve the root inode before the mount can complete, so the main process cannot make the mount system call itself.

jaharkes commented 8 years ago

Oh, forgot to mention the solution: run 'modprobe coda' before starting the venus process.
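
The fix above can be wrapped in a small pre-flight check. This is only a sketch, not something shipped with Coda: the function names `coda_module_loaded` and `ensure_coda_module` are made up for illustration, and the check assumes a Linux system where /proc/modules lists each loaded module with its name in the first column.

```shell
# Hypothetical helper (not part of Coda): succeed if a module named exactly
# "coda" appears in the given module list. The list defaults to
# /proc/modules, where the module name is the first field on each line.
coda_module_loaded() {
    modules_file="${1:-/proc/modules}"
    awk '$1 == "coda" { found = 1 } END { exit !found }' "$modules_file"
}

# Hypothetical helper: load the module only when it is missing.
# Needs to run as root, because modprobe requires privileges.
ensure_coda_module() {
    coda_module_loaded || modprobe coda
}
```

Calling `ensure_coda_module` as root before starting venus would cover the case described above, where opening /dev/cfs0 does not autoload the module.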

stephenrkell commented 4 years ago

I just checked out, built and installed coda 7.1.0, successfully set up the server, and have now hit this issue. However, I do have the coda kernel module installed.

Could this be a version mismatch perhaps? I notice my coda kernel module is only version 6.6.

$ sudo strace -vf venus -f /var/cache/venus.cache -d 9  2>&1 | grep mount
[pid  3223] mount("coda", "/coda", "coda", MS_MGC_VAL|MS_NOSUID|MS_NODEV|MS_NOATIME, "\1") = -1 ENOENT (No such file or directory)
[pid  3223] write(1, "19:04:39 CHILD: mount system cal"..., 5919:04:39 CHILD: mount system call failed. Killing parent.
[pid  3223] write(2, "19:04:39 CHILD: mount system cal"..., 59) = 59
$ sudo lsmod | grep coda
coda                   40295  2 
$ sudo netstat -anp | grep coda
udp        0      0 0.0.0.0:41787           0.0.0.0:*                           28810/codatunneld
udp        0      0 0.0.0.0:2432            0.0.0.0:*                           2255/codasrv    
unix  2      [ ACC ]     STREAM     LISTENING     13354783 28812/venus         /var/run/coda-client.mariner
unix  3      [ ]         SEQPACKET  CONNECTED     13353759 28810/codatunneld   

jaharkes commented 4 years ago

@stephenrkell Yes, there may be a version mismatch. The Linux kernel updated timestamps to use 64-bit values on 32-bit systems; nothing changed on 64-bit because it was already using long integers. When this change was introduced, the patch included changes to map back to 32-bit timestamps in the Coda API on 32-bit systems, but we decided it was better to change our interface between the kernel and the client so that we now use 64-bit everywhere, just like the kernel. This also means that we're a step closer to allowing a common (32-bit) userspace to run on both 32- and 64-bit kernels.

So on a 32-bit system you do need a newer kernel (5.4+), or you can use dkms to build an updated module from the source at https://github.com/cmusatyalab/linux-coda/. However, nothing actually changed on 64-bit systems, and the client should not have a problem running with the 6.6 kernel module.
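
As a rough illustration of that rule, the check could be scripted as below. This is a sketch under assumptions: the function name `needs_updated_coda_module` is hypothetical, it only looks at the word size and the kernel release string, and it cannot tell whether an updated module has already been built with dkms on an older kernel.

```shell
# Hypothetical helper (not part of Coda). Arguments: the system word size
# (32 or 64) and a kernel release string such as "4.19.0-6-686".
# Succeeds (exit status 0) when, per the rule above, an updated module is
# needed: a 32-bit system running a kernel older than 5.4.
needs_updated_coda_module() {
    bits="$1"
    release="$2"
    # 64-bit systems are unaffected.
    [ "$bits" = "32" ] || return 1
    major="${release%%.*}"        # text before the first dot
    rest="${release#*.}"          # text after the first dot
    minor="${rest%%[!0-9]*}"      # leading digits of the remainder
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -lt 4 ]; }
}
```

One possible invocation, assuming `getconf LONG_BIT` reflects the word size of the running system, is `needs_updated_coda_module "$(getconf LONG_BIT)" "$(uname -r)"`.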

If the mount fails like this, you can check whether there are more detailed error messages, either from the kernel side using 'sudo dmesg | tail' or from the coda-client side using 'tail /var/log/coda/venus.err'.
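
Since venus.err can hold output from several runs, it may help to look only at the most recent one. The following is a sketch, not a Coda tool: the function name `last_venus_run` is hypothetical, and it assumes every run begins with a "Coda Venus, version" banner line, as in the logs in this thread.

```shell
# Hypothetical helper (not part of Coda): print a venus log from its last
# "Coda Venus, version" banner line to the end of the file. If no banner
# line is present, the whole file is printed.
last_venus_run() {
    errlog="${1:-/var/log/coda/venus.err}"
    awk '/Coda Venus, version/ { buf = "" }
         { buf = buf $0 "\n" }
         END { printf "%s", buf }' "$errlog"
}
```

Running `last_venus_run` next to 'sudo dmesg | tail' puts the client-side and kernel-side messages for the same failed mount side by side.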

stephenrkell commented 4 years ago

Thanks for the swift reply! I'm on a 64-bit host so no joy there. My venus.err is not very illuminating.

Date: Wed 12/04/2019

18:05:23 Coda Venus, version 7.1.0
18:05:23 /var/lib/coda/LOG size is 21246976 bytes
18:05:23 /var/lib/coda/DATA size is 84977802 bytes
18:05:23 /var/lib/coda/DATA size is 84977802 bytes
18:05:23 Loading RVM data
18:05:23 Last init was Tue Dec  3 19:04:38 2019
18:05:23 Last shutdown was dirty
18:05:24 Starting RealmDB scan
18:05:24        Found 1 realms
18:05:24 starting VDB scan
18:05:24        2 volume replicas
18:05:24        0 replicated volumes
18:05:24        0 CML entries allocated
18:05:24        0 CML entries on free-list
18:05:24 starting FSDB scan (4508, 102400) (25, 75, 4)
18:05:24        0 cache files in table (0 blocks)
18:05:24        4508 cache files on free-list
18:05:24 starting HDB scan
18:05:24        0 hdb entries in table
18:05:24        0 hdb entries on free-list
18:05:24 Mounting root volume...
18:05:24 CHILD: mount system call failed. Killing parent.

codatunneld: starting

but I'll try the latest module sources later today and report back.