drakkar-lig / walt-python-packages

Home of walt-node, walt-server, walt-client and walt-common python packages.
https://walt-project.liglab.fr
BSD 3-Clause "New" or "Revised" License

Nodes could provide a persistent data directory #18

Closed eduble closed 4 years ago

eduble commented 6 years ago

Node init-scripts could mount at /persist a writeable NFS share provided by the server for each node. That would allow nodes to save experiment data that does not fit the concept of logs well.

Users could then retrieve this data using the existing command:

walt node cp <node>:/persist/<path> <local_path>

audeoudh commented 5 years ago

We also need to patch walt node cp to detect this special path and retrieve the data directly from the server. Otherwise, the server will ask the node for the files, and the node will in turn ask the server for them.
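A minimal sketch of that detection, written in shell for illustration only (the actual walt code is Python and may do this differently). The function name persist_server_path is hypothetical, and the /var/lib/walt/nodes/<node>/persist layout is an assumption taken from the commands discussed later in this thread; it also assumes the per-node directory is named after the identifier used before the colon. It does not guard against ".." components in the path.

```shell
# Hypothetical helper: map "<node>:/persist/<path>" to a server-side path.
# Prints the server path and returns 0 when the spec targets /persist;
# returns 1 otherwise, so the caller can fall back to asking the node.
persist_server_path() {
    spec="$1"
    nodes_dir="${2:-/var/lib/walt/nodes}"   # assumed server-side layout
    node="${spec%%:*}"                      # text before the first colon
    path="${spec#*:}"                       # text after the first colon
    [ "$node" != "$spec" ] || return 1      # no colon: not a node path
    case "$path" in
        /persist)   echo "$nodes_dir/$node/persist" ;;
        /persist/*) echo "$nodes_dir/$node/persist/${path#/persist/}" ;;
        *)          return 1 ;;
    esac
}
```

For example, persist_server_path 'node1:/persist/data/run.csv' would print /var/lib/walt/nodes/node1/persist/data/run.csv, while a path outside /persist makes the function return 1.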

audeoudh commented 5 years ago

Hmm… Some simple manual tests seem to show that I need to install a package to mount the NFS share. On Debian, it's nfs-common. Can we require that on minimalist / default images?

eduble commented 5 years ago

Ideally, it should be done in the walt-init script, that is, in the minimal (busybox-based) environment we have at that point. This would keep it always enabled, whatever the image. The root filesystem is already an NFS mount, but the kernel handles it on its own, given the nfsroot boot parameter. I am checking whether we can mount another NFS share with busybox.

eduble commented 5 years ago

I guess we will have to test it. We can use a virtual node on the dev server oeno and modify the walt-init script in its mounted image directory (mount | grep waltplatform/pc-x86-64-default, file at bin/walt-init). Inserting an sh command into the walt-init bootup procedure makes it easy to test whatever command we need.

eduble commented 5 years ago

Note: each virtual node runs in a screen session, so by connecting to it you can easily follow the bootup mechanism. In this case, you should see the bootup paused and your sh session ready for testing.

audeoudh commented 5 years ago

OK. It works. The init filesystem knows the nfs kernel module. Main instructions:

walt-server# mkdir /var/lib/walt/nodes/<node>/persist
walt-server# echo '/var/lib/walt/nodes/<node>/persist 192.168.152.0/24(rw,sync,no_root_squash,no_subtree_check)' >> /etc/exports
walt-server# exportfs -a

Then, in walt-init, just after "Mounting filesystem union" and before "Re-mounting over the union":

mkdir -p fs_union/persist
mount -t nfs <server_ip>:/var/lib/walt/nodes/<node>/persist -o rw,relatime,vers=3,nolock fs_union/persist

Then, /persist is available and correctly mounted on the node.
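One caveat with the server-side steps: appending with >> adds a duplicate line to /etc/exports each time the command is re-run. A possible idempotent variant is sketched below; the function name ensure_export is hypothetical, and the export options are copied from the command above.

```shell
# Hypothetical helper: append an export entry only if it is not already there.
# Arguments: exports file, directory to export, allowed client network.
ensure_export() {
    exports_file="$1"
    dir="$2"
    clients="$3"
    line="${dir} ${clients}(rw,sync,no_root_squash,no_subtree_check)"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$exports_file" 2>/dev/null \
        || echo "$line" >> "$exports_file"
}
```

It could be called as ensure_export /etc/exports /var/lib/walt/nodes/<node>/persist 192.168.152.0/24, followed by exportfs -a as before.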

audeoudh commented 5 years ago

Confirmed working on a RPi.

eduble commented 5 years ago

OK. It works

Cool. :) So that means we just have to modify server source code:

eduble commented 5 years ago

Getting the mount URL in walt-init:

root@node-pc32-1:~# set -- $(cat /proc/cmdline | tr ' ' '\n' | grep nfsroot | tr '[=,]' ' ')
root@node-pc32-1:~# echo $2/../persist
192.168.172.1:/var/lib/walt/nodes/00:13:72:6d:25:87/fs/../persist
root@node-pc32-1:~#
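
The same derivation can be sketched as a function using only shell parameter expansion, which makes it easy to try against a sample command line. This is not necessarily what walt-init ended up doing: the function name persist_url_from_cmdline is hypothetical, and it assumes the nfsroot value has the form <server>:<path>/fs[,options] with no trailing slash, as in the output above. Instead of appending /../persist, it replaces the last path component with persist.

```shell
# Hypothetical helper: derive the persist share URL from a kernel command line.
# Prints "<server>:<node_dir>/persist" and returns 0, or returns 1 if the
# command line has no nfsroot= parameter.
persist_url_from_cmdline() {
    cmdline="$1"
    for arg in $cmdline; do              # split the command line on whitespace
        case "$arg" in
            nfsroot=*)
                root="${arg#nfsroot=}"   # drop the "nfsroot=" prefix
                root="${root%%,*}"       # drop NFS options after the first comma
                echo "${root%/*}/persist"  # swap the last component (fs) for persist
                return 0
                ;;
        esac
    done
    return 1
}
```

On a node, it would be called as persist_url_from_cmdline "$(cat /proc/cmdline)".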

eduble commented 4 years ago

Fixed by e1f2d9c40f3e03c66d47088f0129ec528a3abb4a.