bobafetthotmail / folder2ram

mount those folders to ram without losing access to their counterpart on disk!
GNU General Public License v3.0

Partial syncing during reboot or shutdown #19

Closed alewaste closed 2 years ago

alewaste commented 2 years ago

I'm using folder2ram on Debian 10 (OMV5 with the flash plug-in). I added some folders to the standard configuration: some are on the root filesystem and two are on /srv/dev-disk-by-uuid-*****/.

I experienced a problem during reboot or shutdown.

If I sync manually with folder2ram -syncall, the sync completes successfully. After a reboot or shutdown, however, I find that some folders were not synced. In particular, I noticed that the last folders in the list are the ones affected. During the shutdown process, systemd seems to run folder2ram and unmount filesystems at the same time, so some of the folder unmounts are marked as [FAILED].

To mitigate the problem I set a daily sync, but how can I solve this?

bobafetthotmail commented 2 years ago

Try editing the file /usr/lib/systemd/system/folder2ram_shutdown.service so that it reads as follows:


[Unit]
Description=folder2ram systemd service
Before=umount.target
DefaultDependencies=no
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/folder2ram -umountall
[Install]
WantedBy=shutdown.target reboot.target halt.target

then write this command systemctl reload daemon

for systemd to reload the config and see the change

Then try rebooting and see if you still have the issue.

This should ask systemd to run the service BEFORE the system partitions are unmounted.

If you confirm that this solves the issue I'll update folder2ram to do this by default

alewaste commented 2 years ago

I tested it, but with the command systemctl reload daemon

I get this error: Failed to reload daemon.service: Unit daemon.service not found.

I tried systemctl reload folder2ram_shutdown.service but I get: "Failed to reload folder2ram_shutdown.service: Job type reload is not applicable for unit folder2ram_shutdown.service."

I tried to reboot twice, but the problem is still there.

EDIT: I worked around the command error by running systemctl disable folder2ram_shutdown.service before changing the file and systemctl enable folder2ram_shutdown.service afterwards, but the problem is still there.
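
The two errors above come from the word order. "daemon-reload" is itself a systemctl verb that makes systemd re-read unit files from disk, whereas "reload <unit>" asks a running unit to reload its own configuration, which this unit does not support because it defines no ExecReload=. A minimal sketch of the intended sequence follows; the SYSTEMCTL override and the function name are illustrative additions, not part of folder2ram.

```shell
#!/bin/sh
# SYSTEMCTL can be pointed at a stub to try this without touching a live system.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Re-read unit files from disk, then print the unit as systemd now sees it.
# $1: unit name, e.g. folder2ram_shutdown.service
apply_unit_change() {
    "$SYSTEMCTL" daemon-reload && "$SYSTEMCTL" cat "$1"
}
```

Running apply_unit_change folder2ram_shutdown.service as root should print the edited unit if systemd picked up the change.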

bobafetthotmail commented 2 years ago

Ok, I will have to do tests on my own. What is the size of the folders you added?

du -hc /path/to/folder
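
To check every configured folder in one go instead of one path at a time, something like the sketch below could be used. It assumes the config file has one entry per line in the form "tmpfs <mount point> <options>" and that mount points contain no spaces; the helper names are made up for this example, and the config path in the usage note is an assumption that may differ on your system.

```shell
#!/bin/sh
# Print the mount points listed in a folder2ram-style config file:
# only lines whose first field is "tmpfs" count, so comments and
# blank lines are skipped.
list_mountpoints() {
    awk '$1 == "tmpfs" { print $2 }' "$1"
}

# Print the size of each configured folder that exists on this machine.
report_sizes() {
    list_mountpoints "$1" | while IFS= read -r dir; do
        if [ -d "$dir" ]; then
            du -sh "$dir"
        fi
    done
}
```

For example: report_sizes /etc/folder2ram/folder2ram.conf (path assumed). While the folders are mounted in RAM, the sizes reported are those of the data that has to be written back at shutdown.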

alewaste commented 2 years ago

Here is more information, because I ran some other tests. I tested this systemd unit:

[Unit]
Description=folder2ram systemd service
Before=shutdown.target reboot.target halt.target umount.target
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/folder2ram -umountall

[Install]
WantedBy=shutdown.target reboot.target halt.target

I'm using this .conf file:

#<type>     <mount point>       <options>

#Originals
tmpfs       /var/log
tmpfs       /var/tmp
tmpfs       /var/lib/openmediavault/rrd
tmpfs       /var/spool
tmpfs       /var/lib/rrdcached/
tmpfs       /var/lib/monit

#ViewPower temp folder
tmpfs       /root/viewpower/log
tmpfs       /root/viewpower/tomcat/logs
tmpfs       /root/viewpower/datas
tmpfs       /root/viewpower/config

#Redis
tmpfs       /var/lib/redis

#PHP
tmpfs       /var/lib/php/sessions

#Samba
tmpfs       /var/lib/samba

#Docker
tmpfs       /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/docker-data/volumes/portainer_data
tmpfs       /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/docker-data/containers

#NextCloud-MariaDb
tmpfs       /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql

#Cache
tmpfs       /var/cache

Before the change (adding umount.target), /var/cache and the folders under /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/ were not syncing. Now folder2ram syncs all the folders on the root filesystem, but none of the folders under /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/ are synced during reboot/shutdown.

Those folders are created with openmediavault on a mdadm raid5. So is it possible that they are unmounted by another service?

Anyway, here the size of the folder /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql:

4.0K    /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql/performance_schema
236M    /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql/nextcloud
1.5M    /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql/mysql
421M    /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql
421M    total

The Docker folders are less than 10 MB in size.

EDIT: I ran more tests, and now the last 4 folders again don't sync.

alewaste commented 2 years ago

I ran more tests on a virtual machine and can confirm the problem. It is not related to paths on another disk, as I thought before.

I used only the standard paths plus a few others on the root filesystem. I created /var/lib/mysql with about 500 MB of data and the problem appeared now and then. After increasing it slightly to about 600 MB, the problem is always there. It seems systemd starts unmounting all filesystems before folder2ram finishes its job. So a larger folder size means a longer execution time, and the filesystems are unmounted before folder2ram has synced all the folders.

The behaviour isn't easy to predict. I saw that it also changes if I change the folder order. Were you able to reproduce this error?

bobafetthotmail commented 2 years ago

Yes I reproduced the problem, thanks for the information you provided. I tested with a 1.3GB file and I saw the problem too.

I think I found a fix.

Please edit /usr/lib/systemd/system/folder2ram_shutdown.service

[Unit]
Description=folder2ram systemd service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStop=/sbin/folder2ram -umountall

[Install]
WantedBy=multi-user.target

And then write

systemctl daemon-reload

and then make sure the service is started with

systemctl status folder2ram_shutdown.service

● folder2ram_shutdown.service - folder2ram systemd service
     Loaded: loaded (/lib/systemd/system/folder2ram_shutdown.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sat 2021-11-13 15:35:55 CET; 1min 1s ago

nov 13 15:35:55 albyVPN systemd[1]: Finished folder2ram systemd service.

If it is not started, start it

systemctl start folder2ram_shutdown.service

Now if you reboot, it should sync properly, and if you add a big file you will notice that the reboot pauses for a while to wait for folder2ram to finish syncing.

If you confirm this fixes the issue, I will implement this change and release a new version so OpenMediaVault can pull it and build a new package.
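
This approach relies on how systemd treats a oneshot unit with RemainAfterExit=yes: once started it stays "active (exited)", and its ExecStop= command runs when the unit is stopped at shutdown. A unit that was never started has nothing to stop, so its ExecStop= would not run. The sketch below checks for that case; the ensure_active helper and the SYSTEMCTL override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# SYSTEMCTL can be pointed at a stub to try this without touching a live system.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Make sure the unit is active so that its ExecStop= will run at shutdown.
# $1: unit name, e.g. folder2ram_shutdown.service
ensure_active() {
    if "$SYSTEMCTL" is-active --quiet "$1"; then
        echo "$1 is active"
    else
        echo "$1 is not active, starting it"
        "$SYSTEMCTL" start "$1"
    fi
}
```

Usage: ensure_active folder2ram_shutdown.service, run as root.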

alewaste commented 2 years ago

I tested the new code. It works if I use folders inside the root filesystem. I put about 3 GB of data there without problems. I only added

TimeoutSec=infinity

because timeout was 1m and 30s.

Now the problem remains, if I use a folder outside root filesystem like /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql.

I enabled journal persistence and got:

-- Logs begin at Sat 2021-11-13 22:22:18 CET, end at Sun 2021-11-14 01:12:25 CET. --
Nov 14 01:06:38 NAS-Test systemd[1]: Started folder2ram shutdown systemd service.
Nov 14 01:09:47 NAS-Test folder2ram[3511]: will now stop all mountpoints
Nov 14 01:09:47 NAS-Test systemd[1]: Stopping folder2ram shutdown systemd service...
Nov 14 01:09:47 NAS-Test folder2ram[3511]: stop /var/tmp
Nov 14 01:09:48 NAS-Test folder2ram[3511]: stop /var/lib/openmediavault/rrd
Nov 14 01:09:48 NAS-Test folder2ram[3511]: stop /var/spool
Nov 14 01:09:48 NAS-Test folder2ram[3511]: stop /var/lib/rrdcached
Nov 14 01:09:50 NAS-Test folder2ram[3511]: stop /var/lib/monit
Nov 14 01:09:50 NAS-Test folder2ram[3511]: stop /var/lib/redis
Nov 14 01:09:50 NAS-Test folder2ram[3511]: stop /var/lib/php/sessions
Nov 14 01:09:50 NAS-Test folder2ram[3511]: stop /srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql
Nov 14 01:10:21 NAS-Test folder2ram[3511]: umount: /var/folder2ram/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql: not mounted.
Nov 14 01:10:21 NAS-Test folder2ram[3511]: stop /var/lib/samba
Nov 14 01:10:22 NAS-Test folder2ram[3511]: stop /var/cache
Nov 14 01:10:22 NAS-Test systemd[1]: folder2ram_shutdown.service: Succeeded.
Nov 14 01:10:22 NAS-Test systemd[1]: Stopped folder2ram shutdown systemd service.

During the shutdown process, /var/folder2ram/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql is unmounted too early.

Twice I got:

Nov 13 22:31:46 NAS-Test folder2ram[3672]: stop /srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql
Nov 13 22:32:06 NAS-Test folder2ram[3672]: rsync: write failed on "/var/folder2ram/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql/CD.rar"
Nov 13 22:32:06 NAS-Test folder2ram[3672]: rsync error: error in file IO (code 11) at receiver.c(374) [receiver=3.1.3]

CD.rar is one of the files I added.
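
To list the affected mount points from a longer journal, a small filter like the following may help. It assumes the "umount: <path>: not mounted." message format shown in the log above; the function name is made up for this example.

```shell
#!/bin/sh
# Read journal lines on stdin and print every path that umount reported
# as "not mounted", i.e. a mount that was already gone when folder2ram
# tried to stop it.
already_unmounted() {
    sed -n 's/.*umount: \(.*\): not mounted\.$/\1/p'
}
```

With a persistent journal, the messages from the previous boot can be fed to it with journalctl -b -1 -u folder2ram_shutdown.service | already_unmounted.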

bobafetthotmail commented 2 years ago

I only added TimeoutSec=infinity because timeout was 1m and 30s.

Ok, you have more data than many users so the default timeout of systemd is too small. I don't think "infinity" is a good default, though: if something goes wrong it will cause a permanent lockup on shutdown. I need to expose this as an option in the config and let the end user set their own timeout. If you really have a lot of data you can set a large value or "infinity" and accept the risk, while most users are not exposed to a lockup.
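
Until such an option exists, one way to raise the limit without editing the unit file is a systemd drop-in. The sketch below sets only TimeoutStopSec=, since the long-running command here is the ExecStop= one; the drop-in file name timeout.conf and the helper name are assumptions for illustration.

```shell
#!/bin/sh
# Write a drop-in that raises only the stop timeout of the shutdown unit.
# $1: timeout in systemd time syntax, e.g. 10min
# $2: systemd config root (defaults to /etc/systemd/system)
write_timeout_dropin() {
    dropin_dir="${2:-/etc/systemd/system}/folder2ram_shutdown.service.d"
    mkdir -p "$dropin_dir" || return 1
    printf '[Service]\nTimeoutStopSec=%s\n' "$1" > "$dropin_dir/timeout.conf"
}
```

For example, write_timeout_dropin 10min followed by systemctl daemon-reload. A finite value keeps the shutdown from hanging forever if the sync gets stuck, and a drop-in lives outside the unit file, so it should not be lost if the unit file itself is rewritten.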

Now the problem remains, if I use a folder outside root filesystem like /srv/dev-disk-by-uuid-49c88944-a730-4738-802f-fe66dacff34f/MariaDb/mysql.

Hm, so it is related to what you said above: "folders are created with openmediavault on a mdadm raid5". I will have to check how OMV mounts the storage arrays.

You can try modifying /usr/lib/systemd/system/folder2ram_shutdown.service like this, using "RequiresMountsFor". This can work if OMV uses systemd to manage its mount points.


[Unit]
Description=folder2ram systemd service
RequiresMountsFor=/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStop=/sbin/folder2ram -umountall

[Install]
WantedBy=multi-user.target

alewaste commented 2 years ago

Ok, you have more data than many users so the default timeout of systemd is too small.

Right now I'm testing on a virtual machine to avoid problems with the real one, so I added more data only for the test. On my real machine I don't think I need to increase the timeout.

I tested RequiresMountsFor=/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057 but it doesn't work.

I ran many other tests to understand this issue. It is not related to the size of the folders, because I emptied them and left only a few files. I was also wrong when I said it's related to folders outside the root filesystem. Using the virtual machine, I was able to add a single drive and mount it with OMV: the problem doesn't appear on that drive.

I tried mounting "manually" through fstab, and wherever I had an mdadm raid I had the problem too.

I tried to make the folder2ram shutdown service run earlier by using WantedBy=graphical.target instead of WantedBy=multi-user.target. That way the folder2ram shutdown starts very early, but I still always saw an "Unmounted /var/folder2ram/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql" message before it.

I think there is another service that stops only "/var/folder2ram/srv/dev-disk-by-uuid-5fc3f1ae-1eea-4738-80b4-5ff79114e057/MariaDb/mysql" before folder2ram does, but I haven't worked out which one.

alewaste commented 2 years ago

I ran more tests with a clean installation of Debian without OMV.

I reproduced the folder size issue with the old code, and solved it with your new code:

[Unit]
Description=folder2ram systemd service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStop=/sbin/folder2ram -umountall

[Install]
WantedBy=multi-user.target

I can't reproduce the problem with folders outside the root filesystem, either on a single drive or on an mdadm array.

Then I installed OMV on top of Debian and the problem appeared as I described above. So this last issue is related to OMV, and the new service code solves the size issue.

Now I have to work out how to solve it with OMV.

alewaste commented 2 years ago

I finally found the culprit! It's blk-availability.service! With that service disabled, folder2ram works as expected. Now I have to understand its role in OMV and how to avoid this conflict.

If I stop this service from the console I get the same problem: it unmounts the folders I mentioned above (/var/folder2ram/my-path). If I then run "folder2ram -umountall", I get the same error as during the shutdown process.

alewaste commented 2 years ago

I have found another small bug with the change to the systemd shutdown service. At the first installation, after execute "folder2ram -enablesystemd", the service is stopped. So it has to be started manually to avoid a problem at the first shutdown (it doesn't unmount the folders).

alewaste commented 2 years ago

It seems to be a known bug, like here. A workaround is add "After=blk-availability.service" at the shutdown service. I tested and it seems work.

An alternative approach is to add "Before=local-fs-pre.target" to blk-availability.service. This seems to work as well.

The only thing that remains: if I stops that service in a running machine, it breaks folder2ram.

bobafetthotmail commented 2 years ago

A workaround is add "After=blk-availability.service" at the shutdown service. I tested and it seems work.

Seems to work here too, so I added this to the shutdown service

At the first installation, after execute "folder2ram -enablesystemd", the service is stopped.

This is intended. The "move to tmpfs folder" operation is best done on system startup, BEFORE all applications are using files in that folder. I added a message to explain this in the -enablesystemd command. The recommended action is to reboot after that command, but if you want you can start the services manually.

The only thing that remains: if I stops that service in a running machine, it breaks folder2ram.

that service seems to be an "always on" service, so it's probably not a big issue.

I have also added a timeout setting, see here in the default config file: https://github.com/bobafetthotmail/folder2ram/blob/master/debian_package/sbin/folder2ram#L465. Since you are using OMV with the flash plugin, your config file will not have it; you can just add the line #TIMEOUT=2m at the end of the file (and adjust the time from 2m to however many minutes you want). Then, as the comment in the original file says, refresh the systemd unit files with folder2ram -enablesystemd.

alewaste commented 2 years ago

that service seems to be an "always on" service, so it's probably not a big issue.

You are right, it's an "always on" service like folder2ram_shutdown. I found a solution too, but I'm not sure there aren't other side effects. I added "After=blk-availability.service" and "BindsTo=blk-availability.service". When I stop blk-availability.service, folder2ram_shutdown is stopped before it. This way blk-availability.service can run its script to completion and fully deactivate the mdadm array. So folder2ram can exit gracefully, but I had to reassemble the mdadm array and remount it.

This is intended. The "move to tmpfs folder" operation is best done on system startup, BEFORE all applications are using files in that folder. I added a message to explain this in the -enablesystemd command.

This let me a doubt: If it's better to start folder2ram BEFORE all applications, at shutdown it's better stop after them. So would it be better to make folder2ram_shutdown wanted by basic.target?

I have also added a timeout setting, see here in the default config file

Thank you. Before these modifications, I thought the infinite timeout was intended behavior. Now I know it was a side effect of "DefaultDependencies=no".

alewaste commented 2 years ago

I ran a few tests using basic.target and sometimes saw strange behavior. multi-user.target is safer. I think we can close this issue.

bobafetthotmail commented 2 years ago

I found a solution too, but I'm not sure there aren't other side effects. I added "After=blk-availability.service" and "BindsTo=blk-availability.service".

thanks, will add this too
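
For reference, systemd applies ordering in reverse at shutdown: a unit declared After=blk-availability.service is stopped before blk-availability.service, and BindsTo= additionally stops the unit whenever the unit it is bound to stops. One way to confirm that systemd actually loaded both settings is sketched below; has_dependency and the SYSTEMCTL override are illustrative names, not folder2ram features.

```shell
#!/bin/sh
# SYSTEMCTL can be pointed at a stub to try this without touching a live system.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Succeed if systemd lists unit $3 in property $2 (e.g. After, BindsTo)
# of unit $1. The property value is a space-separated list of unit names.
has_dependency() {
    for dep in $("$SYSTEMCTL" show -p "$2" --value "$1"); do
        if [ "$dep" = "$3" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, has_dependency folder2ram_shutdown.service After blk-availability.service && echo ordered, and the same call with BindsTo in place of After.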

This let me a doubt: If it's better to start folder2ram BEFORE all applications, at shutdown it's better stop after them.

Yes, this is better in theory. In practice I ran into issues like the ones you saw. This is why I use 2 systemd units, one for startup and one for shutdown, so I can use two different targets and conditions.

Most applications will receive the stop command at the same time as folder2ram (since most services will be on multi-user.target) because systemd runs parallel processes, so this should not cause problems for those services.