volumio / Build

Buildscripts for Volumio System
GNU General Public License v2.0

swapon fails with "Invalid argument" #399

Closed kionez closed 3 years ago

kionez commented 4 years ago

While debugging some strange issues on my PiZero installation (mpd fails to start), I noticed an error in the system log. When the system detects less than 512MiB of RAM, it creates a 512MiB swapfile at /data/swapfile and tries to enable it with "/sbin/swapon /data/swapfile". This fails because /data is on an overlayfs, which doesn't support reading pages, hence the -EINVAL error and the "Invalid argument" message.

swapon: /data/swapfile: swapon failed: Invalid argument

Edit: the file involved is: volumio/bin/dynswap.sh
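
For illustration, the RAM check described above could look roughly like the following. This is a minimal sketch, not the actual contents of volumio/bin/dynswap.sh; the function names `mem_total_kib` and `needs_swap` are hypothetical, and only the 512MiB threshold comes from the report.

```shell
#!/bin/sh
# Sketch only: NOT the real dynswap.sh. Function names are hypothetical.

# Print MemTotal in KiB from a /proc/meminfo-style file.
mem_total_kib() {
    awk '/^MemTotal:/ { print $2; exit }' "${1:-/proc/meminfo}"
}

# Succeed (exit status 0) when total RAM is below the threshold in MiB.
needs_swap() {
    meminfo_file="${1:-/proc/meminfo}"
    threshold_mib="${2:-512}"
    total_kib=$(mem_total_kib "$meminfo_file")
    [ "$total_kib" -lt $((threshold_mib * 1024)) ]
}
```

A caller would run something like `needs_swap /proc/meminfo 512 && echo "swap needed"` before creating and enabling the swapfile.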

volumio commented 4 years ago

Interesting. This must be the result of something that changed on new kernels. Then we need to mount the swap directly on the mountpoint and not on the overlayfs. What do you think?

kionez commented 4 years ago

Yes, creating the swapfile elsewhere, on a filesystem that supports it, should fix it, but I don't know where. I don't know very well how the filesystems are organized. I think /media/volumio_data/ would be a good place (ext4), but it seems to be created only on SD cards > 16GiB (I know, I have to study the install scripts before talking :) ).

kionez commented 4 years ago

> [cut] but seems created only in SD > 16GiB

I don't know if I was wrong; I can't find anything in the build scripts... I have a Raspberry Pi 3B+ with a 32GiB SD card where the mmcblk0p3 partition is formatted and mounted, and a Raspberry Pi Zero with a 16GiB SD card without a p3 partition (I've just flashed the latest image and the 2.692 image)... Maybe I have to dig deeper into the code and avoid going off-topic :)

volumio commented 4 years ago

@gkkpch what do you think? Can we mount the swap directly on the partition without mounting it to overlayfs?

gkkpch commented 4 years ago

I see no reason why not. But, as this should be generic for all boards, we cannot assume that the data partition is on /dev/mmcblk0p3. dynswap.sh should determine which partition /data is mounted on, create a new mount point on /, e.g. /swap , mount the partition there and create the swapfile.
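
The lookup step suggested here (find which device backs a given path) can be sketched as follows. This is a hedged illustration only: `mount_for_path` is a hypothetical helper, it assumes a /proc/mounts-style table, and it ignores escaped characters such as `\040` in mount points. If the result is an overlay rather than a block device, the underlying partition would have to be identified some other way.

```shell
#!/bin/sh
# Sketch only: mount_for_path is a hypothetical helper, not part of dynswap.sh.

# Print "DEVICE FSTYPE" for the mount containing a path, choosing the
# longest matching mount point in a /proc/mounts-style table.
mount_for_path() {
    path="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v p="$path" '
        {
            mp = $2
            # Match the root, the path itself, or a parent directory of it.
            if (mp == "/" || p == mp || index(p, mp "/") == 1) {
                # ">=" lets a later mount on the same point shadow an earlier one.
                if (length(mp) >= best_len) {
                    best_len = length(mp)
                    best = $1 " " $3
                }
            }
        }
        END { if (best != "") print best }
    ' "$mounts_file"
}
```

With a block device in hand, the remaining steps from the comment above (create a mount point such as /swap, mount the device there, create the swapfile) would be run as root; they are left out here because they depend on how each board lays out its partitions.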

kionez commented 4 years ago

Another idea: creating a dedicated swap partition should be faster and safer than using a swapfile, but it would probably work only on new installations, because re-partitioning a running system could be a problem.

(Anyway, now I'm trying to find why mmcblk0p3 is created but not mounted on my PiZero)

macmpi commented 3 years ago

Just bumping into this as per this thread.

While swap usage is a workaround for likely Node memory over-consumption, I'm curious to understand which change/evolution broke legacy swap file allocation.

@gkkpch @volumio : Is the /data construct on overlayfs relatively recent? (I kind of lost track over the last months.) Can you point me to the relevant changes if you remember, so I can see how I may help resolve this?

macmpi commented 3 years ago

lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0         7:0    0 333.6M  0 loop /static
mmcblk0     179:0    0  14.9G  0 disk
|-mmcblk0p2 179:2    0   2.3G  0 part /imgpart
|-mmcblk0p3 179:3    0  12.5G  0 part
`-mmcblk0p1 179:1    0    61M  0 part /boot

df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p2  2.2G  1.1G  1.1G  51% /imgpart
/dev/loop0      334M  334M     0 100% /static
overlay          13G  573M   11G   5% /
devtmpfs        219M     0  219M   0% /dev
tmpfs           233M     0  233M   0% /dev/shm
tmpfs           233M  8.7M  224M   4% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           233M     0  233M   0% /sys/fs/cgroup
tmpfs           233M   28K  233M   1% /tmp
tmpfs           233M     0  233M   0% /var/spool/cups
tmpfs            20M   12K   20M   1% /var/log
tmpfs           233M     0  233M   0% /var/spool/cups/tmp
/dev/mmcblk0p1   61M   59M  1.9M  97% /boot
tmpfs            47M     0   47M   0% /run/user/1000

How about moving the swapfile to /imgpart instead (a simple change in /bin/dynswap.sh)? Tested and working. Maybe we'd need to reserve a bit more space for partition #2, just in case...

volumio commented 3 years ago

@macmpi unfortunately this won't be possible as it will break current installations. IMHO we should understand why it fails on overlay and fix that. Any ideas?

macmpi commented 3 years ago

Thought 512MB extra burden on /imgpart could be acceptable without affecting image update process; your call.

As for the swap/overlay issue: has anything changed in the overlay setup since July 2017? (The time reference is my last change to the dynswap scripts, which I tested OK at that time.) I had a similar problem in my berryboot scripts (not overlayfs but aufs), and elected to move swap elsewhere in that environment. I thought it was an aufs-only problem at the time (Oct 2017), but it now seems not... Some clues here: using dd instead of fallocate may be worth trying, if it's not a fundamental overlayfs issue.
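
The dd-instead-of-fallocate idea can be sketched as below. This is an assumption-laden illustration: `make_swapfile` is a hypothetical helper, and nothing in this thread establishes that a dd-created file makes swapon succeed on overlayfs. The point of dd is only that it writes every block, so the file has no holes.

```shell
#!/bin/sh
# Sketch only: make_swapfile is a hypothetical helper, not part of dynswap.sh.

# Create a fully written file of size_mib MiB with dd (no holes),
# then restrict it to its owner, as swap files should be.
make_swapfile() {
    swap_path="$1"
    size_mib="$2"
    dd if=/dev/zero of="$swap_path" bs=1048576 count="$size_mib" 2>/dev/null || return 1
    chmod 600 "$swap_path"
}

# The caller (as root) would then run, for example:
#   mkswap /path/to/swapfile && swapon /path/to/swapfile
```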

To really figure out which change broke things, it would take quite extensive trial-and-error installs on a PiZero, with images from Volumio 2.246 (July 2017 - should work?) to now. This would help narrow down from which exact version (and therefore which changes) it started failing, and identify the cause(s).

Honestly, this whole swap requirement is a bad workaround for Node's memory consumption... Another way to look at it would be to better profile/manage Node's memory consumption (and possible leaks): this would be beneficial on all devices, and relax things on the lowest-specced ones.

volumio commented 3 years ago

My call is that an update to the kernel broke it.

BTW, it's not a workaround for NODEJS memory consumption, but rather a requirement of Volumio and how it's set up. Since we use squashfs, the whole rootfs needs to be decompressed in RAM (this is to avoid SD card corruption), hence we have at least 400MB (the actual size of Volumio's ROOTFS) of RAM occupied just to boot the system...

macmpi commented 3 years ago

> My call is that an update to the kernel broke it.

Possible, but hard to tell for sure (and to work around accordingly) without in-depth bisecting and trials with many images. July 2017 2.246 was on 4.9.x; trying with 2.389 (first 4.14.x ?) and 2.657 (first 4.19.x ?) may give some initial clues?

> BTW, it's not a workaround for NODEJS memory consumption, but rather a requirement for Volumio and how it's setup.

Indeed, the setup puts significant stress on RAM for good reasons (an Alpine base would eat up far less ;p). But still, the fact that those RAM-limited devices mostly die during big db indexing points to some suspect runtime memory management (db not progressively written to file, duplication in RAM, etc...). top shows Node consumes a significant chunk of available RAM (22%) even when idle.

macmpi commented 3 years ago

Incidentally, I bumped into this zram-related discussion about how piCorePlayer benefits from it on small-RAM devices, as its base OS runs from such a RAM drive. Could be a nice option to consider, now that zram is back in the base kernel and seems to bring good savings with a small penalty.

volumio commented 3 years ago

That might be a really interesting option. How would you envision the use of zram on low-memory devices like the pi zero in Volumio?

macmpi commented 3 years ago

I guess zram would be used on any device: all would benefit from it, and higher-end multi-core ones would see no penalty anyway. I've never used it before, so it's a new frontier for me... Reviewing the piCorePlayer code can help, along with many other articles.

Now, such a zram optimization may not fully remove the need for swap, nor cancel the need to investigate/monitor overall Node RAM usage. Maybe a separate issue could be opened to investigate zram optimization?
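
As a starting point for such an investigation, here is a dry-run sketch of a zram swap setup. It only prints the commands instead of running them. The helper name `zram_plan`, the 50% sizing, and the swap priority of 100 are all assumptions for illustration, not values taken from piCorePlayer or Volumio.

```shell
#!/bin/sh
# Sketch only: zram_plan is a hypothetical helper; the 50% sizing and
# priority 100 are illustrative assumptions.

# Print (do not run) commands that would create a zram swap device
# sized as a percentage of MemTotal.
zram_plan() {
    meminfo_file="${1:-/proc/meminfo}"
    percent="${2:-50}"
    total_kib=$(awk '/^MemTotal:/ { print $2; exit }' "$meminfo_file")
    size_bytes=$((total_kib * percent / 100 * 1024))
    echo "modprobe zram"
    echo "echo $size_bytes > /sys/block/zram0/disksize"
    echo "mkswap /dev/zram0"
    echo "swapon -p 100 /dev/zram0"
}
```

Printing the plan first makes it easy to review the computed size on a given board before anything is applied as root.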

macmpi commented 3 years ago

@kionez you may want to try the proposed fix https://github.com/volumio/Build/pull/426 Works for me.

kionez commented 3 years ago

Sorry, I don't have a Volumio installation on a PiZero anymore, I can't try it. I read the fix, AFAIK it works fine, maybe I would have used a lower value of swappiness, but it's a question of testing :) Should I close this bug report?