Closed ThomasKaiser closed 2 years ago
I'm not sure how the reset is related to the so-called race condition you think you are spotting. The real issue with the reset is that any programs accessing a folder a zram device was mounted to must be stopped before the zram service is stopped. If they are not, or if a phantom process is still cleaning up its files while the zram service is stopping, the reset of the device will fail because the device will still appear to be in use by a program, even though the folder has already been unmounted.
Secondly, when rebooting, the zram module is not always unloaded and then reloaded. This means that if the operating system did not allow for proper cleanup of zram devices, they will still be left over on startup.
In the end I'm not really sure what you're getting at here. Adding a 0.2 second delay between trying to find devices is unlikely to accomplish anything, and I don't believe that the issue is actually in finding the devices, as the only real issue would be if all your RAM was already used.
Whole zram-status.log.
I ran the test with sleep 0.2 between both zramctl --find calls between 'Fri Apr 22 20:18:09 CEST 2022' and 'Sat Apr 23 07:22:03 CEST 2022'. This resulted in 320 boot attempts, and every time both /dev/zram0 and /dev/zram1 were created successfully. As per /etc/ztab, all 320 times /dev/zram0 was a 750MB swap device and /dev/zram1 a 150MB log partition:
NAME ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lzo-rle 150M 95.5M 32.1M 44.8M 4 /opt/zram/zram1
/dev/zram0 lzo-rle 750M 4K 87B 4K 4 [SWAP]
Then I removed the sleep statement from /usr/local/sbin/zram-config and let another 64 boot attempts run (starting from 'Sat Apr 23 07:24:09 CEST 2022' in the log). /dev/zram0 was created all 64 times, but /dev/zram1 was created only 38 times (so it was missing 26 times).
In 21 of these 64 attempts creation of the swap device failed, so the first successfully created zram device, /dev/zram0, ended up being the 150MB log partition.
That's what I'm actually reporting: a freshly installed Ubuntu 22.04 armhf on a Raspberry Pi 4 running the latest zram-config version sporadically fails to create zram devices (both zram devices were created successfully in only 60% of tests). A slight delay between the 1st and 2nd zramctl --find fixes this behaviour. Confirmed with 300+ boot tests.
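For anyone who wants to experiment with this, here is a minimal sketch of a related workaround: retrying a failing command after a short pause. Note that this is not what was tested above (the tests only put a fixed sleep between the two calls), and retry_with_delay is a hypothetical helper name, not something that exists in zram-config.

```shell
#!/bin/sh
# Hypothetical helper (not part of zram-config): run a command up to
# ATTEMPTS times, pausing DELAY seconds between failed attempts.
# Usage: retry_with_delay ATTEMPTS DELAY COMMAND [ARGS...]
retry_with_delay() {
    local attempts_left="$1"
    local delay="$2"
    shift 2
    while [ "$attempts_left" -gt 0 ]; do
        # Stop at the first successful run of the command.
        if "$@"; then
            return 0
        fi
        attempts_left=$((attempts_left - 1))
        # Only pause if another attempt will follow.
        if [ "$attempts_left" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Possible use (assumption: needs root and a loaded zram module,
# size and algorithm are just the values from this report):
# retry_with_delay 3 0.2 zramctl --find --size 750M --algorithm lzo-rle
```

Whether a retry actually helps with the 'Device or resource busy' failure is untested here; the only measured fix in this thread is the fixed delay between the two calls.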
This is neither about shutdown behaviour nor about 'resetting' zram devices, and also not about zram devices surviving reboots (Huh?). It is just zramctl --find failing sporadically when /usr/local/sbin/zram-config start is called at boot, which then looks like this, for example:
root@ubuntu:/home/ubuntu# systemctl status zram-config.service
● zram-config.service - zram-config
Loaded: loaded (/etc/systemd/system/zram-config.service; enabled; vendor preset: enabled)
Active: active (exited) since Sat 2022-04-23 12:02:05 CEST; 3min 9s ago
Docs: https://github.com/ecdye/zram-config/blob/main/README.md
Process: 657 ExecStart=/usr/local/sbin/zram-config start (code=exited, status=0/SUCCESS)
Main PID: 657 (code=exited, status=0/SUCCESS)
CPU: 339ms
Apr 23 12:02:05 ubuntu systemd[1]: Starting zram-config...
Apr 23 12:02:05 ubuntu systemd[1]: Started zram-config.
Apr 23 12:02:05 ubuntu zram-config[670]: zram-config start 2022-04-23-12:02:05-CEST
Apr 23 12:02:05 ubuntu zram-config[657]: createZdevice: Beginning creation of zDevice.
Apr 23 12:02:05 ubuntu zram-config[723]: zramctl: /dev/zram0: failed to reset: Device or resource busy
Apr 23 12:02:05 ubuntu zram-config[732]: createZdevice: Failed to find an open zram device. Exiting!
The zramctl: /dev/zram0: failed to reset: Device or resource busy output is not me doing a shutdown or manually 'resetting' something but just the result of your script using RAM_DEV="$(zramctl --find --size "$DISK_SIZE" --algorithm "$ALG" | tr -dc '0-9')" in the createZdevice function in line 9 (so the very first zramctl --find call already generates this failed to reset: Device or resource busy message).
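To illustrate the quoted line: tr -dc '0-9' deletes everything except digits, so a device path like /dev/zram0 is reduced to its index, and an empty result means zramctl printed no device path on stdout. A small sketch follows; zram_index and require_index are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reduce a zram device path to its numeric index,
# mirroring the tr -dc '0-9' step in the quoted line.
zram_index() {
    printf '%s' "$1" | tr -dc '0-9'
}

# An empty index means no device path was returned, so report failure
# instead of continuing with an empty device number.
require_index() {
    local idx
    idx="$(zram_index "$1")"
    if [ -z "$idx" ]; then
        echo "no zram device found" >&2
        return 1
    fi
    printf '%s\n' "$idx"
}
```

For example, zram_index /dev/zram1 yields 1, while require_index "" fails, which corresponds to the 'Failed to find an open zram device' message in the log above.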
At 12:30 I deleted /usr/local/share/zram-config/log/zram-config.log (too much clutter in there) and started another run with an unmodified /usr/local/sbin/zram-config.
Stopping at 14:00, this resulted in 43 boot attempts: /dev/zram0 was successfully created 43 times but /dev/zram1 only 18 times (in other words, in roughly 60% of boots one zram device creation failed).
Complete contents of /usr/local/share/zram-config/log/zram-config.log: 1st failure recorded at 3rd boot: zram-config start 2022-04-23-12:36:29-CEST.
I now understand what you are getting at; however, this is still very odd to me as it only appears to affect Jammy. I have never seen this on any other distro. I wonder if this is a bug in the particular version of zramctl or the kernel that they use. I also wonder if perhaps it would be better to just switch back to configuring it manually without using zramctl if that is what is causing the issue.
@ThomasKaiser I have switched to using the sysfs attributes to configure zram devices. Would you please check and see if that resolves your issue?
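For readers unfamiliar with the sysfs route, here is a minimal sketch of what configuring a zram device through its attributes can look like. This is not the actual zram-config code; configure_zram and its sysfs-root argument are assumptions made so the function can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# Sketch: configure an existing zram device through sysfs attributes.
# The first argument is a hypothetical override for the sysfs root;
# on a real system it would be /sys (and writing there needs root).
configure_zram() {
    local sysfs_root="$1"
    local index="$2"
    local algorithm="$3"
    local disksize="$4"
    local dev_dir="$sysfs_root/block/zram$index"

    # Refuse to continue if the device directory does not exist.
    [ -d "$dev_dir" ] || return 1

    # The compression algorithm has to be set before disksize,
    # because writing disksize initialises the device.
    printf '%s\n' "$algorithm" > "$dev_dir/comp_algorithm" || return 1
    printf '%s\n' "$disksize" > "$dev_dir/disksize" || return 1
}

# Possible use on a real system (assumption: zram0 already exists):
# configure_zram /sys 0 lzo-rle 750M
```

On a real system additional devices can be requested by reading /sys/class/zram-control/hot_add, which returns the index of the newly added device.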
Giving it a try now and am reporting back in an hour...
I tested in an automated way as before and stopped after 17 successful zram device creations.
It looked like this all the time:
root@ubuntu:/home/ubuntu# zramctl
NAME ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram2 lzo-rle 150M 83.3M 27.3M 39.4M 4 /opt/zram/zram2
/dev/zram1 lzo-rle 750M 4K 87B 4K 4 [SWAP]
In other words: every time an additional /dev/zram0 was also created but not used:
root@ubuntu:/home/ubuntu# ll /dev/zram*
brw-rw---- 1 root disk 252, 0 Apr 25 21:43 /dev/zram0
brw-rw---- 1 root disk 252, 1 Apr 25 21:43 /dev/zram1
brw-rw---- 1 root disk 252, 2 Apr 25 21:43 /dev/zram2
Ok, that's an easy problem to solve.
Thank you! The modprobe parameter works and zram device creation starts with /dev/zram0 now. Timing behaviour comparing zramctl vs. sysfs also remained the same.
Now this PR has become kinda useless, so I'm simply suggesting once more an adjustment of function names and log messages, since 'zswap' references are misleading when we solely deal with 'zram'. Both are entirely different things and IMO shouldn't be confused: http://ix.io/3Wh0/diff
The function naming was not my choice; that is how it has been since I took over the project. I left it that way because it helps to distinguish between the different types of zram that we set up. I personally would prefer to leave it because those referenced functions are only used to set up a zram swap. Unless you have a compelling argument as to why we shouldn't, I would prefer to leave it.
Quoting @StuartIanNaylor: zram & zswap are apples & pears. But a few sentences away it's confusing: 'StuartIanNaylor/zram-config does far more than just zswap'.
The only persons affected by internal function names are those interested in how stuff works and looking inside. And I don't think it's a good service to them creating the impression zram and zswap would be the same since they really aren't. But an awful lot of people do already confuse both and this should stop.
They are not the same. I agree with that, but quite frankly the difference is that zram is the underlying Linux kernel name for a compressed RAM block device. Whereas zswap is built upon the underlying zram to set up a swap space that uses a zram block device. I don't see a problem in this case because the function name is correct, as it is setting up a swap that uses zram. The lines are a little blurry but really zswap is just referring to zram being used as a swap space, which often means the zram block device is configured a little differently because of the nature of the swap usage. And yes, I do realize that zswap is a separate kernel feature, but there is AFAIK very little performance difference in the RPi type application that this program targets.
As an aside, the statement that 'StuartIanNaylor/zram-config does far more than just zswap' is really just another way of saying that zram-config allows you to configure all types of zram devices, whereas his zswap program only configures swap.
EDIT: Also I think that Stuart's reasoning when naming the function was just shortening the function name because it made it easier than zramSwap to clarify.
> Whereas zswap is built upon the underlying zram to set up a swap space that uses a zram block device
Ok, so you're just one of those persons I'm talking all the time about...
This is from a system where I switched from zram to zswap over a year ago (some commercial crap application featuring a memory leak made this necessary):
root@athene:/sys/kernel/debug/zswap# cat /proc/swaps
Filename Type Size Used Priority
/dev/sdb1 partition 16775164 2092744 -2
root@athene:/sys/kernel/debug/zswap# zramctl
root@athene:/sys/kernel/debug/zswap# grep -R .
same_filled_pages:51224
stored_pages:373418
pool_total_size:699469824
duplicate_entry:0
written_back_pages:87560
reject_compress_poor:0
reject_kmemcache_fail:0
reject_alloc_fail:0
reject_reclaim_fail:7
pool_limit_hit:44768
I even had to write my own monitoring plugin since nothing existed for zswap.
With zswap, of course, no zram device is used since it's something entirely different. Even the basic strategies differ completely: zram is avoiding 'swap to disk' (nobody uses a backing file) and zswap is 'making swap on disk more efficient'.
This VM with zram needed 8GB and locked up from time to time. Now with zswap (and restarting the crappy application once a week) we're fine with half the assigned RAM.
BTW... this sentence of yours already really scared me:
> when rebooting, the zram module is not always unloaded and then reloaded. This means that if the operating system did not allow for proper cleanup of zram devices, they will still be left over on startup.
And another BTW: Ubuntu Desktop 22.04 for RPi was said to ship with zswap enabled by default: https://www.cnx-software.com/2022/01/13/ubuntu-22-04-zswap-raspberry-pi-4-2gb-ram/ (you should read this also if you want to get a brief start on how zram and zswap differ).
Using zswap instead of zram on the RPi 4 might (now) make some sense as long as the device where the swap files reside is not an SD card (but a USB3 SSD or an NVMe SSD with Compute Module 4).
At least my fresh Jammy install doesn't use zswap (the zswap parameters are missing in cmdline.txt), but good luck from now on if you still insist that zram and zswap are more or less the same thing. They are not; they're even mutually exclusive.
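A quick way to tell which of the two a given system is using might look like the following sketch. The helper names are hypothetical, and the zswap check only recognises the zswap.enabled=1 spelling on the kernel command line, which is an assumption about how it was enabled.

```shell
#!/bin/sh
# Hypothetical helpers for telling zram swap and zswap apart.

# Succeeds if the kernel command line stored in FILE enables zswap.
# Only the literal zswap.enabled=1 form is recognised here.
cmdline_enables_zswap() {
    grep -qwF 'zswap.enabled=1' "$1"
}

# Succeeds if the swap table in FILE lists at least one zram device.
swaps_use_zram() {
    grep -q '^/dev/zram' "$1"
}

# Possible use on a running system:
# cmdline_enables_zswap /proc/cmdline && echo "zswap requested at boot"
# swaps_use_zram /proc/swaps && echo "swapping to a zram device"
```

On the zswap system shown above, the second check would fail because /proc/swaps only lists /dev/sdb1, while on the Jammy test install it would succeed because /dev/zram0 is the swap device.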
> > Whereas zswap is built upon the underlying zram to set up a swap space that uses a zram block device
> Ok, so you're just one of those persons I'm talking all the time about...
No, I understand how my comment made it seem like that, but that was just poor wording on my part. That is why I tried to edit it to clarify that I believe Stuart originally named it that because it was shorter and simpler.
> With zswap, of course, no zram device is used since it's something entirely different. Even the basic strategies differ completely: zram is avoiding 'swap to disk' (nobody uses a backing file) and zswap is 'making swap on disk more efficient'.
> This VM with zram needed 8GB and locked up from time to time. Now with zswap (and restarting the crappy application once a week) we're fine with half the assigned RAM.
I get it. I haven't tried zswap on an RPi in some time because of the issues you have mentioned and SD card wear. You are correct that it is probably a little confusing for the enterprising user; however, at the moment it seems simpler to keep it the same instead of changing the function name, as what you suggested in this PR would be only partially correct.
> BTW... this sentence of yours already really scared me:
> > when rebooting, the zram module is not always unloaded and then reloaded. This means that if the operating system did not allow for proper cleanup of zram devices, they will still be left over on startup.
I understand why that might scare you. I haven't seen this behavior in a while and I think it was fixed, but I have observed it before.
As reported earlier but at the wrong location, I experienced strange behaviour when testing the most recent zram-config version with freshly released Ubuntu 22.04/armhf on a Raspberry Pi 4.
It seems there's some sort of race condition when using zramctl --find, and it clearly doesn't help when there's no delay between the two zramctl --find calls. A quick check with a sleep 0.2 in between seemed to fix it.
To get more reliable data I added an automatic reboot to the install to check zram-config behaviour:
A test with an unaltered /usr/local/sbin/zram-config showed the following behaviour walking through 10 boot attempts:
/dev/zram0 has been created while recording two zramctl: /dev/zram0: failed to reset: Device or resource busy errors
/var/log/zram-status.log:
Adding a simple sleep 0.1 between the two zramctl --find calls improves things: 6 errors still occur, but in 11 tests both zram devices could always be created:
The 0.1 second delay between both zramctl calls seems to do the job. Now for some extra safety headroom, testing with sleep 0.2:
Again both zram devices were created 11 times, and 8 times an error occurred at the first zramctl --find attempt (most probably not related to the delay but 'result variation', since it most likely happens at the 1st zramctl --find call).
I'll run the test for the next few hours (~30 reboots per hour) and report back.