MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0

DietPi-Software | Box86/Box64: Assure memory requirements are met #5483

Open lukaszsobala opened 2 years ago

lukaszsobala commented 2 years ago

Required Information

Additional Information (if applicable)

Steps to reproduce

  1. Select Box86 in dietpi-software on a 512 MB RAM device with zram enabled (1/2 memory)
  2. Select "Install software"

Expected behaviour

Actual behaviour

Extra details

Maybe a warning about the RAM requirements could be added to the installation somehow.

MichaIng commented 2 years ago

Thanks for reporting.

Hmm, not sure how best to solve this. In other cases we raise the swap size when an install requires more RAM than is available, but with zram this has a limit, of course. This is something we should take into account for other installs as well.

Actually, an additional temporary swap file would be an idea, and not a bad one for other cases either, since it is not persistent and the user's previous swap size choice stays what it was. The only question is where to create it, preferably on a USB drive when available.

lukaszsobala commented 2 years ago

Hello,

yes, this is a good idea. I will try to attach an additional 4 GB swapfile (on the SD card, as it is a pi zero) and see if it fails again. Box64 compilation peaks at about 2 GB of memory from what I observed.

This seems to be an edge usage case anyway!

MichaIng commented 2 years ago

This seems to be an edge usage case anyway!

The RPi Zero 2 W is the first SBC with so little RAM but sufficient CPU power and the right architecture to run Box86 reasonably. So at least it is a newer case 🙂. But as said, it makes sense anyway to have a good solution for insufficient RAM on installs (for compiling) that ideally doesn't persistently change the system setup and is reliable.

lukaszsobala commented 2 years ago

Indeed, with an additional 4 GB of swap space the compilation succeeded. It took more than 8 hours, however - and I'm not sure about using an SD card for lots of random writes like this (WD Purple cards have wear leveling so mine should be OK).

MichaIng commented 2 years ago

8 hours on RPi Zero 2 W? I would have expected much less 🤔. Indeed, when we generate a new swap file for RAM-intensive builds, we should at least prefer to generate it on a USB drive instead of the SD card.

lukaszsobala commented 2 years ago

Yeah, I think it takes so long mainly because of the heavy swapping needed. The USB port on RPi Zero 2 W is only USB 2.0 so it probably wouldn't improve the performance by much (I might be wrong).

One might try to also benchmark how the number of workers affects the compilation time and the RAM usage. Intuitively, reducing RAM usage=reducing swapping=better performance, but who knows?

MichaIng commented 2 years ago

Yes, reducing the build jobs should be done first, before creating a swap file. So we need to find out how much RAM the build requires with 1 job.

Joulinar commented 2 years ago

I did some testing on an RPi 3B+ as I don't have a Zero 2 W. (my bad 😄 )

Running with a single process uses around 500 MB of memory. But I need to say, it was an empty system. Just telegraf (80 MB RAM) was running to get some monitoring data. Runtime (including installation of all dependencies) was around 40 minutes.

Below is the link to the monitoring snapshot (valid for 7 days): https://snapshots.raintank.io/dashboard/snapshot/v5JOX8zb7993Qy3AB66lgBOL72J6UC6m

Using all 4 processes results in a total memory consumption of 1.5 GB RAM + swap.

MichaIng commented 2 years ago

Okay, so we should assure 1 GiB RAM then and calculate with 512 MiB + 512 MiB per job to know when to reduce the number of jobs. This is very similar to PaperMC, if I recall correctly. We should create a function which takes base + per-job memory consumption as input and returns the number of jobs, and which, where applicable, also adds a temporary swap file when one is required even for a single job.
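The rule of thumb above could be sketched as a small shell function. This is only an illustration under stated assumptions, not DietPi's actual implementation: the function name `calc_build_jobs` and its argument order are hypothetical, and the numbers are the estimates from this thread (512 MiB base plus 512 MiB per job).

```shell
# Hypothetical helper, not part of DietPi: derive a safe number of build jobs.
# Usage: calc_build_jobs <base_MiB> <per_job_MiB> <available_MiB> <max_jobs>
# Prints the job count on stdout. Returns 1 when not even a single job fits,
# which is the case where a temporary swap file would be needed.
calc_build_jobs() {
    local base="$1"
    local per_job="$2"
    local avail="$3"
    local max_jobs="$4"

    # How many jobs fit into what is left after the base consumption
    local jobs=$(( (avail - base) / per_job ))

    # Never use more jobs than requested (e.g. the number of CPU cores)
    if [ "$jobs" -gt "$max_jobs" ]; then
        jobs="$max_jobs"
    fi

    # Fall back to one job, but signal that memory is insufficient
    if [ "$jobs" -lt 1 ]; then
        echo 1
        return 1
    fi

    echo "$jobs"
}
```

With 1600 MiB of RAM + swap and 4 cores, `calc_build_jobs 512 512 1600 4` prints `2`, which could then be passed to `make` as `-j2`. A non-zero return code would be the caller's cue to add swap space before building.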

lukaszsobala commented 2 years ago

This looks like a great algorithm, thank you!!

I'll check it again on RPi Zero 2 W and send my result here. I'm trying to make a lightweight script to log memory usage to a file, so far I came up with: watch -n 10 'free|tail -n 2|tr "\n" "\t" >>somefile.log && printf "\n" >>somefile.log' which seems kinda stupid but works 😁 (Can't find telegraf in the repositories and it seems too RAM-heavy for the meager Zero 2).
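A plain loop that reads /proc/meminfo could serve the same purpose as the `watch` one-liner and adds a timestamp per sample. This is a minimal sketch: the function names `mem_sample` and `log_mem` are made up for illustration, and "used RAM" is taken as MemTotal minus MemAvailable, which is an assumption about what is worth logging.

```shell
# Hypothetical helpers for lightweight memory logging.

# Print "<used RAM MiB><TAB><used swap MiB>" from a meminfo-style file.
# $1: file to read, defaults to /proc/meminfo
mem_sample() {
    awk '
        /^MemTotal:/     { mt = $2 }
        /^MemAvailable:/ { ma = $2 }
        /^SwapTotal:/    { st = $2 }
        /^SwapFree:/     { sf = $2 }
        END { printf "%d\t%d\n", (mt - ma) / 1024, (st - sf) / 1024 }
    ' "${1:-/proc/meminfo}"
}

# Append one line per interval to a log file until interrupted:
# "<Unix timestamp><TAB><used RAM MiB><TAB><used swap MiB>"
# $1: log file, $2: interval in seconds (default 10)
log_mem() {
    while :; do
        printf '%s\t%s\n' "$(date +%s)" "$(mem_sample)" >> "$1"
        sleep "${2:-10}"
    done
}
```

It could be started in the background before the build, e.g. `log_mem somefile.log 10 &`, and stopped with `kill` afterwards.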

MichaIng commented 2 years ago

Yes, indeed, Telegraf is probably a bit heavy with only 512 MiB RAM. It requires a TIG stack, i.e. InfluxDB and Grafana as time series database and web UI. When those are installed via dietpi-software, Telegraf can be installed via apt install telegraf, configured to use InfluxDB (I think it does OOTB) and used in Grafana to show system stats nicely. I don't remember exactly, but there was a template for system stats in Grafana, wasn't there?

Joulinar commented 2 years ago

There you go https://medium.com/@dorian599/iot-raspberry-pi-container-and-system-monitoring-with-influxdb-telegraf-and-grafana-a1767c38c109

At least running grafana+influxdb+telegraf would be quite some overhead. Usually, inside my testing lab, I move grafana+influxdb to a VM or different SBC to avoid this.

lukaszsobala commented 2 years ago

Hi! I recompiled box86 with one worker and additional swap space on the pizero2w. I used the command above (executed inside screen) for monitoring. It was much quicker (<60 min) and the max memory usage was actually only 101% of my original zram (216 MiB vs 213 MiB). Here is the graph:

[image: box86_compile_pizero2w]

Since I am not sure which options dietpi-software is selecting I used the ones for "other" hardware: cmake .. -DARM_DYNAREC=ON -DNOGIT=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo and make CFLAGS='-g0'.

MichaIng commented 2 years ago

On RPi Zero 2 W we use the RPi 3 flags (same SoC):

cmake .. -DRPI3=1 -DNOGIT=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo
make CFLAGS='-g0 -O3' "-j$(nproc)"
strip --remove-section=.comment --remove-section=.note box86
make install

lukaszsobala commented 2 years ago

OK, so I recompiled it with:

cmake .. -DRPI3=1 -DNOGIT=1 -DCMAKE_BUILD_TYPE=RelWithDebInfo
make CFLAGS='-g0'

Here is the result: [image: box86_compile_pizero2w_2]

I modified the zram from the default 213 MiB like so: /boot/dietpi/func/dietpi-set_swapfile 384 /dev/zram0 and didn't use any additional swap space. The compilation time was also reduced to ~52 min. My max zram usage was 291 MiB (so there is some headroom with this setting) but it might have been a bit more because I recorded every 5 seconds. So creating a large zram definitely helps!

MichaIng commented 2 years ago

Thanks for the results. So 1024 MiB RAM + swap seems like a good failsafe minimum value for 1 job, including zram as swap space, which contributes roughly 50% of its usage to RAM.
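Such a minimum could be checked before the build starts. The following is a rough sketch, not DietPi code: the function names `total_mem_mib` and `has_min_mem` are hypothetical, and it simply adds MemTotal and SwapTotal from /proc/meminfo without correcting for the RAM that zram itself consumes.

```shell
# Hypothetical pre-build check: is RAM + swap at least the required amount?

# Print total RAM + swap in MiB from a meminfo-style file.
# $1: file to read, defaults to /proc/meminfo
total_mem_mib() {
    awk '/^(MemTotal|SwapTotal):/ { sum += $2 }
         END { print int(sum / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed if RAM + swap is at least $1 MiB.
# $1: required MiB, $2: meminfo-style file (defaults to /proc/meminfo)
has_min_mem() {
    [ "$(total_mem_mib "${2:-/proc/meminfo}")" -ge "$1" ]
}
```

A caller could use it as `has_min_mem 1024 || echo 'Less than 1024 MiB RAM + swap, build may fail'`, and decide there whether to reduce jobs or add swap space.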

lukaszsobala commented 2 years ago

Hello,

I'm wondering if this is fixed? I could close the issue.

Cheers!

MichaIng commented 2 years ago

Ah, I didn't change anything on the Box86/64 installs yet. Let's keep this open, so I won't forget.

lukaszsobala commented 1 year ago

Hi!

I just tried to install the new box86 (v0.3.0) and the build failed - had to remove the -O3 -j4 flags, and only then the build succeeded. I also have the max amount of zram set:

$ zramctl
NAME       ALGORITHM DISKSIZE  DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 lzo-rle       424M 59.9M 24.2M 25.7M       4 [SWAP]

This zram is probably too big - it's what it looked like while compiling. But yes, I can see this is the milestone for v8.16 now :)

Oh, at 34% I got:

[ 34%] Built target PRINTER
[ 34%] Generating ../src/git_head.h
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Scanning dependencies of target box86

but this is probably because you are using the nogit version.