idnovic closed this issue 5 years ago.
Better idea: the above question still stands, but also: how do I reset precached completely? Maybe something got in the way, since I have been running the dev version for a few days now.
I found that precached uses 9-20% CPU at the moment. The CPU usage was not as high before.
Maybe it is optimizing trace logs in the background? Can you reproduce this with log_level set to debug and show some logs?
What command can I run to find out what precached is calculating?
You can use perf top to investigate which functions are the main culprits. Since you are running the release build of precached (without debug symbols), you may find perf trace better suited; it will show you which system calls precached executes.
$ sudo perf top -p $(pgrep precached)
$ sudo perf trace -p $(pgrep precached)
how do I reset precached completely?
To completely reset the state of precached you may run these commands:
$ sudo systemctl stop precached
$ sudo iotracectl clear
$ sudo precachedctl plugins hot-applications optimize
The screenshot shows the first command, and the log is just 10 seconds of the second command.
Thank you for investigating this! The screenshot looks interesting indeed!
Looks like an issue in a dependency of precached (the sys_info crate). Let's see whether I can reproduce this...
I have now deleted the state of precached as you described above. I still need to do a restart.
So it seems that you need to revert the systemd private tmp settings. After I cleared precached and restarted no preloading was done. I reverted the config and now the preloading works.
I do not understand. Why can precached not access its own tmp dir?
It seems that it was able to access the old files, maybe even write to them, but without existing files it did not work.
So it seems that you need to revert the systemd private tmp settings. After I cleared precached and restarted no preloading was done. I reverted the config and now the preloading works.
OK, I reverted the change!
I do not understand. Why can precached not access its own tmp dir?
I will investigate this, trying to re-enable all systemd protection features that are applicable.
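For reference, the currently effective value of that option can be queried with a plain systemctl call (this is not a precached tool):
$ systemctl show -p PrivateTmp precached.service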
Happened again. Newest precached version, and I set it to use the precached user.
I can tell now that it has nothing to do with /app/*/
I forgot that precached runs as its own user now. I need to use sudo with precachedtop. I removed the last sentence of the above comment about precachedtop not working.
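That is, it works for me when started as:
$ sudo precachedtop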
Something new happened. Precached now uses 25% CPU directly after power-up. Yesterday I stopped the precached service because of the high CPU use. Today precached seems to hiccup at the same point.
Does this mean a single trace log is responsible?
This keeps repeating. I currently cannot use precached at all. CPU usage is at 25% after startup and not going anywhere.
I will reset it again later. Edit: After I created log files.
reproduce this with log_level set to debug
How can I set the log level to debug?
Sorry for the inconvenience, I will prepare a new Copr build of precached today!
To set the log level to debug or trace, please edit the file log4rs.yaml:
$ sudoedit /etc/precached/log4rs.yaml
Please find line 30 and change it to:
level: debug
or, for even more log output, to:
level: trace
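After saving the file, restart the daemon so that the new log level is definitely picked up (plain systemctl, nothing precached-specific):
$ sudo systemctl restart precached.service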
I just pushed precached-1.7.1-0, the Copr build should be finished soon!
Please note that this build ships with a pretty invasive change to the memory accounting. To fully support ZRAM, swap is now included in the calculations (added to the installed RAM size).
As a consequence, one needs to re-tune the parameters set in precached.conf. On my development laptop with 4 GiB (3.71 GiB usable) of RAM and 4.72 GiB of swap space (ZRAM and a swap partition) I am now using these values:
# Memory management settings
available_mem_critical_threshold=45
available_mem_upper_threshold=40
available_mem_lower_threshold=25
Precached now performs the following calculations:
total_memory = ram_size + swap_size
mem_used = (ram_size - ram_used) + swap_used
percentage = mem_used * 100 / total_memory
So if you have 4 GiB of RAM and approx. 4 GiB of swap, you just need to halve your available_mem_* percentage settings to account for the changes in precached-1.7.1.
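To see the RAM and swap totals that go into this calculation on your own system, the standard free utility is enough (this is a generic command, not part of precached):
$ free -h --total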
I want to try the log file thing in the next hour. Where do I find the log file after I set the log level to debug?
Edit: found it.
# logfile:
#   kind: file
#   path: "/var/log/precached.log"
#   encoder:
#     pattern: "{d(%Y-%m-%d %H:%M:%S %Z)} [{l}]: {M}:{L} - {m}{n}"
You will find it in the systemd journal. You may use this command to dump it to a file:
$ sudo journalctl -b -u precached > precached-debug-$(date --iso).log
I will use journald. Good point. Thank you.
It only works with precached.service instead of just precached for me, but you will have your log in a moment. It is 26 MB.
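So the variant of the above command that worked for me was:
$ sudo journalctl -b -u precached.service > precached-debug-$(date --iso).log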
precached-debug-log-2019-07-11.log.zip
I will reset precached now and update it to the newest version including necessary config changes.
I do not know if this is related, but I have the following set:
vm.swappiness=85
vm.vfs_cache_pressure=75
vm.dirty_background_ratio = 3
vm.dirty_ratio = 7
vm.overcommit_memory = 2
vm.overcommit_ratio=75
vm.dirty_background_bytes=16777216
vm.dirty_bytes=16777216
I wanted to start gnome-twitch (a twitch.tv player) during my current test and I was not able to. The error was:
FATAL: Could not allocate gigacage memory with maxAlignment = 34359738368, totalSize = 103079215104.
Workaround: the GIGACAGE_ENABLED=no environment variable.
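For reference, the workaround means launching the program with the variable set, e.g. (assuming the binary is simply called gnome-twitch):
$ GIGACAGE_ENABLED=no gnome-twitch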
It is possible that precached faced the same error. The responsible sysctl settings are vm.overcommit_ratio and vm.overcommit_memory.
I found a bug report for atril, the document viewer of the MATE desktop. Same error, but it happened because atril wanted to mmap 128 GB of memory (the system had 16 GB).
The above sysctl settings are an experiment for my memory-limited laptop. I think you have a better understanding of whether precached needs memory overcommit and whether my settings are interfering.
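In case it helps, the currently active values can be dumped with plain sysctl:
$ sysctl vm.overcommit_memory vm.overcommit_ratio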
Regarding your virtual memory subsystem configuration:
vm.swappiness=85
vm.vfs_cache_pressure=75
vm.dirty_background_ratio = 3
vm.dirty_ratio = 7
vm.overcommit_memory = 2
vm.overcommit_ratio=75
vm.dirty_background_bytes=16777216
vm.dirty_bytes=16777216
This looks reasonable except for:
vm.overcommit_memory = 2 # <- will lead to application crashes
I don't think that this will work out as intended! You are risking lots of software failing. Overcommit is basically required by every process that calls fork(). If the process has a reasonably large vmsz (size of the allocated virtual address space), it will crash without Linux being able to overcommit memory. Additionally, most memory allocators have a large amount of overhead, in that they consume huge amounts of address space (not memory, though). The Java VM may serve as an example here.
Precached therefore needs memory overcommit enabled too. The precached process' vm size directly after rebooting is currently larger than my installed RAM size.
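You can check this yourself with a generic ps invocation (VSZ is the virtual size, RSS the resident set, both in KiB):
$ ps -o vsz,rss,comm -p $(pgrep precached)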
Okay, I will change this to:
vm.overcommit_memory = 0
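For the record, the change can be applied at runtime and made persistent like this (standard sysctl usage; the drop-in file name is just an example):
$ sudo sysctl -w vm.overcommit_memory=0
$ echo 'vm.overcommit_memory = 0' | sudo tee /etc/sysctl.d/99-overcommit.conf  # example file name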
An explanation for the bug could be that, before 1.7.1, precached did not unload the files it had cached for me and grew bigger and bigger (like Java), and at a certain point it was not able to allocate enough memory. But I cannot see any related output; precached did not seem to know what was wrong.
Precached allocated up to 1.7 GB of RAM before, never over 2 GB. So it did work with more than 1 GB.
Maybe add a memory information profiler to precached; this could help with error reports. Just collect important information at startup and put it into precached's log file.
CommitLimit = (total_RAM - total_huge_TLB) * overcommit_ratio / 100 + total_swap
So this is how Linux calculates the commit limit in mode 2. I will try to increase this first before I change it back to 0. In mode 0 the OOM killer gets used, and I wanted to avoid that.
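The kernel exports the resulting limit in /proc/meminfo, so the effect of changing the ratio can be checked directly:
$ grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo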
In mode 0, calls of mmap(2) with MAP_NORESERVE are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed".
This does not sound very useful. Maybe instead, when my memory is full, I should simply not be able to launch more things. I am not sure; I do not have any running programs that I want to get killed.
vm.overcommit_ratio=75
So far I have found out that I need to set this to at least 100 while I have mode 2 set up. Currently, I am virtually shrinking my available memory. Also, GIGACAGE_ENABLED seems security-related, so I am not going to use the workaround.
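For illustration with assumed numbers (not measured on my machine): with 4 GiB of RAM and 4 GiB of swap, a ratio of 75 gives CommitLimit = 4 GiB * 75 / 100 + 4 GiB = 7 GiB, so 1 GiB of the real 8 GiB could never be committed; a ratio of 100 restores the full 8 GiB.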
I changed it back to mode 0 (default). vm.overcommit_ratio does not apply in mode 0. Did you find anything useful in the log files I uploaded?
I changed it back to mode 0 (default)
Yes, that's the only reasonable choice for desktop systems.
Did you find anything useful in the log files I uploaded?
I did not find anything useful, so it is unclear what exactly happened during the load spikes.
I believe you can close this issue now. I do not have problems with version 1.7.1.