Closed alspitz closed 2 years ago
If the blocking is troubling you, wouldn't it be the better choice to add a timeout option to filereader::read_all, and cache the last value in-between?
All that assuming that reading this file can be performed without blocking after some time. Could you test this please?
Can it be done without introducing a new option?
Which files take a long time to read? Can you do:
for f in /sys/class/thermal/*/temp; do time grep '' $f; done
apppstatus.cc avoids reopening the same file and uses lseek. Would that help?
The question is if the read takes so much time, or the opening and closing of the file.
Could you post images of your extended graph?
Using lseek
doesn't seem to speed up the read.
Running this Python code:
import os, random, time
import numpy as np
p = "/sys/class/thermal"
dat = { fn : (open(os.path.join(p, fn, "temp")),
open(os.path.join(p, fn, "type")).read().strip(),
[])
for fn in os.listdir(p) if "thermal" in fn}
for i in range(50):
for fn in random.sample(list(dat.keys()), len(dat)):
fd = dat[fn][0].fileno()
os.lseek(fd, 0, os.SEEK_SET)
st = time.time()
temp = os.read(fd, 10)
dt = time.time() - st
dat[fn][2].append(dt)
for fn in sorted(dat.keys()):
dts = dat[fn][2]
print("%s (%s)\n\tmean: %f +- %f, max: %f" % (fn, dat[fn][1], np.mean(dts), np.std(dts), np.max(dts)))
results in
thermal_zone0 (acpitz)
mean: 0.001641 +- 0.000296, max: 0.002463
thermal_zone1 (pch_skylake)
mean: 0.000021 +- 0.000007, max: 0.000034
thermal_zone2 (iwlwifi_1)
mean: 0.146807 +- 0.088839, max: 0.205067
thermal_zone3 (x86_pkg_temp)
mean: 0.000037 +- 0.000024, max: 0.000127
which is very similar to what happens if the file is open and closed each time.
I'm not sure how to not display iwlwifi_1
without a new option without adding a lot of complexity, especially since sometimes, the read happens quickly. If there is a read timeout, what should the timeout be? Any block will still slow down the window manager, so I'd rather not add a timeout.
The ignore option is nice because I don't particularly care about the wifi card temperature since it doesn't trend with CPU usage as the CPU temp does. The ignore option also seems consistent with the ignore batteries option. Each system will potentially have a different thermal zone that may possibly be slow to read, so having to configure it seems okay. The only issue I see is that someone may enable the temperature in the graph option and not realize that their window manager is stuttering due to the slow read and thus never add the appropriate ignore option.
An automatic solution may be something where if any temp read ever takes more than some set time (dependent on refresh rate?), that thermal zone is excluded from the thermal zones that are read, with perhaps some way to get back into the "good" thermal zones club with good behavior.
Or some asynchronous read would be ideal, but not sure how well it can be added to the current architecture.
But at least the current pull request can only help the situation, at the cost of an extra option.
Here are some images of the CPU graph I use, with a width of 300 samples:
and the default width of 20:
Currently, it has 3 temperatures, because I'm ignoring
iwlwifi_1
. Incidentally, it doesn't look too bad with the default width of 20 samples, but only one temp is shown, and it's effectively chosen at random (according to the filesystem directory listing).
Very good! You can simply always ignore anything containing wifi
or starts with iw
.
We only care about CPU temperature. The first examination should check their type
and give priority to x86_pkg_temp
, or else ^pch
. The acpitz
is less interesting.
Always reading all those files is wasteful. One or two should suffice.
It's definitely possible to exclude based on wifi
or only include those matching x86_pkg_temp
or ^pch
, but I'm worried that would exclude non-Intel systems. A small number of other net drivers also seems to register a thermal zone and I imagine e.g. ARM-based systems won't have x86_pkg_temp
. The additional option would allow anyone to make it work for them regardless of their system without changing the code.
Some options are:
wifi
and potentially prioritize x86
or pch
for the temperature in the graph (and tooltip as well? Could show all in tooltip and only one in graph).x86_pkg_temp
, acpitz
, and pch_skylake
in my preferences
) and use all by default./sys/class/hwmon
appears to be the "new" way to do this and gives more temperatures, with labels, and other info. E.g. psutil
uses this: https://github.com/giampaolo/psutil/blob/master/psutil/_pslinux.py#L1316What do you think?
By "first examination" did you mean to run this check once on startup (and just read temp
later) or each time the temperature is needed? Reading the files should ideally be fast (well iwlwifi
excluded), since they are not in the filesystem. In any case, it can all probably be done using open once, save FD, and lseek
, instead of open each time.
We only care about the CPU type we can test now. We don't need or want any option for this ever. Most users never use this feature. The ones that do have only a default size graph. Take this order: x86_pkgtemp, ^pch, acpitz. Ignore any other module. Ignore other software like libsensors, because that creates dependencies and this is a minor feature.
An alternative is the hwmon coretemp module.
grep '' /sys/class/hwmon/*/*
That gives a temperature for the package, plus per core. There can be more than one package, but rarely so. In theory, there can be many cores. Maybe read the temperature at a slower rate.
Hardcoding these will be less maintainable than making it customizable via an option, imo. The CPU and net graphs in the taskbar is one of the main reasons I use IceWM and having it be customizable is important to me. But I realize most users don't use them in the way I do.
Any particular reason adding an option is not recommended?
In any case, the slowdown due to iwlwifi
affects all users using temp in graph regardless of its width, so I've pushed #635
Please let me know what you think.
Thanks for considering!
Don't change the string for the centigrade symbol, because then all the translations fail.
The temp buffer may not be initialized when it immediately returns.
You can customize to infinity. That is called programming. But it must also be maintainable forever. Optimize the ratio of (the number of users the time they use it) / ((the amount of code + documentation10) * the period to maintain it).
By avoiding new options we have to be smarter and relieve the user of all this boring configuration and documentation of teeny weeny details.
If TaskBarCPUSamples<=20 we use just one value. The one for x86_pkgtemp, if it exists, otherwise ^pch, otherwise acpitz. If TaskBarCPUSamples > 20 you can add more values, as long as they fit.
The thermal zones are now sorted by priority, but it's not the cleanest.
Do you mean the temp
in temperature
? It shouldn't be used if the function returns since it's a local var.
char temp[50];
int len = getAcpiTemp(temp, sizeof(temp));
Here getAcpiTemp is not guaranteed to init temp?
We only want to read defined tzs, since any other may be a device like iwlwifi and hence be much slower. Hence only accept strictly positive prios > 0. We don't need slow strstr, but just consider prefixes.
char tzPriority(const char* tz_type) {
const char* tzs[] = { "x86_pkg_temp", "pch_", "acpitz" };
for (int i = 0; i < 3; ++i) {
if (strncmp(tz_type, tzs[i], strlen(tzs[i])) == 0)
return 4 - i;
}
return 0;
}
Are these priorities read just once on startup? That would be justified.
What bothers me is that much more is read than is used. Could we read just what is actually going to be shown?
You may consider reading the temperature at a slower rate, like once every two seconds. Would be efficient.
A design where this is encapsulated could look like:
class YTemp : private YTimerListener {
bool handleTimer(YTimer *timer) {
flush_data();
return true;
}
char buf[24];
public:
YTemp() {
read_types();
timer->setTimer(2000, this, true);
}
char* operator[](int index) {
if (data[index] == invalid)
read_data(index);
snprintf(buf, 24, "%f %s", data[index], centigrade);
return buf;
}
}
Then data for each thermometer would be read at most once per two seconds, and only if it wasn't yet cached.
char temp[50]; int len = getAcpiTemp(temp, sizeof(temp));
Here getAcpiTemp is not guaranteed to init temp?
I see, okay to be safe (e.g. if drawChars accesses temp even if len is zero) I am now returning early if len is zero.
We only want to read defined tzs, since any other may be a device like iwlwifi and hence be much slower. Hence only accept strictly positive prios > 0. We don't need slow strstr, but just consider prefixes.
char tzPriority(const char* tz_type) { const char* tzs[] = { "x86_pkg_temp", "pch_", "acpitz" }; for (int i = 0; i < 3; ++i) { if (strncmp(tz_type, tzs[i], strlen(tzs[i])) == 0) return 4 - i; } return 0; }
I like the loop and strncmp implementation; using that now. But I don't like that idea of hardcoding a whitelist. Currently, the hardcoded entries are only for prioritization and the exclusion of iwlwifi. If we only allow hardcoded entries, users would not be able to read any other thermal zones without changing the code. But yes, if someone has a thermal zone that is slow to read that is not iwlwifi, they would see the slowdown. This is why I think having an option is the best idea and it's in line with other options controlling network interfaces and batteries' status displays in the taskbar.
Are these priorities read just once on startup? That would be justified.
What bothers me is that much more is read than is used. Could we read just what is actually going to be shown?
You may consider reading the temperature at a slower rate, like once every two seconds. Would be efficient.
Yes maybe right now it's not the most efficient. Every TaskBarCPUDelay
milliseconds, all thermal zone types are read, and any nonnegative priority thermal zones that fit into the CPU graph have their temps read.
I'm not sure about once on startup. Is it not possible for a thermal zone to appear after startup? I also don't like hardcoding an update rate without having an option to configure it. Currently, it's updated at the same rate the CPU graph is updated (default 500 ms).
If we only allow hardcoded entries, users would not be able to read any other thermal zones without changing the code.
I explained before why adding an option is worse. I checked the entries and they suffice. If another ever pops up then someone can notice us and we'll add it to a new release. We don't care about whole new packages of CPUs appearing. That is so exceptional and only to be expected on exotic servers. Otherwise users can still icewm -r
. This is a minor minor feature, so keep it low-profile. Move everything temperate to a separate class YTemp. That gives much more modular expressivity. Replace the special case of tzPriority ignored devices from -1 to 0. I.e., then we only consider non-zero priority devices and we get rid of the signed
char.
Commit 2a70667 implements my current thinking about this issue.
Some thermal zones' temperatures take a long time to read (e.g. 0.5 seconds). When ACPI temperatures are displayed in the CPU graph, these temperatures are polled repeatedly, and any blockage, blocks the entire window manager. This change adds an option, IgnoreThermalZones, which allows any such offending thermal zones to be skipped and not slow down the window manager.
Also, I changed the temperature display in the CPU graph to display all (non-ignored) ACPI temps. Let me know if you think this should be adaptive based on the width of the CPU monitor (with a low-width CPU monitor, they won't fit), or configurable in some way.