YunoHost / package_check

Shell script which checks package actions: install, remove, upgrade, backup, restore…
GNU General Public License v3.0

[enh] Measure and display app's base install average resource consumption (HDD+RAM+CPU) #139

Open oleole39 opened 1 year ago

oleole39 commented 1 year ago

Hello,

Idea

I would find it quite useful to see in the web admin an indication of the resources required to install a given app.

To do so, either appropriate values could be written manually by each app packager, or more conveniently it could be checked automatically within a testing script such as package_check.sh.

Use case

This comes from an uncomfortable experience I had installing an app which used much more RAM than I had expected and took my server to its limit, making it unresponsive.

Approach

I am unsure what the best approach would be. Although I have not dug much into the package_check script, I am under the impression it could perform the measurement by checking everything that refers to $user_id, the dedicated system user created for the app during its installation:

  1. Create an install of the app to measure. [edit] Ideally, it would also measure the resources (RAM in particular) required during the installation itself, for installation processes that involve compiling. [/edit]
  2. Measure the disk space used by the app and the added components:
    • find / -user $user_id -type d -exec du -scm {} + -prune | grep -Po '\d*(?=\Dtotal)' (outputs size in megabytes according to man du). This command would need further testing to make sure the result is accurate.
  3. Measure base (i.e. post-install, without any activity on the app) CPU & RAM consumption; maybe those measurements should be done a few times and averaged for better accuracy.
    a. RAM:
      • ps aux | egrep ^$user_id | awk 'BEGIN{total=0}; {total += $4};END{print total}' (outputs % of RAM used), to be multiplied by the total amount of RAM in megabytes, as given by free -m | egrep ^Mem: | awk '{ print $2 }', and divided by 100
    b. CPU:
      • ps aux | egrep ^$user_id | awk 'BEGIN{total=0}; {total += $3};END{print total}' (outputs % of CPU used) - not sure how this could be meaningful for our purpose though.
  4. Save the values as minimum required resources within the manifest. [edit] Those values would then be displayed on each app's description page in the catalog, and when a user launches an install, a quick check would be performed before starting the install script, comparing those values with the ones available on the target server. In case of a mismatch, a warning pop-up would be displayed, waiting for the user to confirm whether to continue or stop there. [/edit]
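
The RAM part of step 3 can be sketched as two small helpers. This is only an illustration under stated assumptions: the function names `sum_mem_percent` and `percent_to_mb` and the app user `myapp` are hypothetical, and matching on the first field of `ps aux` assumes the user name is short enough not to be truncated by `ps`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of package_check.

# Sum the %MEM column (4th field of `ps aux`) for lines whose first
# field is exactly the given user. Reads `ps aux` output on stdin.
sum_mem_percent() {
    local user="$1"
    awk -v u="$user" '$1 == u { total += $4 } END { printf "%.1f\n", total + 0 }'
}

# Convert a percentage of total RAM into megabytes: percent * total / 100.
percent_to_mb() {
    local percent="$1" total_mb="$2"
    awk -v p="$percent" -v t="$total_mb" 'BEGIN { printf "%.0f\n", p * t / 100 }'
}

# Usage on a live system ("myapp" is a placeholder user name):
#   pct=$(ps aux | sum_mem_percent myapp)
#   total=$(free -m | awk '/^Mem:/ { print $2 }')
#   percent_to_mb "$pct" "$total"
```

Comparing the first field exactly, rather than with `egrep ^$user_id`, avoids also counting users whose names merely start with the same characters.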

What do you think? Given the current workflow at YNH, would package_check be the right script to modify?

alexAubin commented 1 year ago

In fact manifest/packaging version 2 includes such an info, and we merged yesterday a change in package check to add metrics displayed after install/upgrade/restore : https://github.com/YunoHost/package_check/pull/136 though sometimes it reports a negative RAM usage so there might be some things to improve :sweat_smile: But having such metrics was the missing piece to easily add a non-dummy info in the manifest for v2 apps

YunoHost will complain if there's insufficient RAM before installing an app versus what's declared in the manifest
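
As a rough illustration of such a pre-install check (this is not YunoHost's actual implementation; the function names and the 512 MB figure are hypothetical, and reading `/proc/meminfo` assumes a Linux host):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "enough RAM?" check; not YunoHost code.

# Memory currently available, in MB, as reported by the Linux kernel.
available_ram_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# Succeed (exit status 0) when available_mb covers required_mb.
has_enough_ram() {
    local required_mb="$1" available_mb="$2"
    [ "$available_mb" -ge "$required_mb" ]
}

# Usage: 512 stands in for a value that would be read from the manifest.
#   has_enough_ram 512 "$(available_ram_mb)" || echo "Warning: not enough RAM" >&2
```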

oleole39 commented 1 year ago

Exactly the UX behavior proposal I was adding to my initial message just while you were answering (cf. [edit] tag)... What a fast move from YNH! 😉

I see the merged changes focus on RAM. Could we imagine similar metrics for HDD & CPU ?

alexAubin commented 1 year ago

CPU seems kinda hard imho because it's really not a well-defined quantity ... It depends too much on cores and clock frequency and probably other stuff ...

HDD is okay and yeah a metric would be nice. There's already a key for this in the manifest.

RAM really is the biggest deal because there's typically a peak RAM usage during build that makes the install crash, and that's really the main focus point to have with this topic. And then "usage RAM" is nice to have, but like HDD is not super well defined because it depends on the load on the server, etc

oleole39 commented 1 year ago

CPU seems kinda hard imho because it's really not a well-defined quantity ... It depends too much on cores and clock frequency and probably other stuff ...

Indeed. I found some documentation on the topic, but I need to read further to see if I can get something simple enough out of it.

And then "usage RAM" is nice to have, but like HDD is not super well defined because it depends on the load on the server, etc

We could have base RAM usage & HDD space, i.e. the base use of resources post-install without any users. For a server with few resources, this already gives a good hint on whether you want to install a given app or not, for example if it would account for more than half your total resources without any user.

oleole39 commented 1 year ago

Hello @alexAubin,

HDD is okay and yeah a metric would be nice.

I will work on a PR on this topic based on the implementation of the RAM metrics you pointed out to me.

CPU seems kinda hard imho because it's really not a well-defined quantity ... It depends too much on cores and clock frequency and probably other stuff.

So I read more on the topic. Comparing CPUs is indeed a complex task, as it depends, as far as I understood, mainly on the number of cycles per clock tick (which the number of cores impacts), the clock frequency, and CPU optimization schemes in the processing of instructions (several instructions may be processed within the same cycle).
However, I would like to propose a metric which gives a normalized (all cores taken together) CPU load for a given system user (the one owning the tested app), averaged over a given period (20 sec for instance, or more depending on how long the script can reasonably run). This could allow defining a base (i.e. at fresh install, without users) CPU load profile (high, medium, low) for a given app, relative to the CPU of the test machine. As such it would be a very rough indicator, but I believe it could still help YNH instance admins make choices. I can propose a PR for this as well, so that the data is included in the package_check report in the same spirit as the RAM-related metrics.
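
A minimal sketch of that normalized metric, assuming a procps-ng `ps` that supports the `cputimes` output field (cumulative CPU seconds). The function names and the user `myapp` are hypothetical, and processes that exit between the two samples are simply lost from the count:

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of package_check.

# Total CPU seconds consumed so far by all processes of a user.
# Assumes procps-ng `ps` with the `cputimes` field.
user_cpu_seconds() {
    ps -u "$1" -o cputimes= | awk '{ total += $1 } END { print total + 0 }'
}

# CPU load in percent of the whole machine over an interval:
# 100 * cpu_seconds_used / (interval_seconds * number_of_cores)
normalized_cpu_percent() {
    local delta_seconds="$1" interval_seconds="$2" cores="$3"
    awk -v d="$delta_seconds" -v i="$interval_seconds" -v c="$cores" \
        'BEGIN { printf "%.1f\n", 100 * d / (i * c) }'
}

# Usage over a 20-second window ("myapp" is a placeholder user name):
#   before=$(user_cpu_seconds myapp); sleep 20; after=$(user_cpu_seconds myapp)
#   normalized_cpu_percent "$((after - before))" 20 "$(nproc)"
```

Sampling cumulative CPU time twice is used here instead of the `%CPU` column of `ps aux`, because that column is averaged over each process's whole lifetime rather than over the chosen window.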

To go further, one could even imagine that YNH core would extract the name of the local machine's CPU model (or perform a small CPU benchmark; many open source tools already exist for this) and compare it to a (local?) benchmarking database (several open CPU benchmark databases exist online), in order to weight the app's CPU score/profile according to the gap between the CPU model of the test machine on which the score/profile was initially defined and the local machine's CPU model.

Need help to be sure to understand correctly package_check

It happens to be complicated for me to install LXC and test package_check - if possible, would you mind letting me know whether my understanding of that tool is correct ?

alexAubin commented 1 year ago

If you want to have fun with the CPU thing, be my guest, but I reiterate that it sounds like a very ill-defined notion, and even if it wasn't, it looks like a lot of effort for something which is gonna be virtually useless for the average yunohost admin and doesn't solve any actual issue (cf the discussion about RAM which is causing install crashes and therefore is quite important). I don't know what you hope to measure, considering that an app which just got installed should basically use 0 CPU because it doesn't receive any request ... Or maybe measuring the CPU usage of the app build, but then good luck differentiating the CPU usage of the processes related to the app build from the ones related to the basic YunoHost system stack @_@

oleole39 commented 1 year ago

If you want to have fun with the CPU thing, be my guest, but I reiterate that it sounds like a very ill-defined notion

Not my intention to do it for fun :) The idea is that admins (in particular those of low-cost servers with a small hardware config) could put together their set of self-hosted apps directly from the catalog, without necessarily having to try various combinations of apps just to find out what eventually fits on the server. To that end, a full app profile would in my mind ideally cover the needs of those apps in terms of CPU, RAM & HDD space. RAM & HDD are quite easy to measure, but I share your view about CPU metrics being an ill-defined notion. However, I was wondering whether a rough & relative indicator could help.

I don't know what you hope to measure

I indeed thought about measuring both

an app which just got installed should basically use 0 CPU because it doesn't receive any request

Good point, which was actually missing from my reasoning... That probably means that in order to have consistent values a test scenario would need to be created for each app, but yes that makes quite a lot of study and implementation work which I am not ready to do, at least for now.

differentiating the CPU usage of the processes related to the app build from the ones related to the basic YunoHost system stack

Well, I thought about using the same approach as the one currently used for RAM: measuring before install (that should give the CPU load of the basic YNH system?), measuring peak and average values during install, and calculating the difference with the initial state to get a rough CPU load. Would that sound relevant to you? If not, I don't mind dropping the idea of CPU metrics.
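
The baseline / peak / average idea can be sketched independently of what is being sampled. The helper below is hypothetical (it is not how package_check computes its metrics) and simply reduces a series of samples, one per line, to a peak and an average above a baseline:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of package_check.

# Read one numeric sample per line on stdin and print
# "<peak - baseline> <average - baseline>" (truncated to integers).
summarize_samples() {
    local baseline="$1"
    awk -v base="$baseline" '
        NR == 1 || $1 > peak { peak = $1 }   # track the highest sample
        { sum += $1 }                        # accumulate for the average
        END {
            if (NR == 0) { print "0 0"; exit }
            printf "%d %d\n", peak - base, sum / NR - base
        }'
}

# Usage: samples could come from a loop run in the background during install, e.g.
#   while sleep 1; do free -m | awk "/^Mem:/ { print \$3 }"; done > samples.txt &
#   ... run the install, then stop the loop ...
#   summarize_samples "$baseline_mb" < samples.txt
```

Note that a delta against a single baseline can come out negative whenever the samples drop below it, for instance if unrelated memory is freed after the baseline was taken.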

Regarding HDD space metrics though, this is quite easy to implement, but I have one question: if the app currently being tested is, let's say, Nextcloud, would the $TEST_USER variable in package_check be nextcloud, so that this command works as I expect: find / -user $TEST_USER -exec du -scm {} + -prune | grep -Po '\d*(?=\Dtotal)'? Would you be able to confirm that point?
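
For comparison, here is a variant that sums file sizes directly instead of parsing the `total` line of `du`. It is only a sketch: the function name `owned_bytes` is hypothetical, it relies on GNU find's `-printf`, and it reports apparent file sizes in bytes rather than allocated disk blocks:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of package_check.

# Print the total apparent size, in bytes, of regular files owned by
# a user (name or numeric uid) under a directory, staying on one filesystem.
owned_bytes() {
    local user="$1" root="$2"
    find "$root" -xdev -user "$user" -type f -printf '%s\n' 2>/dev/null |
        awk '{ total += $1 } END { printf "%d\n", total + 0 }'
}

# Usage ("nextcloud" is a placeholder for whatever user the app runs as):
#   owned_bytes nextcloud /
```

Matching on `-type f` also counts files the user owns inside directories owned by someone else, which `-type d` would miss, and a single `awk` sum avoids getting several `total` lines when `find` has to split the file list across multiple `du` invocations.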