JPEWdev / icecream-sundae

Commandline Monitor for Icecream
GNU General Public License v2.0

Can icecream-sundae be used to automate collection of performance statistics? #9

Open timblaktu opened 4 years ago

timblaktu commented 4 years ago

I'm wondering whether it's possible to run this in a "one-shot" mode that could be used to reset and collect stats from the icecc build cluster, to aid in performance tuning.

Perhaps something like using the "stats port" in distcc?
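For context, if I remember distcc correctly, its stats server is enabled with `distccd --stats` and listens on TCP port 3633 by default, answering with plain-text counters. A minimal sketch of polling it (the hostname `buildhost` is a placeholder, and the minimal-HTTP behavior is my assumption from distcc's docs):

```python
import socket

# Sketch: poll distccd's stats server (enabled with "distccd --stats";
# TCP port 3633 by default, answering with plain-text counters).
# "buildhost" is a placeholder for one of your build nodes.
def read_distcc_stats(host="buildhost", port=3633):
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"GET / HTTP/1.0\r\n\r\n")  # it speaks minimal HTTP
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Something analogous for icecc is what I'm after.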

JPEWdev commented 4 years ago

Quite possibly. It depends on what needs to be collected and how it should be presented to the user. I don't want to sidetrack icecream-sundae (the graphical program) too much (again, depending on what should be displayed), but we could write a second program in this project that leverages the same code to present the data differently if needed.

Did you have an idea or example of what might be useful?

timblaktu commented 4 years ago

Independent of any knowledge of how icecream-sundae interfaces with iceccd/icecc-scheduler, a coworker and I came up with a table format for the data snapshots we'd like to see after each build tuning scenario/experiment:

| hostname | ICECC_MAX_JOBS | ICECC_ALLOW_REMOTE | ICECC_NETNAME | ICECC_NICE_LEVEL | IS_SCHEDULER | ACTUAL_JOBS | ACTUAL_AVG_JOB_TIME | ACTUAL_AVG_JOB_BYTES | ACTUAL_LOAD_AVG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| iceccd1 | | | | | | | | | |
| iceccd2 | | | | | | | | | |
| iceccsched1 | | | | | | | | | |
| dev1 | | | | | | | | | |
| dev2 | | | | | | | | | |

This is just a quick idea/sample, but the "ICECC" columns are of course static daemon config params which set the stage for the "scenario" and are easy to read from each node's /etc/icecc/icecc.conf. The "ACTUAL" columns would be read from each icecc daemon/scheduler node, at the end of a build experiment. Perhaps the assumption would be that the user restarted the scheduler (and/or the daemons, if necessary) before the experiment to reset all counters. Or perhaps the tool would simply snapshot before and after and present the diff.
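For the static columns, something like this could scrape each node's config; a rough sketch, assuming the conf file uses shell-style `KEY="value"` lines (as the Debian packaging does), with `read_icecc_conf` just a name I made up:

```python
import shlex

# The "scenario" columns from the table above: static ICECC_* daemon
# config params read from icecc.conf. Assumes shell-style KEY="value"
# lines; which keys are actually present varies by distro packaging.
WANTED = ("ICECC_MAX_JOBS", "ICECC_ALLOW_REMOTE",
          "ICECC_NETNAME", "ICECC_NICE_LEVEL")

def read_icecc_conf(path="/etc/icecc/icecc.conf"):
    params = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            key = key.strip()
            if key in WANTED and value.strip():
                params[key] = shlex.split(value)[0]  # drops the quotes
    return params
    # hypothetical usage: run this over ssh on each node and merge the
    # results into the per-host rows of the table
```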

Thanks for your consideration, I think such a tool could add significant value to the community of builders using icecc.

JPEWdev commented 4 years ago

I saw your post on the icecream mailing list. One of the issues with what you described is that there is no real "snapshot" API between the monitors and the scheduler (at least that I'm aware of). When you start a monitor, you only receive the new events that occur after the connection is established; there isn't a way to get the current state of the cluster. The scheduler does know this information, since it is capable of reporting an instantaneous snapshot over its telnet interface (which I recommend you look into); it just can't report it to the monitor clients.
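If it helps, you can poke at that interface without an interactive telnet session; a sketch assuming the scheduler's usual telnet port (8766) and the `listcs` command, both of which you should verify against your scheduler version:

```python
import socket

# Query the icecc-scheduler's telnet interface. Port 8766 and the
# "listcs" command (list compile servers) are assumptions here --
# check your scheduler's docs/source for what it actually accepts.
def scheduler_cmd(host, cmd="listcs", port=8766):
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((cmd + "\n").encode())
        sock.settimeout(2)
        chunks = []
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # the session stays open; stop once output goes quiet
    return b"".join(chunks).decode(errors="replace")
```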

However, the lack of a snapshot API would be fine if you want to run some sort of measurement monitor while you perform a test. In that case, you only care about the new events anyway.
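Shaped something like this, where `on_job_done` and its fields are stand-ins for whatever the real monitor messages carry:

```python
from collections import defaultdict

# Accumulate per-host stats from a live event stream: start the
# accumulator, run the build experiment, then report. This feeds the
# ACTUAL_JOBS / ACTUAL_AVG_JOB_TIME / ACTUAL_AVG_JOB_BYTES columns.
class Accumulator:
    def __init__(self):
        self.jobs = defaultdict(int)
        self.total_time = defaultdict(float)
        self.total_bytes = defaultdict(int)

    def on_job_done(self, host, seconds, out_bytes):
        # stand-in for a real job-done monitor event
        self.jobs[host] += 1
        self.total_time[host] += seconds
        self.total_bytes[host] += out_bytes

    def report(self):
        for host in sorted(self.jobs):
            n = self.jobs[host]
            print(f"{host}: {n} jobs, "
                  f"avg {self.total_time[host] / n:.2f}s, "
                  f"avg {self.total_bytes[host] / n:.0f} bytes")
```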

The columns you are proposing seem reasonable. Most of them you can already get; the few missing ones are the static daemon config params, as you pointed out. I think you could add those pretty easily, though. IIRC the daemons report a generic set of key-value string pairs for a lot of their attributes (e.g. load average, memory usage, etc.), so making the daemons report part of their static config the same way should be trivial.
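To illustrate what I mean, the monitor side ends up with blobs of newline-separated "Key:Value" text per daemon, so parsing is one loop; the field names in the example are illustrative, not a fixed schema:

```python
# Daemon stat reports are essentially newline-separated "Key:Value"
# strings; field names like Name/MaxJobs/LoadAvg1 below are examples
# of what shows up, not a fixed schema.
def parse_daemon_stats(blob):
    stats = {}
    for line in blob.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats

print(parse_daemon_stats("Name:iceccd1\nMaxJobs:8\nLoadAvg1:512"))
# {'Name': 'iceccd1', 'MaxJobs': '8', 'LoadAvg1': '512'}
```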

Finally, you may be interested in a talk I gave at the 2019 Embedded Linux Conference about using icecream to speed up builds with the Yocto Project: https://youtu.be/VpK27pI64jQ

timblaktu commented 4 years ago

Joshua, thanks for your feedback, and thanks for icecream-sundae. I haven't yet had time to reverse-engineer from the source code the meanings of all the items the telnet interface commands return, but it certainly looks like the thing I'm looking for.

P.S. Your presentation was very helpful, thanks for sharing. I'd love to see more things like this in a references section of the main icecream README. (I first listened to it while running with my Garmin watch. ;-)) I'll connect with you later via email - we seem to be working with a lot of the same "stacks" and tools (C++ apps on Yocto-based embedded Linux on custom hardware), and I think there's a mutual opportunity for sharing and learning.