Ricks-Lab / gpu-utils

A set of utilities for monitoring and customizing GPU performance
GNU General Public License v3.0
suggestion: provide option to report averages #68

Closed csecht closed 2 years ago

csecht commented 4 years ago

When optimizing certain GPU run parameters, it would be handy to have an option for amdgpu-monitor or amdgpu-plot, or both, to report averages of GPU performance variables like load, power, and clock speed. I suppose that reporting a moving average over a user-specified time window would be most useful. For example, assuming the default reporting interval of 3 sec, provide an option to monitor or graph moving averages of those values over 1, 3, or 10 minute windows.
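To make the request concrete, here is a minimal sketch of a moving average over a time window, assuming one reading every 3 sec. The class name `WindowedAverage` and its parameters are hypothetical and not part of gpu-utils; the sketch keeps the last N readings in a `collections.deque`.

```python
from collections import deque


class WindowedAverage:
    """Simple moving average over the most recent `window_sec` seconds of readings.

    Hypothetical helper for illustration; not part of gpu-utils.
    """

    def __init__(self, window_sec: float, interval_sec: float = 3.0):
        # Number of readings that fit in the window, at least one.
        self.size = max(1, round(window_sec / interval_sec))
        self.values = deque(maxlen=self.size)
        self.total = 0.0

    def add(self, value: float) -> float:
        """Record a reading and return the average of the readings in the window."""
        if len(self.values) == self.size:
            # The deque is full, so the append below drops the oldest reading.
            self.total -= self.values[0]
        self.values.append(value)
        self.total += value
        return self.total / len(self.values)


# A 1 minute window at 3 sec per reading holds 20 readings.
load_avg = WindowedAverage(window_sec=60)
for load in (90, 95, 100):
    current = load_avg.add(load)
print(current)  # average of the three readings seen so far
```

With a 3 sec interval, the 1, 3, and 10 minute windows hold 20, 60, and 200 readings per tracked variable.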

Ricks-Lab commented 4 years ago

Perhaps an option to specify a moving average window size on the command line. If not specified, moving averages would not be displayed; otherwise the GUI version of gpu-mon would display value (moving avg). The biggest complexity is that the utility would need to keep a history of past values, which is overhead that it currently doesn't have. gpu-plot does store history for plots, but it would be difficult to integrate a moving average into the plot GUI.
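A rough sketch of how such an option could be parsed and shown, using Python's standard `argparse`. The flag name `--avg-window` and the helper `format_reading` are made up for illustration; they are not existing gpu-mon options or functions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="moving-average option sketch")
    # Hypothetical flag, not an existing gpu-mon option.
    parser.add_argument("--avg-window", type=int, default=None, metavar="N",
                        help="readings per moving average (no average when omitted)")
    return parser


def format_reading(value, average=None):
    """Return 'value' alone, or 'value (moving avg)' when an average is available."""
    if average is None:
        return f"{value:.1f}"
    return f"{value:.1f} ({average:.1f})"


# Pass the arguments explicitly so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--avg-window", "20"])
print(args.avg_window)
print(format_reading(95.0, 92.34))
```

Leaving the option unset keeps the current behavior, so the history overhead is only paid by users who ask for averages.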

csecht commented 4 years ago

A command line option for a moving average would be handy, but it’s frosting on the cake, and may not be worth the extra bloat. Maybe wait to see whether other users voice a desire for that option.

Ricks-Lab commented 4 years ago

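Perhaps an approximation of a moving average would work. Here is a discussion on StackOverflow: https://stackoverflow.com/questions/12636613/how-to-calculate-moving-average-without-keeping-the-count-and-data-total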

csecht commented 4 years ago

Interesting. If I understand it, from Pixelstix’s variation of Abdullah’s method in the StackOverflow link,

new_average = (old_average * (n-1) + new_value) / n

you would only need to store n values initially, say n=5, to get an initial average, and from then on calculate a new_average (to display in gpu-mon) from each new value, thereafter having to store only the old_average. So, to monitor an approximated moving average of 5 readings, the equation becomes,

new_average = (old_average * 4 + new_value) / 5

Seems simple. What am I missing?
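A minimal sketch of that update, with made-up readings and a hypothetical function name `update_average`. One thing worth noting: the formula can be rewritten as old_average + (new_value - old_average) / n, which is an exponential moving average with weight 1/n. Old readings never drop out of it completely; they only fade, so a spike lingers longer than it would in a true n-reading window.

```python
def update_average(old_average: float, new_value: float, n: int = 5) -> float:
    """One step of new_average = (old_average * (n-1) + new_value) / n."""
    return (old_average * (n - 1) + new_value) / n


# Made-up load readings; the sixth one is a spike.
readings = [100, 102, 98, 101, 99, 150, 100, 100]
n = 5

# Seed with the plain mean of the first n readings, then store only the average.
average = sum(readings[:n]) / n
history = [average]
for value in readings[n:]:
    average = update_average(average, value, n)
    history.append(average)

print(history)
```

After the seed, only `average` has to be kept between monitor cycles; the first n readings can be discarded.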

As I laid it out, there would be an initial time lag before an average is displayed in the monitor; after that it is updated every monitor cycle. During the lag, the average field could display a “calc…” text, or some such. A more complicated option may be to have an “average” value displayed from the get-go by having n start at 1 and increment by 1 each cycle until 5 is reached. Having n=5 is just a guess for a useful window size. I suppose different n’s would have to be tested to see what is most useful and workable for different GPU parameters, but I can’t imagine n would need to be more than 10. The idea isn’t so much mathematical accuracy as it is to give users a clearer idea of how some variable GPU parameter changes in response to different PAC settings or boinc-client settings, or task types, etc.
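Both start-up behaviors can be sketched in one small class. The name `ApproxAverage` and its attributes are hypothetical. Here n ramps from 1 up to the target, so the value is a plain mean of the readings so far during warm-up and the approximated moving average afterwards. The `warmed_up` flag lets a caller show a placeholder instead, if that is preferred.

```python
class ApproxAverage:
    """Approximate moving average whose n ramps from 1 up to `n` (illustrative only)."""

    def __init__(self, n: int = 5):
        self.n = n
        self.count = 0        # effective n, capped at self.n
        self.average = None   # no reading seen yet

    @property
    def warmed_up(self) -> bool:
        return self.count >= self.n

    def add(self, value: float) -> float:
        self.count = min(self.count + 1, self.n)
        if self.average is None:
            self.average = float(value)
        else:
            # Same formula as above, with the current effective n.
            self.average = (self.average * (self.count - 1) + value) / self.count
        return self.average


tracker = ApproxAverage(n=3)
shown = []
for value in (10, 20, 30, 40):
    avg = tracker.add(value)
    # Alternative display: a placeholder until n readings have been seen.
    shown.append(f"{avg:.1f}" if tracker.warmed_up else "calc...")

print(shown)
```

Either way only two numbers are stored per tracked variable, the running average and the count.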
