theAgingApprentice / HexaFloorRide

Arduino based six legged robot project
MIT License
0 stars 0 forks source link

create some performance metric measurements, notably CPU usage #174

Closed nerdoug closed 1 year ago

nerdoug commented 1 year ago

Partly to assist with investigating the jagged leg movements, we need some measurements of how much CPU our various software tasks are taking, and what our total CPU usage is under various circumstances

nerdoug commented 1 year ago

Some thoughts:

There are a couple of reasons for splitting up long tasks dispatched from loop():

Should we be measuring the CPU that network stuff runs on, or moving some of our tasks to it?

nerdoug commented 1 year ago

Initial CPU usage data is surprising. On an idle system, the percentage CPU used by the routines called in loop() is::

M) setup{7}> End of setup ############################################
  Flow: 0.00  Web: 0.00  Oled: 0.00  Mqtt: 0.00  1Sec: 0.00
  Flow: 0.00  Web: 98.44  Oled: 0.10  Mqtt: 0.50  1Sec: 0.05
  Flow: 0.00  Web: 98.46  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.45  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.45  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.46  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.45  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.45  Oled: 0.10  Mqtt: 0.50  1Sec: 0.04
  Flow: 0.00  Web: 98.49  Oled: 0.10  Mqtt: 0.47  1Sec: 0.04
  Flow: 0.00  Web: 98.55  Oled: 0.10  Mqtt: 0.40  1Sec: 0.05
  Flow: 0.00  Web: 98.56  Oled: 0.10  Mqtt: 0.40  1Sec: 0.04
  Flow: 0.00  Web: 98.56  Oled: 0.10  Mqtt: 0.40  1Sec: 0.04

Some thoughts:

Possible actions to address CPU utilization issues:

nerdoug commented 1 year ago

I changed the way monitorWebServer is called so it's based on a timer, and only done every 200 msec. The CPU stats now look like this:

M) setup{7}> End of setup #############################################################################
  Flow: 0.00  Web: 0.00  Oled: 0.00  Mqtt: 0.00  1Sec: 0.00
  Flow: 0.00  Web: 0.56  Oled: 5.28  Mqtt: 21.25  1Sec: 0.05
  Flow: 0.00  Web: 0.56  Oled: 5.27  Mqtt: 21.24  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.29  Mqtt: 21.26  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.28  Mqtt: 21.26  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.26  Mqtt: 21.24  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.26  Mqtt: 21.25  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.29  Mqtt: 21.23  1Sec: 0.04
  Flow: 0.00  Web: 0.56  Oled: 5.27  Mqtt: 21.23  1Sec: 0.04

As expected, the monitorWebServer routine is now only occupying a small portion of the CPU. The other routines that are checked at the spin lock rate have gathered more of the CPU that's now available, but the timer driver oneSec routine stayed stable. So I should make the same fix to the other 2 routines dispatch mechanisms, and make the CPU usage output controllable via the trace mechanism. Then check to see if leg movements are still jagged.

The CPU stats should help us figure out if the delay(10) in checkOledButtons() actually stops our task dispatching for 10 msec. If that's the case, we can split the routine at the delay, and have the second part dispached via a timer.

nerdoug commented 1 year ago

This got us down to expected CPU consumption on idle robot as these percentage usage numbers show:

M) setup{7}> End of setup ####################################################
M) oneSec{14}>   CPU Total= 0.00   Flow: 0.00  Web: 0.00  Oled: 0.00  Mqtt: 0.00  1Sec: 0.00
M) oneSec{14}>   CPU Total= 0.64   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.09
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.63   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.63   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.09

Tested by doing a move leg to angles command: MLA, 1, 0, 0, 0 which showed this usage:

M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
  Hip angle, PWM= 0.00 295
 Knee angle, PWM= 0.00 290
Ankle angle, PWM= 0.00 300
M) oneSec{14}>   CPU Total= 0.80   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.18  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.64   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.10
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08

there's a bit of MQTT, but no flow, which is strange. We'll see what pushups does to CPU:

M) setup{7}> End of setup #####################################################
M) oneSec{14}>   CPU Total= 0.00   Flow: 0.00  Web: 0.00  Oled: 0.00  Mqtt: 0.00  1Sec: 0.00
M) oneSec{14}>   CPU Total= 0.74   Flow: 0.00  Web: 0.64  Oled: 0.00  Mqtt: 0.00  1Sec: 0.09
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.85   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.23  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.97   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.34  1Sec: 0.09
M) oneSec{14}>   CPU Total= 0.98   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.35  1Sec: 0.09
M) oneSec{14}>   CPU Total= 0.96   Flow: 0.00  Web: 0.54  Oled: 0.00  Mqtt: 0.33  1Sec: 0.09
L) do_flow{13}>  start of flow row # 0  
L) do_flow{13}> start of flow row #1  
M) oneSec{14}>   CPU Total= 37.23   Flow: 36.63  Web: 0.37  Oled: 0.00  Mqtt: 0.14  1Sec: 0.09
L) do_flow{13}>  start of flow row #  2
L) do_flow{13}>  start of flow row #  3
M) oneSec{14}>   CPU Total= 14.58   Flow: 13.93  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.10
L) do_flow{13}>  start of flow row #  4
L) do_flow{13}>  start of flow row #  5
M) oneSec{14}>   CPU Total= 15.31   Flow: 14.67  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  6
M) oneSec{14}>   CPU Total= 14.97   Flow: 14.34  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  7
M) oneSec{14}>   CPU Total= 17.49   Flow: 16.86  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  8
M) oneSec{14}>   CPU Total= 17.52   Flow: 16.89  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  9
L) do_flow{13}>  start of flow row #  10
M) oneSec{14}>   CPU Total= 17.54   Flow: 16.91  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  11
M) oneSec{14}>   CPU Total= 17.19   Flow: 16.56  Web: 0.53  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  12
M) oneSec{14}>   CPU Total= 15.28   Flow: 14.65  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  13
L) do_flow{13}>  start of flow row #  14
M) oneSec{14}>   CPU Total= 15.00   Flow: 14.36  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  15
L) do_flow{13}>  start of flow row #  16
M) oneSec{14}>   CPU Total= 14.98   Flow: 14.35  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  17
L) do_flow{13}>  start of flow row #  18
M) oneSec{14}>   CPU Total= 14.84   Flow: 14.21  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  start of flow row #  19
M) oneSec{14}>   CPU Total= 15.90   Flow: 15.27  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
L) do_flow{13}>  end of multi row flow processing  
M) oneSec{14}>   CPU Total= 6.10   Flow: 5.47  Web: 0.54  Oled: 0.00  Mqtt: 0.01  1Sec: 0.09
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08
M) oneSec{14}>   CPU Total= 0.62   Flow: 0.00  Web: 0.53  Oled: 0.00  Mqtt: 0.00  1Sec: 0.08

Leg performance still seems jagged to me. Will have to analyse above numbers. May also add to CPU metrics to monitor how much timer driver task dispatches are delayed. I have a design for that on paper and it shouldn't be too hard to ad if we need to.

Analysis notes: The 36.63 % CPU for flow recorded after it did row 1 is a red herring. There's a 340 msec delay in that code for the worst case servo movement to the initial position, which would be included in the elapsed time, as 34% of the CPU.

Leg movements still seem jagged to me, although maybe a bit better. Had console I/O minimized for this test, and definitely no I/O at the per frame level. Guess I'll have to ad the code that checks for delayed timer based dispatches.

Another viewpoint: we're using about 15% of the CPU for flow processing while the legs are moving. If we're doing 20 frames per second, thats 15/20 == 0.75 % of the CPU per frame. 1000 msec per second * 0.75% = 7.5 msec of elapsed time, assuming we're taking all of the CPU. If we're doing 20 frames per second, we have 50 msec per frame, so using up 7.5% on calculations shouldn't be a problem.

I'll shift the jagged performance discussion to the appropriate issue, and continue with the code to track dispatching delays for timer driven tasks.

nerdoug commented 1 year ago

I'm about to upload the branch that implements CPU performance measurements, but I wanted to note some related things that snuck into the branch:

Related work remaining to be done includes:

I'm going to create an issue where I'll record topics I want to discuss / propose / explain so I don't forget about them.