vedderb / bldc

The VESC motor control firmware
2.18k stars 1.35k forks source link

[LispBM] Shutdown event issues #686

Closed dcodeIO closed 9 months ago

dcodeIO commented 10 months ago

As a follow-up to #666, I found that when using shutdown-hold, depending on how fast the power switch is cycled, shutdown respectively shutdown cancellation is not working reliably.

For example, when delaying shutdown with shutdown-hold true, then switching the power button on again while still performing the desired shutdown operation that would end with a shutdown-hold false, the VESCs will stay on as expected but (iiuc) the gate remains disabled, so it is no longer possible to drive the motors. It also appears that once in this state, the power switch is no longer functioning, so the VESCs will remain on (i.e. cannot be turned off) regardless of power switch state unless manually removed from power.

To illustrate:

VESC-shutdown drawio

Perhaps the power switch is no longer sampled, and gate reactivation logic does not trigger because the shutdown operation now takes longer? Mostly guessing at this point, though :)

dcodeIO commented 10 months ago

Perhaps, one way to tackle issues of this kind could be to generalize programmatic shutdown a little more. That is, instead of "holding" shutdown, there could be (shutdown-abort) to cancel shutdown entirely when event-shutdown is received, (shutdown) to shut down programmatically once ready to do so, and (get-powerswitch) to get and monitor the power switch state. Then, in addition to powering on as usual, the application can take control of shutdown behavior whereas the firmware only provides the necessary APIs instead of implementing a specific behavior?

Edit: Figured that in our use case this could be even simpler: Use ALWAYS_ON plus provide programmatic (shutdown) and (get-powerswitch), so holding or cancelling the shutdown event is not necessary since the VESC won't shutdown automatically anyway. That is, if ALWAYS_ON also uses the power switch to turn on.

vedderb commented 9 months ago

The intention is to enable the gates and abort shutdown if it does not succeed after two seconds (as can happen of the button is toggled in-between). Maybe there is something wrong in the logic though. Can you describe the exact steps to reproduce the problem together with a small example lbm program?

dcodeIO commented 9 months ago

I don't have a functioning setup at hand right now, unfortunately, but I would imagine that the following program should demonstrate the issue:

(defun do-startup () {
  (looprange i 0 5 {
    (set-duty 0.1)
    (yield 500000)
  })
  (set-duty 0.0)
})

(defun do-shutdown () {
  (looprange i 0 5 {
    (set-duty -0.1)
    (yield 500000)
  })
  (set-duty 0.0)
})

(do-startup)

(event-register-handler (self))
(event-enable 'event-shutdown)

(loopwhile true
  (recv
    (event-shutdown {
      (shutdown-hold true)
      (print "Performing action upon shutdown...")
      (do-shutdown)
      (print "Action complete. Shutting down...")
      (shutdown-hold false)
      ; LBM should terminate now unless
      ; power has been switched back on.
      (yield 500000)
      (print "Shutdown cancelled. Restarting...")
      ; If cancelled, this should work:
      (do-startup)
    })
  )
)

(unreachable)

Upon boot, motor should drive forwards for a bit. Upon shutdown, motor should drive backwards for a bit before the VESC is turned off. Issues appears to arise when cycling the power switch between shutdown-hold true and shutdown-hold false, in that motors won't move anymore and/or the VESC won't turn off afterwards.

Will take a closer look as soon as I have a working setup again, but perhaps that's helpful already?

dcodeIO commented 9 months ago

Got around to testing again meanwhile. Appears that the above program is too stripped-down to show the issues originally described. Will try to improve. However, it seems to show that the shutdown event is emitted only once, whereas subsequent toggles skip the event and shut the VESC down immediately.

Btw, I forgot to mention that the power switch setting used is ALWAYS_OFF. Using a 60_75_MK2 atm, 6.05 beta 22.

dcodeIO commented 9 months ago

Turns out that the responsiveness issues might have been something on my end, whereas the latter issue can reliably be reproduced:

Unfortunately, I wasn't able to pinpoint what's causing this behavior yet. Tried a few changes to the firmware, adding HW_SHUTDOWN_HOLD_ON and/or ENABLE_GATE in suspicious places, but so far to no avail.

HW is 60_75, FW origin/master (beta 24).

dcodeIO commented 9 months ago

Found a hackish way to mitigate the issue. Using the following test program

(def busy false)

(loopwhile-thd 160 true {
  (set 'busy true)
  (puts "Busy for 10 seconds...")
  (set-duty 0.1)
  (yield 10000000)
  (set-duty 0.0)
  (set 'busy false)
  (puts "Idle for 3 seconds...")
  (yield 3000000)
})

(loopwhile-thd 160 true {
  (if busy (timeout-reset))
  (yield 100000)
})

(defun event-handler ()
  (loopwhile true
    (recv
      (event-shutdown {
        (puts "Shutdown event received")
        (shutdown-hold true)
        (if busy {
          (puts "Waiting to become idle...")
          (loopwhile busy {
            (puts "...")
            (yield 1000000)
          })
          (puts "Now idle.")
        })
        (puts "Shutdown.")
        (shutdown-hold false)
        ; LBM should terminate now unless
        ; power has been switched back on.
        (yield 500000)
        (puts "Shutdown cancelled")
        (shutdown-undo) ; <-- HERE
      })
    )
  )
)

(def event-handler-id (spawn event-handler))
(event-register-handler event-handler-id)
(event-enable 'event-shutdown)

where shutdown-undo is a custom extension doing

static lbm_value ext_shutdown_undo(lbm_value *args, lbm_uint argn) {
    (void)args; (void)argn;
    HW_SHUTDOWN_HOLD_ON();
    ENABLE_GATE();
    SHUTDOWN_SET_SAMPLING_DISABLED(false);
    return ENC_SYM_TRUE;
}

the issue can be mitigated. Not a proper fix, but seems to indicate that one of these settings is not reset when shutdown is cancelled for the first time, so the second time the VESC just turns off.


Update: Interestingly, putting the undo logic into shutdown_hold directly, like so

void shutdown_hold(bool hold) {
    if (!hold && !shutdown_button_pressed()) {
        // Shutdown has been cancelled. Make sure state is reset.
        HW_SHUTDOWN_HOLD_ON();
        ENABLE_GATE();
        SHUTDOWN_SET_SAMPLING_DISABLED(false);
    }

    m_shutdown_hold = hold;
}

appears to run too early to mitigate the issue, so there might be something going wrong after shutdown_hold(false) has been invoked. Was hoping that would be a PR-able fix, but sadly isn't.


Update: Found that the relevant call in shutdown-undo is HW_SHUTDOWN_HOLD_ON(), whereas ENABLE_GATE() etc. isn't necessary.

Perhaps the problem originates in HW_SHUTDOWN_HOLD_OFF() being always executed in do_shutdown, even if during while (m_shutdown_hold) { ... } the power switch has been turned back on. Looks like do_shutdown should instead return false in this case, leaving gates and hold as-is. That behavior, however, might not be applicable to all hardware and power switch modes, so do_shutdown might not be the right place to sample the power switch again.


Update: See proposed fix.