Observe and perform state transitions with YARP control modes

PeterBowman commented 5 years ago

YARP's control API exposes several vocabs related to initialization and fault states, per IControlMode.h:

// Read / Write
constexpr yarp::conf::vocab32_t VOCAB_CM_IDLE = yarp::os::createVocab('i','d','l');

// Write only (only from high level toward the joint)
constexpr yarp::conf::vocab32_t VOCAB_CM_FORCE_IDLE = yarp::os::createVocab('f','i','d','l');

// Read only (imposed by the board on special events)
constexpr yarp::conf::vocab32_t VOCAB_CM_HW_FAULT = yarp::os::createVocab('h','w','f','a');
constexpr yarp::conf::vocab32_t VOCAB_CM_CALIBRATING = yarp::os::createVocab('c','a','l'); // the joint is calibrating
constexpr yarp::conf::vocab32_t VOCAB_CM_CALIB_DONE = yarp::os::createVocab('c','a','l','d'); // calibration successfully completed
constexpr yarp::conf::vocab32_t VOCAB_CM_NOT_CONFIGURED = yarp::os::createVocab('c','f','g','n'); // missing initial configuration (default value at start-up)
constexpr yarp::conf::vocab32_t VOCAB_CM_CONFIGURED = yarp::os::createVocab('c','f','g','y'); // initial configuration completed, if any

Their control specs (pdf) describe them as follows:

3.1. ‘Hardware fault’ status

The “hardware fault” status is used to signal both an hardware fault (e.g. broken encoder, overcurrent ecc.), or a user fault (e.g. trying to set a control mode which is not implemented by the control board).

When the board is in “hardware fault”, the controller is off (no PWM is given to the motor).

If a control board is faulted, the user has to send the special command setControlMode(VOCAB_CM_FORCE_IDLE) to reset the fault before choosing any other desired control mode (Figure 3). Requests to switch to other modalities (including VOCAB_CM_IDLE) are ignored. The purpose is to prevent any automatic switch (e.g. from a user software module) from the fault status to an active control status.

If the hardware fault is successfully cleared by the VOCAB_CM_FORCE_IDLE command, getControlMode() will return a standard VOCAB_CM_IDLE. If the fault persists (e.g. a critical sensor is broken) the board will remain in the fault status. Note that VOCAB_CM_FORCE_IDLE is never returned by the method getControlMode().

Sending the VOCAB_CM_FORCE_IDLE special command when the board is not in fault status has the same effect of setting VOCAB_CM_IDLE control mode.

3.2. ‘Calibration in progress’ status

A board which is currently performing the calibration procedure internally sets its control mode to the special status “calibration in progress” .

In this special status the controller is on (PWM is given to the motor). The control algorithm depends on the specific type of calibrator (e.g. absolute encoder, movement to the hardware limit etc). This is not discussed in this document.

In this special status the interaction mode is always ignored.

Transition rules:

The calibration procedure is requested during the initialization phase. See Section 3.3

The calibration procedure is requested by the user during normal operation. In this case:

The joint status is set to calibration in progress.

The calibration algorithm is executed.

After the calibration, the control mode and interaction mode is set to a value decided by the calibrator type. Default value on the iCub platform is: (...)

If the calibration fails, the board sets its internal status to ‘hardware fault’.

3.3. ‘Not Configured’ status

A board which has been just turned on internally sets its control mode to the special status “not configured”.

In this special status the controller is off (no PWM is given to the motor).

In this special status the interaction mode is always ignored.

The following configuration values must be sent to the board in order to complete the configuration phase:

All the controller parameters (PID gains, SW joints limits, max output, max current etc.)

All the calibration parameters (encoders zeroes, etc.) [ONLY FOR JOINTS REQUIRING CALIBRATION]

Transition rules:

If a joint requires calibration, then the joint exists from “not configured” status when the calibration command is received. The control mode is thus set to “calibration in progress”.

If a joint does not require calibration, then the joint exists from “not configured status” when all the controller parameters are received. The control mode is thus set to the default value here recommended: idle status (...).

This task will aim to represent iPOS internal states with the above YARP vocabs. See state diagram (also at https://github.com/roboticslab-uc3m/yarp-devices/issues/120#issuecomment-502392793):

states

Proposed mappings:

VOCAB_CM_NOT_CONFIGURED: initial default value ("control mode" in YARP's parlance)
VOCAB_CM_CONFIGURED: everything set (Operation Enabled) we are about to select a control mode
VOCAB_CM_IDLE: default control mode in Operation Enabled state, we can switch back e.g. from position control to this state in order to stop controlling joints (they can be moved freely)
VOCAB_CM_FORCE_IDLE: transition from Fault to Operation Enabled, resets control mode (VOCAB_CM_IDLE)

PeterBowman commented 5 years ago

Didn't know how to treat calibration vocabs (VOCAB_CM_CALIBRATING, VOCAB_CM_CALIB_DONE), but turns out we could use them for homing procedures (--homePoss):

yarp::dev::IControlCalibration (implemented by CBW2)
yarp::dev::IRemoteCalibrator(implemented by CBW2)
yarp::dev::ICalibrator

PeterBowman commented 5 years ago

Added CiA drive state machine at https://github.com/roboticslab-uc3m/yarp-devices/commit/972f8758d246bbb2a85c40cf5af11bd5855597a0.

PeterBowman commented 5 years ago

Homing procedure refactored at https://github.com/roboticslab-uc3m/yarp-devices/commit/e1995b68e0422d4684a47990e50a798bb0168ce9. Turns out the yarp::dev::IControlCalibration & yarp::dev::ICalibrator combo does not support homing commands (just parking - I guess it was aimed for the iKart - and "usual" calibration). Also, it seems to me it's more dated than yarp::dev::IRemoteCalibrator. This one is meant for use by network wrappers, for instance the ControlboardWrapper device, which registers a configured instance via deferred attach. In short: it won't work in locally instantiated control boards nor remote network wrappers opened with the subdevice mechanism (ref).

So, what can we do now? Homing procedure will be started via launcher app, as usual, I've just renamed the YARP option to --home (was --homePoss). A homing command can be issued on runtime, too, either via IRemoteCalibrator::homingWholePart() (alternatively, homingSingleJoint(int)) or via RPC command [set] [reca] [homs] (single joint: [set] [reca] [hom] j). In addition, yarpmotorgui enables per-joint and whole part homing buttons.

PeterBowman commented 4 years ago

20190926_191122-0-0-0-1

PeterBowman commented 4 years ago

Ideas:

ICanBusSharer should be deprived from CANopen-specific methods such as start, readyToSwitchOn, switchOn and enable (develop). CAN subdevices might not implement this high-level protocol, hence said methods will always remain unimplemented. I'd rather group this extra functionality so that both CANopen and non-CANopen devices share the same method. In recent WIP commits this is called ICanBusSharer::initialize.
Similarly, a finalization method is required. For instance, CuiAbsolute must issue a stop command in continuous (push) mode. Hence, I'm introducing ICanBusSharer::finalize.
Subdevice initialization workflow runs as follows: device instantiation and initial configuration (via call to PolyDriver::open); pass all necessary handles via API (I'd rather do it this way instead of resorting to pointer manipulation via YARP blob properties), e.g. ICanBusSharer::registerSender(CanSenderDelegate *) (WIP name); call ICanBusSharer::initialize.
Within ICanBusSharer::initialize, a CANopen-compatible device will: 1) configure SDOs; 2) issue a NMT start command to enable PDOs; 3) traverse the CiA 402 state machine to the desired initial state.
VOCAB_CM_NOT_CONFIGURED is meant to be the default YARP control mode value in CANopen-enabled devices.
After the "configure SDOs" step is performed, control mode shall be VOCAB_CM_CONFIGURED.
Upon quitting the ICanBusSharer::initialize step, which comprises the CiA 402 state machine substep, a CANopen drive should remain in the "switched on" internal state and its equivalent YARP control mode vocab be VOCAB_CM_IDLE. In other words, idle mode should translate to an internal state in which no torque/PWM is applied to motors, but as few steps as possible are required to reach the final "enable" state.
Issuing a proper motion control mode, e.g. VOCAB_CM_POSITION, translates into issuing the "enable operation" command in control word so that "enable" status is reached. Note that previously idle mode needed to be active.
VOCAB_CM_HW_FAULT is only meaningful when in "fault" state.
VOCAB_CM_FORCE_IDLE resets the fault bit and performs a VOCAB_CM_IDLE internally, afterwards.
On ICanBusSharer::finalize, "switch on disabled" state should be reached in the shortest path, e.g. issue a "disable voltage" command when in "enable" state.

Remarks:

Not sure what to do with calibration vocabs, I might eventually omit these. I'm introducing a new calibrator device in order to perform homing, but position control mode needs to be set anyway, so VOCAB_CM_CALIBRATING (and especially VOCAB_CM_CALIB_DONE) are a bit meaningless to me.
Current state machine implementation checks whether the requested transition is supported given the current drive state, which forces the app to add multiple checks in this regard. Perhaps it is better to make the API user just point at the desired final state so that this implementation performs all necessary intermediate transitions.
Precisely, VOCAB_CM_IDLE should take care of this fact since three internal states are involved. If in switch on disabled state, this command should perform two transitions. A different treatment should be given when beginning from "ready to switch on" and "enable" state, whereas "switch on" is basically a no-op.

PeterBowman commented 4 years ago

Current state machine implementation checks whether the requested transition is supported given the current drive state, which forces the app to add multiple checks in this regard. Perhaps it is better to make the API user just point at the desired final state so that this implementation performs all necessary intermediate transitions.

Added requestState method at https://github.com/roboticslab-uc3m/yarp-devices/commit/4d492af08bedf5a8450d5dcd4851f524f7c657bd.

PeterBowman commented 4 years ago

Regarding object 6041h, Modes of Operation Display:

If the drive is in an inferior state than Operation enabled and object 6060h Modes of operation is changed, object 6061h will take the value of 6060h only after the drive reached Operation enabled state.

All non-state-related YARP control mode vocabs are only meaningful when in operation enabled state (on get), although it is still possible to request (set) their corresponding iPOS mode of operation via SDO at any time.

The CiA 402 diagram is therefore easier to understand in this way:

Operation enabled state: parse actual mode of operation from statusword and translate into YARP vocab.
Fault states: these are the only ones that need to be listened for via callback.
All remaining states: no callback needed, everything is set by the user and known at all times.

In order to make this work, I'm introducing two variables designing YARP control modes: the requested (set) and the actual control mode (get).

PeterBowman commented 4 years ago

Compare these control mode setters (WIP):

https://github.com/roboticslab-uc3m/yarp-devices/blob/8e4a1dfef9700d0fa43a959583a1b3a08bbf2e6e/libraries/YarpPlugins/TechnosoftIpos/IControlModeRawImpl.cpp#L437-L438

This operation will always update an internal variable containing the actual control mode before this blocking instruction completes. Said update occurs via TPDO callback, therefore the implicit semaphore timeout needs to take into account PDO inhibit times.

https://github.com/roboticslab-uc3m/yarp-devices/blob/8e4a1dfef9700d0fa43a959583a1b3a08bbf2e6e/libraries/YarpPlugins/TechnosoftIpos/IControlModeRawImpl.cpp#L408-L410

Here, however, the SDO indication will only acknowledge that the transfer completes. The associated TPDO (modes of operation display) callback in this case is not being listened to while the SDO operation blocks this thread, although there is indeed a wait-with-timeout involved. So, I'm not sure whether the control mode status variable change or, in other words, the TPDO transfer, will occur during or after the SDO download call. To make it sure, the inhibit time could be set high enough (but not higher than the PDO semaphore timeout), or configure an event timer (which in turn needs to set a tiny period...).

PeterBowman commented 4 years ago

Mostly ready at https://github.com/roboticslab-uc3m/yarp-devices/commit/a0725515ff0edfa066fe1e7fb45c0e6f527fea65, I added a few more checkboxes to https://github.com/roboticslab-uc3m/yarp-devices/pull/229.

In addition to the original intent of this issue, the patch:

adds control mode change on certain operations (e.g. make sure we are in position mode when issuing a positionMove);
deletes ITechnosoftIpos.h and removes CANopen-specific methods in ICanBusSharer.hpp (https://github.com/roboticslab-uc3m/yarp-devices/issues/215#issuecomment-535893408);
doesn't assume that TPDO1 is always configured, now the user must set its callback explicitly;
cleans a few headers, applies modern C++11 features (e.g. override);
moves all internal state variables into a new class within the TechnosoftIpos tree;
replaces SDO control mode requests with TPDO2 event listeners.

PeterBowman commented 4 years ago

A few words regarding ICalibrator and IRemoteCalibrator:

Homing commands move the robot to its initial (zero) position on startup.
Park commands actually move the robot to "home" position, e.g. some convenient pose so that the robot may fit its case (cf.. ASIBOT). Those assume that calibration was performed previously.
Calibration involves these steps:
- reset hardware faults
- set safe PWM limits
- call the calibration procedure via IControlCalibration iface, might move joints to hardware limits
- move joints to zero position
- set idle mode

PeterBowman commented 4 years ago

Regarding fault reset command:

Table 5.1.1 – Drive State Transitions

After leaving the state Fault bit 7, Fault Reset of the Controlword has to be cleared by the host.

Table 5.1.2 – Drive States

The drive remains in fault condition, until it receives a Reset Fault command. If following this command, all the bits from the Motion Error Register are reset, the drive exits the fault state

Table 5.2.1 – Bit Assignment in Controlword

Reset Fault. The faults are reset on 0 to 1 transition of this bit. After a Reset Fault command, the master has to reset this bit.

(further down)

The following table lists the bit combinations for the Controlword that lead to the corresponding state transitions. An X corresponds to a bit state that can be ignored. The single exception is the fault reset: The transition is only started by a bit transition from 0 to 1.

Needs testing: https://github.com/roboticslab-uc3m/yarp-devices/commit/b615fb20d1c11804521ded32386d2896bbf1a154.

Edit: tested and actually works at https://github.com/roboticslab-uc3m/yarp-devices/commit/304ea6d1011410256d8559feacb3737f287c0a91, yay.

PeterBowman commented 4 years ago

Technosoft support answered my questions:

The faults are reset on 0 to 1 transition of this bit. After a "Reset Fault" command, the master has to reset this bit. The "Reset Fault" command reset all the bit from the MER (Motion Error Register), but if the error persist, immediately the bits will be set back and the drive will stay in the "Fault" state. If the error condition it disappeared the drive will transition into "Switch On Disabled" state without clearing the bit. Anyway, when you send the "Shutdown" command the bit will be reset.

If you forget to clear the bit nothing will happen, but when you try to send again the "Reset Fault" command you need first to reset the bit and to set again. The fault command is sent when the transition of this bit from 0 to 1 happens. The slow loop of the drive are working at 1 ms, if you sent two messages in less than 1 ms it is possible the drive to not interpreted the transition.

PeterBowman commented 4 years ago

Follow-up from https://github.com/roboticslab-uc3m/yarp-devices/issues/242:

The device is assumed dead or unreachable (perhaps due to bus failure) if no heartbeat message arrives before a timer runs out. Under such circumstance, VOCAB_CM_NOT_CONFIGURED is set and the device is assumed to perform initial configuration anew on next revival.
VOCAB_CM_CONFIGURED is reached right after initial SDO configuration and before the drive leaves "Switch on disabled" state.

roboticslab-uc3m / yarp-devices

Observe and perform state transitions with YARP control modes #215

3.1. ‘Hardware fault’ status

3.2. ‘Calibration in progress’ status

3.3. ‘Not Configured’ status