roboticslab-uc3m / yarp-devices

A place for YARP devices
https://robots.uc3m.es/yarp-devices/

Make position direct mode usable #198

Closed PeterBowman closed 5 years ago

PeterBowman commented 5 years ago

Current (undesired/faulty) behavior:

jgvictores commented 5 years ago

Related? "[TechnosoftIpos] Cannot change control mode some/many times #170"

PeterBowman commented 5 years ago

Can't really tell unless we determine the underlying cause. Even though it's not stated explicitly in the title, #170 originated when trying to perform [torq] control after one or more mode transitions. I'd keep these two tickets alongside each other since a different sequence of steps is described to reproduce their symptoms. Even if this results in a duplicate somewhere, we can point to the same problem from two distinct directions and resolve both at once.

rsantos88 commented 5 years ago

I've been running different tests, changing the mode on different occasions and executing the position direct example (2ebc388fb44440b09400901a7ee4e2457365bfa3) that I developed. This is the result:

20190311_131757

Screen output:

launchManipulation -> issue-change-mode.txt.zip
examplePositionDirect -> examplePositionDirect.txt.zip

rpc commands:

teo@teo-oliver:~$ yarp rpc /teo/rightArm/rpc:i 
>>set pos 3 0
Response: [ok]
>>set pos 3 0
Response: [ok]
>>set icmd cmds (pos pos pos pos pos pos pos)
Response: [ok]
>>get icmd cmds
Response: [is] [cmds] ([pos] [pos] [pos] [pos] [pos] [pos] [pos]) [tsta] 10 1552306524.43345 [ok]
>>set pos 3 0
Response: [ok]
>>set pos 0 5
Response: [ok]
>>set pos 0 5
Response: [ok]
>>set pos 1 -5
Response: [ok]
>>set pos 1 -5
Response: [ok]
>>set pos 1 0
Response: [ok]
>>set pos 3 0
Response: [ok]

Steps in the execution:

  1. examplePositionDirect is started (IPositionControl -> IPositionDirect).
  2. We switched the whole right arm to position mode via RPC.
  3. The first time we tried to move different joints via RPC (joints 0, 1 and 3), they did not react. We sent the position a second time; this time they reacted and we moved them to 0.
  4. We started the application again. It moved to 10 degrees correctly, but then it made a sudden movement (I didn't have time to see it) and I pressed the emergency button immediately.

PeterBowman commented 5 years ago

Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.

See iPOS driver manual, 9.1.1. Internal States.

jgvictores commented 5 years ago

See iPOS driver manual, 9.1.1. Internal States.

Yes, that totally makes sense!

PeterBowman commented 5 years ago

Change modes and move to a certain position after each step, IPositionControl -> IPositionDirect -> IPositionControl -> IPositionDirect. We won't be able to start the PT mode since ptPointCounter is never set to zero:

Done at https://github.com/roboticslab-uc3m/yarp-devices/commit/6dae2643a8b033f52eaa8ac01fe9d3e24c041238, seems to work now.

rsantos88 commented 5 years ago

I'll upload the output of launchManipulation for this issue:

Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.

launchManipulation-leftArm-frontal-wrist-1503.txt.zip

PeterBowman commented 5 years ago

launchManipulation-leftArm-frontal-wrist-1503.txt.zip

I'm really baffled that only CAN IDs 21 (left arm) and 27 (head) are responding in the CBW2 state thread, almost no other IDs are shown there (just a few PDO-related lines). Perhaps the acceptance filters are misbehaving?

By the way, I noticed that the status word (6041h) transitions from 8637h to 9237h upon issuing a positionMove() that won't move the robot, in CAN ID 21 (shoulder). According to 8.1.4. Status word in profile position mode, the set-point acknowledge bit is set (= "Trajectory generator will not accept a new set-point.") and the target reached one, reset (= "Target position not reached").
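To double-check the reading of those two status words, here is a minimal sketch that decodes the relevant bits. It assumes the standard CiA 402 layout for profile position mode (bit 10 = target reached, bit 12 = set-point acknowledge); the helper names are made up for illustration and are not part of our driver.

```cpp
#include <cassert>
#include <cstdint>

// CiA 402 statusword (object 6041h) bits used in profile position mode.
constexpr std::uint16_t TARGET_REACHED = 1u << 10;
constexpr std::uint16_t SET_POINT_ACKNOWLEDGE = 1u << 12;

// True if the drive reports that the target position has been reached.
constexpr bool targetReached(std::uint16_t statusword)
{
    return (statusword & TARGET_REACHED) != 0;
}

// True if the trajectory generator has taken the set-point
// and will not accept a new one yet.
constexpr bool setPointAcknowledged(std::uint16_t statusword)
{
    return (statusword & SET_POINT_ACKNOWLEDGE) != 0;
}

// Bits that differ between two consecutive statusword readings.
constexpr std::uint16_t changedBits(std::uint16_t before, std::uint16_t after)
{
    return static_cast<std::uint16_t>(before ^ after);
}
```

With 8637h and 9237h as inputs, only bits 10 and 12 differ (mask 1400h), which matches the description above.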

Edit: might be a coincidence (well, it is), but the following three calls issued by setPositionRaw() are interleaved with incoming CAN messages whenever this command fails:

[success] IPositionControl2RawImpl.cpp:33 positionMoveRaw(): Sent "position target". 23 7a 60 0 71 2c 0 0. canId(26) via(600).
[success] IPositionControl2RawImpl.cpp:43 positionMoveRaw(): Sent "start position". 3f 0. canId(26) via(200).
[success] IPositionControl2RawImpl.cpp:55 positionMoveRaw(): Sent "reset position". f 0. canId(26) via(200).

I mean:

[success] IControlMode2RawImpl.cpp:272 getControlModeRaw3(): [success] IPositionControl2RawImpl.cpp:55 positionMoveRaw(): Sent Motion Error Register query. 40 0 20 0 0 0 0 0. canId(21) via(600).
Sent "reset position". f 0. canId(26) via(200).
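For reference, the three frames logged by positionMoveRaw() above can be reproduced with a small sketch. It assumes that "via(600)" and "via(200)" are the usual CANopen function-code bases (SDO request and RPDO1) added to the node ID, and that the first frame is an expedited SDO download to object 607Ah (target position). All function and type names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using SdoPayload = std::array<std::uint8_t, 8>;
using PdoPayload = std::array<std::uint8_t, 2>;

// COB-ID = function code base + node ID (e.g. 600h for SDO requests).
constexpr std::uint16_t cobId(std::uint16_t base, std::uint8_t nodeId)
{
    return static_cast<std::uint16_t>(base + nodeId);
}

// Expedited SDO download of a 32-bit value: command specifier 23h,
// then index, subindex and data, all little-endian.
SdoPayload sdoDownload32(std::uint16_t index, std::uint8_t subindex, std::int32_t value)
{
    const auto raw = static_cast<std::uint32_t>(value);
    return {{
        0x23,
        static_cast<std::uint8_t>(index & 0xFF),
        static_cast<std::uint8_t>(index >> 8),
        subindex,
        static_cast<std::uint8_t>(raw & 0xFF),
        static_cast<std::uint8_t>((raw >> 8) & 0xFF),
        static_cast<std::uint8_t>((raw >> 16) & 0xFF),
        static_cast<std::uint8_t>((raw >> 24) & 0xFF)
    }};
}

// Two-byte controlword payload (object 6040h), little-endian.
PdoPayload controlword(std::uint16_t value)
{
    return {{
        static_cast<std::uint8_t>(value & 0xFF),
        static_cast<std::uint8_t>(value >> 8)
    }};
}
```

Under these assumptions, sdoDownload32(0x607A, 0, 0x2C71) yields "23 7a 60 0 71 2c 0 0", while controlword(0x003F) and controlword(0x000F) yield "3f 0" ("start position") and "f 0" ("reset position"), respectively.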
rsantos88 commented 5 years ago

The same test done now at 12:10 with left wrist (ID26) oneCanBusOneWrapper-posd-2003.txt.zip

rsantos88 commented 5 years ago

Control word test: oneCanBusOneWrapper-posd-2003-cw.txt.zip

PeterBowman commented 5 years ago

Remarks:

PeterBowman commented 5 years ago

Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.

Done at https://github.com/roboticslab-uc3m/yarp-devices/commit/2616482c958fc17787884e137880ca2f38b98db4. The "10. Reset the set point." command has been replicated and is now also sent prior to setting the target position in positionMoveRaw(). We noticed that 8.3.1. Absolute trapezoidal example assumes the user waits for the motion to complete between "8. Start the profile." and "10. Reset the set point." (see "9. Wait movement to finish.") before sending the next target, therefore our first attempt was to move said reset-related lines to the top of this function. It did work in [pos] control mode, but prevented the trajectory from executing upon switching again to [posd]. As a workaround, we reset the set point twice in positionMoveRaw().

PeterBowman commented 5 years ago

Change modes and move to a certain position after each step

Still broken. Previous (successful) tests did not cover a full arm, rather a single joint. This time, on the second attempt (the first one is always successful, as noted above), we observe that the commanded joint rotates a bit in the beginning, then stops responding to new target points.

rsantos88 commented 5 years ago

Still broken. Previous (successful) tests did not cover a full arm, rather a single joint. This time, on the second attempt (the first one is always successful, as noted above), we observe that the commanded joint rotates a bit in the beginning, then stops responding to new target points.

We detected the change that triggers the error in the example, although we'd still need to find out what is failing at a low level. If we replace pos->positionMove(JOINT, 10) with pos->positionMove(v.data()), the example runs correctly.

rsantos88 commented 5 years ago

Test summary:

rsantos88 commented 5 years ago

Last test, prints current pending reads in CBW2: (updated to OneCanBusOneWrapper) OneCanBusOneWrapper-0904-2.txt.zip

PeterBowman commented 5 years ago

We've learned that the ptBuffer semaphore is the root of all evil. The device simply stops accepting new PT-mode target points since program execution is stuck at this ptBuffer.wait() instruction, which in turn causes YARP to complain about a growing queue of (unprocessed) messages. We believe that this semaphore might not be properly released on the first run of examplePositionDirect (relevant lines). Also, note that nothing prevents us from calling post() more than once per wait() call, that is, the internal counter of the semaphore may exceed the value 1. In the log file previously attached, it's worth noting that the "pt buffer full" message is not symmetric to "pt buffer empty" (just 3 occurrences of the latter on the second run) and relates to IDs 24/25/26, mostly.

This is highly undesirable and leads to undefined behavior, to say the least. I would rethink the whole buffer full/empty handling mechanism. See https://github.com/roboticslab-uc3m/teo-bimanipulation/issues/3#issuecomment-444611272.
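To illustrate the point about post() and wait(), here is a single-threaded model of a counting semaphore's internal counter. It is only a sketch: the class is a hypothetical stand-in (no locking, no blocking) and not the semaphore type used in the device.

```cpp
#include <cassert>

// Models only the counter of a counting semaphore.
class SemaphoreCounterModel
{
public:
    explicit SemaphoreCounterModel(unsigned initial = 0) : count(initial) {}

    // Every post() increments the counter, no matter how many came before.
    void post() { ++count; }

    // Non-blocking stand-in for wait(): succeeds only if the counter is positive.
    bool tryWait()
    {
        if (count == 0)
        {
            return false; // a real wait() would block here
        }

        --count;
        return true;
    }

    unsigned value() const { return count; }

private:
    unsigned count;
};
```

If "buffer empty" is signalled twice before a single "buffer full", the counter reaches 2 and the next two wait() calls pass straight through. Conversely, once the counter is 0, wait() blocks until somebody posts, which is the situation we seem to hit on the second run.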

jgvictores commented 5 years ago

Attaching the oldest code I've found: locomotion.zip (from private repo which has some interesting stuff)

PeterBowman commented 5 years ago

Regarding

https://github.com/roboticslab-uc3m/yarp-devices/blob/a023145ed8a77cbeca0109c405855179654bf282/libraries/YarpPlugins/TechnosoftIpos/ICanBusSharerImpl.cpp#L255-L266

and 9.2.3. Object 2072h: Interpolated position mode status:

Remark: when a status bit changes from this object, an emergency message with the code 0xFF01 will be generated. This emergency message will have mapped object 2072h data onto bytes 3 and 4.

Shouldn't the "pt buffer empty" check for value 80h, instead? That is, when bit 15 is set: "Buffer is empty – there is no point in the buffer."

Edit: this actually might explain why we keep seeing "pt buffer empty" messages next to a "pt buffer full", which makes little sense (the buffer should never deplete so fast). Also, I believe the PT examples in the iPOS manual are wrong regarding the buffer low warning: note that we never enable bit 7 in 2074h (section 9.2.5), and neither does the example. This would imply that: 1. perhaps buffer low's default is 00h; 2. the MSB in 2072h is almost always zero (does it make sense, shouldn't that produce constant buffer-empty messages?).
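A minimal sketch of the check I have in mind, assuming that "bytes 3 and 4" are zero-based indices into the 8-byte emergency payload and that 2072h is mapped little-endian (low byte first). Only bit 15 is taken from the manual excerpt above; the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using EmcyPayload = std::array<std::uint8_t, 8>;

// Emergency error code FF01h is transmitted low byte first.
bool isInterpolationStatusEmcy(const EmcyPayload & data)
{
    return data[0] == 0x01 && data[1] == 0xFF;
}

// Rebuild object 2072h from payload bytes 3 (low) and 4 (high).
std::uint16_t interpolationStatus(const EmcyPayload & data)
{
    return static_cast<std::uint16_t>(data[3] | (data[4] << 8));
}

// Bit 15 set: "Buffer is empty - there is no point in the buffer."
bool bufferEmpty(std::uint16_t status)
{
    return (status & 0x8000) != 0;
}
```

Under these assumptions, "buffer empty" corresponds to the high byte (index 4) having its most significant bit set, i.e. 80h when no other bit in that byte is active.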

PeterBowman commented 5 years ago

Attaching the oldest code I've found: locomotion.zip (from private repo which has some interesting stuff)

This code set the buffer-low signaling value to 4; buffer-empty was never checked. There was an attempt to change the default buffer size (full) to 10792 (2A28h), which was ultimately commented out.

Considering that the buffer-empty state might not be currently reached in normal PT-mode operation (see previous comment), we don't know whether intermittent depletion of the buffer could entail undesirable effects (@jgvictores).

rsantos88 commented 5 years ago

Test with oneCanBusOneWrapper, prints the current changes of branch https://github.com/roboticslab-uc3m/yarp-devices/tree/verbose-posd with the position direct example. File: OneCanBusOneWrapper-1004.txt.zip

PeterBowman commented 5 years ago

Shouldn't the "pt buffer empty" check look for value 80h, instead? That is, bit 15 set: "Buffer is empty – there is no point in the buffer."

Well, I can confirm the buffer is definitely not empty until the very end of the motion sequence. The buffer-low bit is set, too, in this case. After the last setPosition() call, and around 6-7 cycles later, a single buffer-low message is issued, then buffer-empty steps in. Also, on the second (failed) run, the semaphore gets stuck right after signalling a buffer-full state.

PeterBowman commented 5 years ago

Upon reception, each PT point is stored in a reception buffer. The reference generator empties the buffer as the PT points are executed. The drive/motor automatically sends warning messages when the buffer is full, low or empty. The buffer full condition occurs when the number of PT points in the buffer is equal with the buffer size. The buffer low condition occurs when the number of PT points in the buffer is less or equal with a programmable value. The buffer empty condition occurs when the buffer is empty and the execution of the last PT point is over.

The PT buffer size is programmable and if needed can be substantially increased. By default it is set to 7 PT points.
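The three conditions quoted above can be modelled as follows. This is only a sketch of the manual's wording, not drive firmware: the struct and its field names are hypothetical, the default size of 7 comes from the excerpt, and the low threshold of 4 is the value used by the old code mentioned earlier in this thread.

```cpp
#include <cassert>

struct PtBufferModel
{
    unsigned size = 7;              // default buffer size, in PT points
    unsigned lowThreshold = 4;      // programmable buffer-low value (assumed here)
    unsigned points = 0;            // PT points currently stored
    bool executingLastPoint = false;

    // Full: the number of stored points equals the buffer size.
    bool full() const
    {
        assert(points <= size);
        return points == size;
    }

    // Low: the number of stored points is less than or equal to the threshold.
    bool low() const { return points <= lowThreshold; }

    // Empty: no stored points and the last PT point has finished executing.
    bool empty() const { return points == 0 && !executingLastPoint; }
};
```

Note that, under this reading, a buffer with zero stored points is "low" but not yet "empty" while the last point is still being executed, which would be consistent with a buffer-low message showing up a few cycles before buffer-empty.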

smcdiaz commented 5 years ago

Good comment. I don't remember if I talked about that with you or only with @rsantos88. Using these messages would be a good method for controlling movement and avoiding interruptions.

PeterBowman commented 5 years ago

Ideas:

PeterBowman commented 5 years ago

External Reference Position Mode: perhaps?

Tested today, we learned that its behavior closely resembles our IPositionDirect interface implementation in OpenRAVE: try to move as fast as possible to the demanded position. We did not manage to make the speed limit work.

PeterBowman commented 5 years ago

Useful robotology PDFs regarding control mode implementation in the iCUB robot: doc (slides). Source: http://wiki.icub.org/wiki/Control_Modes.

jgvictores commented 5 years ago

Useful robotology PDFs regarding control mode implementation in the iCUB robot: doc (slides). Source: http://wiki.icub.org/wiki/Control_Modes.

@PeterBowman Thanks! Added to https://github.com/robotology/yarp/issues/1105

PeterBowman commented 5 years ago

I reached out to the Technosoft support team and learned that:

PeterBowman commented 5 years ago

The Technosoft iPOS CANopen Programming User Manual (ref. P091.063.iPOS.STO.UM.0117, 2017) adds a Cyclic Synchronous Position mode (CSP) in section 10:

With this mode, the trajectory generator is located in the control device, not in the drive device. In cyclic synchronous manner, it provides a target position to the drive device, which performs position control, velocity control and torque control. Measured by sensors, the drive provides actual values for position, velocity and torque to the control device.

Section 9.2.1, Object 60C0h: Interpolation sub mode select (in Interpolated Position Mode), states that PT and PVT submodes should be regarded as legacy stuff. The preferred submode is introduced as "linear interpolation":

Linear interpolation as described in the CiA 402 standard (when object 208Eh bit8=1); This mode is almost identical with Cyclic Synchronous Position mode, only that it receives its position data into 60C1h sub-index 01 instead of object 607Ah. No interpolation point buffer will be used.

From About This Manual:

The iPOS drives are conforming to CiA 301 v4.2 application layer and communication profile, CiA WD 305 v.2.2.131 Layer Setting Services and to CiA (DSP) 402 v4.0 device profile for drives and motion control, now included in IEC 61800-7-1 Annex A, IEC 61800-7-201 and IEC 61800-7-301 standards.

P091.063.iPOS.UM.0615 (2015) states that:

The iPOS drives are confirming to CiA 301 v4.2 application layer and communication profile, CiA WD 305 v.2.2.131 Layer Setting Services and to CiA DSP 402 v3.0 device profile for drives and motion control (...)

So, no CSP mode in CiA DSP 402 v3.0. We'd probably need to perform a firmware update on our drives in order to use this mode.

Interestingly, from section 10.2.1, Object 60C2h: Interpolation time period:

The Interpolation time period indicates the configured interpolation cycle time. Its value must be set with the time value of the CANopen master communication cycle time and sync time in order for the Cyclic Synchronous Position mode to work properly.

Remark: due to the limitations of the CAN network, it is recommended that the interpolation time period should not be set lower than 4 ms.

Besides, in 10.1.1, Controlword in Cyclic Synchronous Position mode (CSP):

In Relative position mode, the drive will add to its current position the value received in object 607Ah. By sending this value periodically and setting the correct interpolation period time in object 60C2h, it will be like working in Cyclic Synchronous Velocity mode (CSV).
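The arithmetic behind that last remark can be sketched as follows: sending a fixed position increment every interpolation period is equivalent to commanding increment/period as a velocity. Units ("position units per second") and function names are assumptions for illustration; the 4 ms lower bound is the recommendation quoted above.

```cpp
#include <cassert>
#include <cmath>

constexpr double MIN_RECOMMENDED_PERIOD_MS = 4.0;

// The manual recommends not going below 4 ms on a CAN network.
bool isRecommendedPeriod(double periodMs)
{
    return periodMs >= MIN_RECOMMENDED_PERIOD_MS;
}

// Relative increment to write into 607Ah on every cycle
// in order to emulate the desired velocity.
long long incrementPerCycle(double velocityUnitsPerSecond, double periodMs)
{
    assert(periodMs > 0.0);
    return std::llround(velocityUnitsPerSecond * periodMs / 1000.0);
}

// Velocity that results from sending a fixed increment every period.
double equivalentVelocity(long long increment, double periodMs)
{
    assert(periodMs > 0.0);
    return static_cast<double>(increment) * 1000.0 / periodMs;
}
```

For instance, 5000 units/s at a 4 ms period translates to 20 units per cycle. Since the increment is rounded to an integer, slow velocities lose resolution as the period gets shorter.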

PeterBowman commented 5 years ago

Found by @jgvictores at https://www.technosoftmotion.com/en/features (beware: 2006):

Position and speed control with 100 µs update rate - for very high dynamic applications a new control strategy has been added. This performs the position and speed control with an update rate of 100 µs, and does linear interpolation between the reference points provided by the reference generator at each 1 ms.

PeterBowman commented 5 years ago

As I understand it, the external reference position mode is stupid in that it doesn't perform any control (position, velocity, current) nor interpolation on the drive side. Position profile mode places the burden of control and trajectory generation on the drive. There is also a need for interpolation (i.e. trajectory generation) on legacy PT/PVT modes, plus an input buffer is supported. Control is performed by the drive in CSP mode, but the application (CAN master) is now responsible for the trajectory/interpolation step.

Perhaps we want CSP mode instead of external reference position (and CSV instead of external reference speed). My hunch is that CSP/CSV is harder to master - according to the examples, we'd be forced to send messages at the precise rate (regarded both as the interpolation time and the SYNC signal) from the CAN master to the drives. However, YARP comms are involved in this scenario - in our case, trajectory generation occurs outside the robot.

Another remark: we like PT/PVT since it resembles RT behavior. Yes, but this is only a concern of single joints - equally time-spaced points are going to be executed per joint, but synchronization across joints is not guaranteed. Perhaps the SYNC signal could be involved in this task.

PeterBowman commented 5 years ago

Current plan:

Side quests:

jgvictores commented 5 years ago

Yes, I guess we may end up with very similar code on that side. Cannot find good alternative.

PeterBowman commented 5 years ago

We did not manage to make the speed limit work.

Fixed. YARP's position-direct mode is now implemented with the iPOS-specific external reference position mode and was tested successfully today. WIP in the nuke-posd branch, commit https://github.com/roboticslab-uc3m/yarp-devices/commit/622dbf0de971f06f9d154ebe3732bb2ee264cdcb might be merged into develop soon.

PeterBowman commented 5 years ago

https://github.com/roboticslab-uc3m/yarp-devices/issues/198#issuecomment-487386797

Repurpose the IPositionDirect implementation, command the iPOS drives in the external reference position mode, take care of setting the speed limitation.

Done, awaiting some final tests.

Enable PVT interpolation via IRemoteVariables:

Split into https://github.com/roboticslab-uc3m/yarp-devices/issues/208. This issue only focuses on posd mode, I've updated the description accordingly.

PeterBowman commented 5 years ago

Done, awaiting some final tests.

For the sake of caution, no tests shall be performed until - and including - next Thursday, May 9th (@rsantos88). A video recording will be held on that date.

After that, I'd like to test extensively our new online trajectory execution with some kind of joystick (see https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173). Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).

PeterBowman commented 5 years ago

After that, I'd like to test extensively our new online trajectory execution with some kind of joystick

Ran into this: "No response to user commands after the application is relaunched." (https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173#issuecomment-493443729).

Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).

The CAN master would need to take account of the frequency at which commands are sent. This statement does not cope well with my reasoning about velocity checks at https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173#issuecomment-493191279, so I'd need to take another look at it.

PeterBowman commented 5 years ago

New posd implementation merged into develop at https://github.com/roboticslab-uc3m/yarp-devices/commit/36f8fad94ea34c5046491540d3361d7e84e73de8, not fully ready yet (see previous comment).

PeterBowman commented 5 years ago

Found by @jgvictores at https://www.technosoftmotion.com/en/features (beware: 2006):

Position and speed control with 100 µs update rate - for very high dynamic applications a new control strategy has been added. This performs the position and speed control with an update rate of 100 µs, and does linear interpolation between the reference points provided by the reference generator at each 1 ms.

According to https://github.com/roboticslab-uc3m/yarp-devices/issues/210#issuecomment-493665173, the sampling period for fast loop and slow loop control is 0.1 and 1 ms, respectively.

PeterBowman commented 5 years ago

Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).

Also, blocked by https://github.com/roboticslab-uc3m/yarp-devices/issues/188.

PeterBowman commented 5 years ago

Even if the reference speed is updated while in external reference position mode, its new values are ignored by the drive upon trying to reach the next targets. Also, this mode is sometimes unresponsive upon pos->posd->pos->posd, not so often when running the examplePositionDirect app.

Considering switching back to PT mode for online trajectories, leaving PVT aside for offline stuff.

PeterBowman commented 5 years ago

Considering switching back to PT mode for online trajectories, leaving PVT aside for offline stuff.

Implemented and tested. I'm working on this along with https://github.com/roboticslab-uc3m/yarp-devices/issues/208 since this solution uses the IRemoteVariables YARP interface to pass a ptModeMs parameter over the network: <0 means "not in position direct mode" (default), 0 means "use external reference position mode", >0 means "use PT interpolation mode with this period (in milliseconds)".
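The three value ranges of ptModeMs can be summarized with a small sketch. The enum and function names are hypothetical and only mirror the description above; the actual device code may structure this differently.

```cpp
#include <cassert>

enum class PosdMode
{
    NotPositionDirect,         // ptModeMs < 0 (default)
    ExternalReferencePosition, // ptModeMs == 0
    PtInterpolation            // ptModeMs > 0, value is the PT period in ms
};

// Map the remote variable to the mode it selects.
PosdMode interpretPtModeMs(int ptModeMs)
{
    if (ptModeMs < 0)
    {
        return PosdMode::NotPositionDirect;
    }

    if (ptModeMs == 0)
    {
        return PosdMode::ExternalReferencePosition;
    }

    return PosdMode::PtInterpolation;
}
```

For example, a value of 50 would select PT interpolation with a 50 ms period.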

I noticed that device routing in the ControlBoardWrapper layer is kinda messy regarding how IRemoteVariables methods are interpreting incoming/outgoing bottles, see source code. For instance, we'd need to build a two-element bottle if we ever want to send a remote variable to the head part, one element per CanBusControlboard instance. This won't affect both arms and legs since there is a one-to-one mapping (controlboardwrapper2-to-CanBusControlboard), but we are still forced to wrap any data sent in an additional bottle (e.g. (50) instead of just 50 for ptModeMs).

In addition, this behavior introduces an asymmetry in that said bottle would need to be constructed in a slightly different manner in case the CanBusControlboard device is not attached to a network wrapper.

PeterBowman commented 5 years ago

Recent tests showed that online trajectories performed in the new position direct mode, (re)implemented as PT interpolation, suddenly stop roughly at the 150th-200th commanded target. The drives had turned unresponsive after that point: no lines received in the CAN read stream (interpretMessage) despite reporting apparently successful status queries. A close inspection in EasyMotion did not reveal much - I could only learn that all four position trigger bits in object 1002h (Manufacturer Status Register) had been toggled to zero. This behavior was observed in all drives accounting for the right arm (firmware F508D) except id 19 (firmware F508F).

Everything worked like a charm after I commented out the bits that set the PT buffer length to its maximum value as detailed in https://github.com/roboticslab-uc3m/yarp-devices/issues/198#issuecomment-487279910. This buffer has never been modified in that way before, even if we thought it was modified by us to store 15 points by the following lines:

https://github.com/roboticslab-uc3m/yarp-devices/blob/e26db48e8a0d6582df8ca18583b55f8b3bfba898/libraries/YarpPlugins/TechnosoftIpos/IControlModeRawImpl.cpp#L211-L215

This has been fixed in a recent revision of the iPOS CAN user manual: should have read object 2073h (Interpolated position buffer length) instead of 2074h (Interpolated position buffer configuration).

rsantos88 commented 5 years ago

Reminder: When I do pos->posd->pos->posd, it doesn't change to PT mode and the launchManipulation shows:

[debug] IRemoteVariablesImpl.cpp:50 setRemoteVariable(): ptModeMs
[debug] IRemoteVariablesRawImpl.cpp:37 setRemoteVariableRaw(): ptModeMs
[warning] IRemoteVariablesRawImpl.cpp:45 setRemoteVariableRaw(): Illegal state, unable to change variable when currently used.
[warning] IRemoteVariablesImpl.cpp:58 setRemoteVariable(): Unable to set remote variable on node 0.
PeterBowman commented 5 years ago

Could you upload the latest code? I don't see the changes we had to introduce in this method. There is a possibility the CAN network has nothing to do here, perhaps a state variable is not being correctly cleared on the iPOS-YARP device side, or some tweaks need to be made in your app.

rsantos88 commented 5 years ago

Could you upload the latest code? I don't see the changes we had to introduce in this method. There is a possibility the CAN network has nothing to do here, perhaps a state variable is not being correctly cleared on the iPOS-YARP device side, or some tweaks need to be made in your app.

https://github.com/roboticslab-uc3m/teo-bimanipulation/commit/bc0a8e12ffc0fa40962def68994d9910d98540cd

PeterBowman commented 5 years ago

Please run your application twice, make sure this error happens on the second run, and paste here the complete logs of launchManipulation.

rsantos88 commented 5 years ago

Here you have the output information of launchManipulation: launchManipulation-pvt-test-3005.txt.zip

And the output of my app: output.txt.

PeterBowman commented 5 years ago

Identified at https://github.com/roboticslab-uc3m/yarp-devices/issues/211#issuecomment-497280755. Suggested workaround: edit launchManipulation.ini, remove head joints.