Closed PeterBowman closed 5 years ago
Related? "[TechnosoftIpos] Cannot change control mode some/many times #170"
Can't really tell unless we determine the underlying cause. Even though it's not stated explicitly in the title, #170 originated when trying to perform [torq] control after one or more mode transitions. I'd keep these two tickets alongside each other since a different sequence of steps is described to reproduce their symptoms. Even if this results in a duplicate somewhere, we can point to the same problem from two distinct directions and resolve both at once.
I've been running different tests, changing the mode on different occasions, and executing the position direct example (2ebc388fb44440b09400901a7ee4e2457365bfa3) that I developed. These are the results:
Screen output:
launchManipulation -> issue-change-mode.txt.zip
examplePositionDirect -> examplePositionDirect.txt.zip
rpc commands:
teo@teo-oliver:~$ yarp rpc /teo/rightArm/rpc:i
>>set pos 3 0
Response: [ok]
>>set pos 3 0
Response: [ok]
>>set icmd cmds (pos pos pos pos pos pos pos)
Response: [ok]
>>get icmd cmds
Response: [is] [cmds] ([pos] [pos] [pos] [pos] [pos] [pos] [pos]) [tsta] 10 1552306524.43345 [ok]
>>set pos 3 0
Response: [ok]
>>set pos 0 5
Response: [ok]
>>set pos 0 5
Response: [ok]
>>set pos 1 -5
Response: [ok]
>>set pos 1 -5
Response: [ok]
>>set pos 1 0
Response: [ok]
>>set pos 3 0
Response: [ok]
Steps in the execution:
Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.
See iPOS driver manual, 9.1.1. Internal States.
Yes, that totally makes sense!
Change modes and move to a certain position after each step, IPositionControl -> IPositionDirect -> IPositionControl -> IPositionDirect. We won't be able to start the PT mode since ptPointCounter is never set to zero:
Done at https://github.com/roboticslab-uc3m/yarp-devices/commit/6dae2643a8b033f52eaa8ac01fe9d3e24c041238, seems to work now.
I'll upload the output of launchManipulation for this issue:
Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.
I'm really baffled that only CAN IDs 21 (left arm) and 27 (head) are responding in the CBW2 state thread, almost no other IDs are shown there (just a few PDO-related lines). Perhaps the acceptance filters are misbehaving?
By the way, I noticed that the status word (6041h) transitions from 8637h to 9237h upon issuing a positionMove() that won't move the robot, in CAN ID 21 (shoulder). According to 8.1.4. Status word in profile position mode, the set-point acknowledge bit is set (= "Trajectory generator will not accept a new set-point.") and the target reached bit is reset (= "Target position not reached").
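To make the two status words easier to compare, here is a minimal sketch that extracts only the two flags discussed above. It assumes the standard CiA 402 bit positions for profile position mode (bit 10 = target reached, bit 12 = set-point acknowledge); the function and constant names are made up for illustration and are not part of the driver.

```python
# Bit positions assumed from CiA 402 profile position mode (6041h).
TARGET_REACHED = 1 << 10
SET_POINT_ACK = 1 << 12


def decode_statusword(sw):
    """Return the two profile-position flags of a 16-bit status word."""
    return {
        "target_reached": bool(sw & TARGET_REACHED),
        "set_point_acknowledge": bool(sw & SET_POINT_ACK),
    }


# The two values observed on CAN ID 21.
for sw in (0x8637, 0x9237):
    print(f"{sw:04X}h -> {decode_statusword(sw)}")
```

With these bit positions, 8637h has target reached set and set-point acknowledge cleared, while 9237h is the other way around, which matches the reading given above.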
Edit: might be a coincidence (well, it is), but the following three calls issued by setPositionRaw() are interleaved with incoming CAN messages whenever this command fails:
[success] IPositionControl2RawImpl.cpp:33 positionMoveRaw(): Sent "position target". 23 7a 60 0 71 2c 0 0. canId(26) via(600).
[success] IPositionControl2RawImpl.cpp:43 positionMoveRaw(): Sent "start position". 3f 0. canId(26) via(200).
[success] IPositionControl2RawImpl.cpp:55 positionMoveRaw(): Sent "reset position". f 0. canId(26) via(200).
I mean:
[success] IControlMode2RawImpl.cpp:272 getControlModeRaw3(): [success] IPositionControl2RawImpl.cpp:55 positionMoveRaw(): Sent Motion Error Register query. 40 0 20 0 0 0 0 0. canId(21) via(600).
Sent "reset position". f 0. canId(26) via(200).
The same test done now at 12:10 with left wrist (ID26) oneCanBusOneWrapper-posd-2003.txt.zip
>>get encs
Response: [is] encs (20.003027) [tsta] 3 1553080045.375092 [ok]
>>get icmd cmds
Response: [is] [cmds] ([posd]) [tsta] 4 1553080049.250982 [ok]
>>set icmd cmod 0 pos
Response: [ok]
>>get icmd cmds
Response: [is] [cmds] ([pos]) [tsta] 5 1553080057.332439 [ok]
>>set pos 0 0 (not working)
Response: [ok]
>>set pos 0 0 (working)
Response: [ok]
Control word test: oneCanBusOneWrapper-posd-2003-cw.txt.zip
Remarks:
- positionMove, the status word transitions from 96xxh to 86xxh (9h means that the set-point acknowledge bit is set = "Trajectory generator will not accept a new set-point.")
- positionMove, the status word transitions from 86xxh to 92xxh (2h means that the target reached bit is reset = "Target position not reached", I guess the motor is still rotating).
- positionMove is issued. Bit 8 means Halt:
Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.
Done at https://github.com/roboticslab-uc3m/yarp-devices/commit/2616482c958fc17787884e137880ca2f38b98db4. The "10. Reset the set point." command has been replicated and now is sent prior to setting the target position in positionMoveRaw(). We noticed that 8.3.1. Absolute trapezoidal example assumes the user waits for the motion to complete between "8. Start the profile." and "10. Reset the set point." (see "9. Wait movement to finish.") before sending the next target, therefore our first attempt was to move said reset-related lines to the top of this function. It did work in [pos] control mode, but prevented the trajectory from executing upon switching again to [posd]. As a workaround, we reset the set point twice in positionMoveRaw().
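For reference, the frame sequence described above can be sketched as follows. This is not the driver code: the helper names are hypothetical, and the ordering (reset, target, start, reset) is my reading of the workaround. The COB-IDs follow the CANopen predefined connection set (600h + node ID for SDO requests, 200h + node ID for RPDO1), which agrees with the via(600)/via(200) values in the logs.

```python
import struct


def sdo_write_i32(node_id, index, subindex, value):
    """Expedited SDO download of a 4-byte value: returns (COB-ID, payload)."""
    payload = struct.pack("<BHBi", 0x23, index, subindex, value)
    return 0x600 + node_id, payload


def controlword_pdo(node_id, controlword):
    """Control word sent through RPDO1: returns (COB-ID, payload)."""
    return 0x200 + node_id, struct.pack("<H", controlword)


def position_move_frames(node_id, target):
    """Assumed sequence: the set point is reset before and after the move."""
    return [
        controlword_pdo(node_id, 0x000F),            # extra reset of the set point
        sdo_write_i32(node_id, 0x607A, 0, target),   # "position target"
        controlword_pdo(node_id, 0x003F),            # "start position"
        controlword_pdo(node_id, 0x000F),            # "reset position"
    ]


# Same node and target bytes as in the log lines for CAN ID 26.
for cob_id, data in position_move_frames(26, 0x2C71):
    print(f"{cob_id:03X}: {data.hex(' ')}")
```

The second frame reproduces the "23 7a 60 0 71 2c 0 0" payload seen in the positionMoveRaw() log.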
Change modes and move to a certain position after each step
Still broken. Previous (successful) tests did not cover a full arm, rather a single joint. This time, on the second attempt (the first one is always successful, as noted above), we observe that the commanded joint rotates a bit in the beginning, then stops responding to new target points.
I've tracked down the change that triggers the error in the example, although we still need to find out what is failing at the low level. If we replace pos->positionMove(JOINT, 10) with pos->positionMove(v.data()), the example runs correctly.
Test summary:
POSITION mode and we change also all the joints to POSITION DIRECT
- (1) positionMove(JOINT, 10), (2) setPosition(JOINT, encValue) :heavy_check_mark:
- (1) positionMove(JOINT, 10), (2) setPosition(jointPosition.data()) :x:
- (1) positionMove(jointPosition.data()), (2) setPosition(JOINT, encValue) :heavy_check_mark:
- (1) positionMove(jointPosition.data()), (2) setPositions(jointPosition.data()) :heavy_check_mark:
Last test, prints current pending reads in CBW2 (updated to OneCanBusOneWrapper): OneCanBusOneWrapper-0904-2.txt.zip
We've learned that the ptBuffer semaphore is the root of all evil. The device simply stops accepting new PT-mode target points since program execution is stuck at this ptBuffer.wait() instruction, which in turn causes YARP to complain about a growing queue of (unprocessed) messages. We believe that this semaphore might not be properly released on the first run of examplePositionDirect (relevant lines). Also, note that nothing prevents us from calling post() more than once per wait() call, that is, the internal counter of the semaphore may exceed the value 1. In the log file previously attached, it's worth noting that the "pt buffer full" message is not symmetric to "pt buffer empty" (just 3 occurrences of the latter on the second run) and relates to IDs 24/25/26, mostly.
This is highly undesirable and leads to undefined behavior, to say the least. I would rethink the whole buffer full/empty handling mechanism. See https://github.com/roboticslab-uc3m/teo-bimanipulation/issues/3#issuecomment-444611272.
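The post()-more-than-once concern can be illustrated with a generic counting semaphore. The sketch below uses Python's standard threading module as a stand-in (release/acquire play the role of post/wait); it says nothing about how the YARP semaphore is used in the device, only how an unbounded counter behaves compared to a bounded one.

```python
import threading

# Plain counting semaphore: two release() calls in a row are silently counted.
plain = threading.Semaphore(0)
plain.release()
plain.release()
extra_permits = 0
while plain.acquire(blocking=False):
    extra_permits += 1
print("permits accumulated:", extra_permits)

# Bounded variant: releasing beyond the initial value raises an error instead.
bounded = threading.BoundedSemaphore(1)
bounded.acquire()
bounded.release()
try:
    bounded.release()
    over_release_detected = False
except ValueError:
    over_release_detected = True
print("over-release detected:", over_release_detected)
```

If the buffer full/empty handling is reworked, a bounded or binary primitive (or a condition variable guarding an explicit state flag) would at least make unbalanced post() calls visible.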
Regarding
and 9.2.3. Object 2072h: Interpolated position mode status:
Remark: when a status bit changes from this object, an emergency message with the code 0xFF01 will be generated. This emergency message will have mapped object 2072h data onto bytes 3 and 4.
Shouldn't the "pt buffer empty" check for value 80h, instead? That is, when bit 15 is set: "Buffer is empty – there is no point in the buffer."
Edit: this actually might explain why we keep seeing "pt buffer empty" messages next to a "pt buffer full", which makes little sense (the buffer should never deplete so fast). Also, I believe the PT examples in the iPOS manual are wrong regarding the buffer low warning: note that we never enable bit 7 in 2074h (section 9.2.5), and neither does the example. This would imply that: 1. perhaps buffer low's default is 00h; 2. the MSB in 2072h is almost always zero (does it make sense, shouldn't that produce constant buffer-empty messages?).
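A minimal sketch of the proposed check, assuming the two bytes carrying object 2072h arrive in little-endian order (low byte first) and that only bit 15 is of interest; the function names are hypothetical. It shows why testing the high byte against 80h is equivalent to testing bit 15 of the 16-bit value.

```python
BUFFER_EMPTY = 1 << 15  # bit 15 of object 2072h, as quoted from the manual


def pt_status_from_bytes(low_byte, high_byte):
    """Assemble the 16-bit 2072h value (little-endian order is an assumption)."""
    return low_byte | (high_byte << 8)


def is_buffer_empty(low_byte, high_byte):
    """True when bit 15 is set, i.e. when the high byte has its 80h bit set."""
    return bool(pt_status_from_bytes(low_byte, high_byte) & BUFFER_EMPTY)


print(is_buffer_empty(0x00, 0x80))  # high byte 80h: buffer empty
print(is_buffer_empty(0x80, 0x00))  # 80h in the low byte is a different bit
```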
Attaching the oldest code I've found: locomotion.zip (from private repo which has some interesting stuff)
This code set the buffer-low signaling value to 4; buffer-empty was never checked against. There was an attempt to change the default buffer size (full) to 10792 (2A28h), which was ultimately commented out.
Considering that the buffer-empty state might not be currently reached in normal PT-mode operation (see previous comment), we don't know whether intermittent depletion of the buffer could entail undesirable effects (@jgvictores).
Test with oneCanBusOneWrapper, prints the current changes of branch https://github.com/roboticslab-uc3m/yarp-devices/tree/verbose-posd with the position direct example.
File: OneCanBusOneWrapper-1004.txt.zip
Shouldn't the "pt buffer empty" check look for value 80h, instead? That is, bit 15 set: "Buffer is empty – there is no point in the buffer."
Well, I can confirm the buffer is definitely not empty until the very end of the motion sequence. The buffer-low bit is set, too, in this case. After the last setPosition() call, and around 6-7 cycles later, a single buffer-low message is issued, then buffer-empty steps in. Also, on the second (failed) run, the semaphore gets stuck right after signalling a buffer-full state.
Upon reception, each PT point is stored in a reception buffer. The reference generator empties the buffer as the PT points are executed. The drive/motor automatically sends warning messages when the buffer is full, low or empty. The buffer full condition occurs when the number of PT points in the buffer is equal with the buffer size. The buffer low condition occurs when the number of PT points in the buffer is less or equal with a programmable value. The buffer empty condition occurs when the buffer is empty and the execution of the last PT point is over.
The PT buffer size is programmable and if needed can be substantially increased. By default it is set to 7 PT points.
Good comment. I don't remember if I talked about that with you or only with @rsantos88 . Using these messages would be a good method for controlling movement and avoiding interruptions.
Ideas:
Use an intermediate buffer on the server (device) side in order to decouple receive-from-yarp-network and send-to-ipos-drive operations. Alternatively, increase substantially the internal buffer size. Buffer state management is needed, anyway, so that the motion start signal is sent at the right time. This buffer would make sense in an offline trajectory scenario, that is, an external application might send lots of target points in large bursts, even hundreds or thousands at once, as long as YARP comms can handle that. In this manner, the latency of both YARP and CAN networks is entirely avoided. However, I would strive to avoid buffering altogether when processing online trajectories. Keep in mind that we feed this internal buffer with points and don't execute the movement until it's empty - that is, we are introducing a rather big delay between the instant the first target is meant to be executed and the actual start of the motion (buffer_size*T).
Either enforce a PT interval (currently 50 milliseconds) that external apps should download and adjust interpolation input data to (online/offline), or process this data as it goes in said app and defer proper interpolation to the drive layer, i.e. let the TechnosoftIpos device choose the right PT interval. The former feels like reinventing the wheel and moves the burden of sending the exact points in the right time intervals to the client app. The latter forces our TechnosoftIpos to process input targets prior to forwarding them to the iPOS device via CAN, but avoids duplication (this task is accomplished in one place, external apps don't need to worry about the exact protocol).
Perhaps other modes of operation may prove more suitable for our needs than PT (5.2.4. Object 6060h: Modes of Operation):
External Reference Position Mode: perhaps?
Tested today, we learned that its behavior closely resembles our IPositionDirect interface implementation in OpenRAVE: try to move as fast as possible to the demanded position. We did not manage to make the speed limit work.
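To put a number on the start-up delay mentioned in the first idea (buffer_size*T), here is a trivial worked example. It plugs in the default buffer size of 7 points quoted from the manual and the current 50 ms PT interval; the function name is only illustrative.

```python
def startup_delay_ms(buffer_size, pt_interval_ms):
    """Delay between the first target and the actual start of the motion."""
    return buffer_size * pt_interval_ms


print(startup_delay_ms(7, 50))  # 350
```

That is roughly a third of a second with the defaults, and it grows linearly if the internal buffer is enlarged.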
Useful robotology PDFs regarding control mode implementation in the iCUB robot: doc (slides). Source: http://wiki.icub.org/wiki/Control_Modes.
@PeterBowman Thanks! Added to https://github.com/robotology/yarp/issues/1105
I reached out to the Technosoft support team and learned that:
The Technosoft iPOS CANopen Programming User Manual (ref. P091.063.iPOS.STO.UM.0117, 2017) adds a Cyclic Synchronous Position mode (CSP) in section 10:
With this mode, the trajectory generator is located in the control device, not in the drive device. In cyclic synchronous manner, it provides a target position to the drive device, which performs position control, velocity control and torque control. Measured by sensors, the drive provides actual values for position, velocity and torque to the control device.
Section 9.2.1, Object 60C0h: Interpolation sub mode select (in Interpolated Position Mode), states that PT and PVT submodes should be regarded as legacy stuff. The preferred submode is introduced as "linear interpolation":
Linear interpolation as described in the CiA 402 standard (when object 208Eh bit8=1); This mode is almost identical with Cyclic Synchronous Position mode, only that it receives its position data into 60C1h sub-index 01 instead of object 607Ah. No interpolation point buffer will be used.
From About This Manual:
The iPOS drives are conforming to CiA 301 v4.2 application layer and communication profile, CiA WD 305 v.2.2.131 Layer Setting Services and to CiA (DSP) 402 v4.0 device profile for drives and motion control, now included in IEC 61800-7-1 Annex A, IEC 61800-7-201 and IEC 61800-7-301 standards.
P091.063.iPOS.UM.0615 (2015) states that:
The iPOS drives are confirming to CiA 301 v4.2 application layer and communication profile, CiA WD 305 v.2.2.131 Layer Setting Services and to CiA DSP 402 v3.0 device profile for drives and motion control (...)
So, no CSP mode in CiA DSP 402 v3.0. We'd probably need to perform a firmware update on our drives in order to use this mode.
Interestingly, from section 10.2.1, Object 60C2h: Interpolation time period:
The Interpolation time period indicates the configured interpolation cycle time. Its value must be set with the time value of the CANopen master communication cycle time and sync time in order for the Cyclic Synchronous Position mode to work properly.
Remark: due to the limitations of the CAN network, it is recommended that the interpolation time period should not be set lower than 4 ms.
Besides, in 10.1.1, Controlword in Cyclic Synchronous Position mode (CSP):
In Relative position mode, the drive will add to its current position the value received in object 607Ah. By sending this value periodically and setting the correct interpolation period time in object 60C2h, it will be like working in Cyclic Synchronous Velocity mode (CSV).
Found by @jgvictores at https://www.technosoftmotion.com/en/features (beware: 2006):
Position and speed control with 100 µs update rate - for very high dynamic applications a new control strategy has been added. This performs the position and speed control with an update rate of 100 µs, and does linear interpolation between the reference points provided by the reference generator at each 1ms.
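As a rough illustration of what linear interpolation between consecutive reference points means here, the sketch below splits each reference interval into equally spaced set points. The ratio of ten control updates per reference point is an assumption picked for the example, and the function is hypothetical (the drive does this internally).

```python
def interpolate(p0, p1, substeps=10):
    """Linearly spaced set points from p0 (exclusive) to p1 (inclusive)."""
    return [p0 + (p1 - p0) * k / substeps for k in range(1, substeps + 1)]


# Two consecutive reference points, five intermediate set points for brevity.
print(interpolate(0.0, 10.0, substeps=5))
```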
As I understand it, the external reference position mode is stupid in that it doesn't perform any control (position, velocity, current) nor interpolation on the drive side. Position profile mode places the burden of control and trajectory generation on the drive. There is also a need for interpolation (i.e. trajectory generation) on legacy PT/PVT modes, plus an input buffer is supported. Control is performed by the drive in CSP mode, but the application (CAN master) is now responsible for the trajectory/interpolation step.
Perhaps we want CSP mode instead of external reference position (and CSV instead of external reference speed). My hunch is that CSP/CSV is harder to master - according to the examples, we'd be forced to send messages at the precise rate (regarded both as the interpolation time and the SYNC signal) from the CAN master to the drives. However, YARP comms are involved in this scenario - in our case, trajectory generation occurs outside the robot.
Another remark: we like PT/PVT since it resembles RT behavior. Yes, but this is only a concern of single joints - equally time-spaced points are going to be executed per joint, but synchronization across joints is not guaranteed. Perhaps the SYNC signal could be involved in this task.
Current plan:
- Repurpose the IPositionDirect implementation, command the iPOS drives in the external reference position mode, take care of setting the speed limitation.
- Enable PVT interpolation via IRemoteVariables:
  - VOCAB_CM_MIXED a.k.a. mixed mode (iCub reference).
  - IRemoteVariables::setRemoteVariable.
  - IPositionControl::checkMotionDone and ::stop (https://github.com/roboticslab-uc3m/yarp-devices/issues/120) in this mode.
Side quests:
- BasicCartesianControl::movl (and others).
- Implement offline trajectories in YarpOpenraveControlboard.
Yes, I guess we may end up with very similar code on that side. Cannot find good alternative.
We did not manage to make the speed limit work.
Fixed. YARP's position-direct mode is now implemented with the iPOS-specific external reference position mode, and was tested successfully today. WIP in the nuke-posd branch, commit https://github.com/roboticslab-uc3m/yarp-devices/commit/622dbf0de971f06f9d154ebe3732bb2ee264cdcb might be merged into develop soon.
https://github.com/roboticslab-uc3m/yarp-devices/issues/198#issuecomment-487386797
Repurpose the IPositionDirect implementation, command the iPOS drives in the external reference position mode, take care of setting the speed limitation.
Done, awaiting some final tests.
Enable PVT interpolation via IRemoteVariables:
Split into https://github.com/roboticslab-uc3m/yarp-devices/issues/208. This issue only focuses on posd mode, I've updated the description accordingly.
Done, awaiting some final tests.
For the sake of caution, no tests shall be performed until - and including - next Thursday, May 9th (@rsantos88). A video recording will be held on that date.
After that, I'd like to test extensively our new online trajectory execution with some kind of joystick (see https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173). Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).
After that, I'd like to test extensively our new online trajectory execution with some kind of joystick
Ran into this: "No response to user commands after the application is relaunched." (https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173#issuecomment-493443729).
Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).
The CAN master would need to take into account the frequency at which commands are sent. This statement does not square well with my reasoning about velocity checks at https://github.com/roboticslab-uc3m/kinematics-dynamics/issues/173#issuecomment-493191279, so I'd need to take another look at it.
New posd implementation merged into develop at https://github.com/roboticslab-uc3m/yarp-devices/commit/36f8fad94ea34c5046491540d3361d7e84e73de8, not fully ready yet (see previous comment).
Found by @jgvictores at https://www.technosoftmotion.com/en/features (beware: 2006):
Position and speed control with 100 µs update rate - for very high dynamic applications a new control strategy has been added. This performs the position and speed control with an update rate of 100 µs, and does linear interpolation between the reference points provided by the reference generator at each 1ms.
According to https://github.com/roboticslab-uc3m/yarp-devices/issues/210#issuecomment-493665173, the sampling period for fast loop and slow loop control is 0.1 and 1 ms, respectively.
Idea: dynamically adjust the maximum speed limitation on the driver side (could be done by the client, but I'd rather do this over the CAN network rather than via Ethernet).
Also, blocked by https://github.com/roboticslab-uc3m/yarp-devices/issues/188.
Even if the reference speed is updated while in external reference position mode, its new values are ignored by the drive upon trying to reach the next targets. Also, this mode is sometimes unresponsive upon pos->posd->pos->posd, not so often when running the examplePositionDirect app.
Considering switching back to PT mode for online trajectories, leaving PVT aside for offline stuff.
Implemented and tested. I'm working on this along with https://github.com/roboticslab-uc3m/yarp-devices/issues/208 since this solution uses the IRemoteVariables YARP interface to pass a ptModeMs parameter over the network: <0 means "not in position direct mode" (default), 0 means "use external reference position mode", >0 means "use PT interpolation mode with this period (in milliseconds)".
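The three ranges of ptModeMs described above can be summarized in a small sketch. This only mirrors the convention stated in this comment; the function name and return strings are hypothetical and do not come from the device code.

```python
def describe_pt_mode(pt_mode_ms):
    """Map a ptModeMs value to the behavior described in this comment."""
    if pt_mode_ms < 0:
        return "not in position direct mode"
    if pt_mode_ms == 0:
        return "external reference position mode"
    return f"PT interpolation mode, period {pt_mode_ms} ms"


for value in (-1, 0, 50):
    print(value, "->", describe_pt_mode(value))
```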
I noticed that device routing in the ControlBoardWrapper layer is kinda messy regarding how IRemoteVariables methods are interpreting incoming/outgoing bottles, see source code. For instance, we'd need to build a two-element bottle if we ever want to send a remote variable to the head part, one element per CanBusControlboard instance. This won't affect both arms and legs since there is a one-to-one mapping (controlboardwrapper2-to-CanBusControlboard), but we are still forced to wrap any data sent in an additional bottle (e.g. (50) instead of just 50 for ptModeMs).
In addition, this behavior introduces an asymmetry in that said bottle would need to be constructed in a slightly different manner in case the CanBusControlboard device is not attached to a network wrapper.
Recent tests showed that online trajectories performed in the new position direct mode, (re)implemented as PT interpolation, suddenly stop roughly at the 150th-200th commanded target. The drives had turned unresponsive after that point: no lines received in the CAN read stream (interpretMessage) despite reporting apparently successful status queries. A close inspection in EasyMotion did not reveal much - I could only learn that all four position trigger bits in object 1002h (Manufacturer Status Register) had been toggled to zero. This behavior was observed in all drives of the right arm (firmware F508D) except ID 19 (firmware F508F).
Everything worked like a charm after I commented out the bits that set the PT buffer length to its maximum value as detailed in https://github.com/roboticslab-uc3m/yarp-devices/issues/198#issuecomment-487279910. This buffer has never been modified in that way before, even if we thought it was modified by us to store 15 points by the following lines:
This has been fixed in a recent revision of the iPOS CAN user manual: should have read object 2073h (Interpolated position buffer length) instead of 2074h (Interpolated position buffer configuration).
Reminder:
When I do pos->posd->pos->posd, it doesn't change to PT mode and the launchManipulation shows:
[debug] IRemoteVariablesImpl.cpp:50 setRemoteVariable(): ptModeMs
[debug] IRemoteVariablesRawImpl.cpp:37 setRemoteVariableRaw(): ptModeMs
[warning] IRemoteVariablesRawImpl.cpp:45 setRemoteVariableRaw(): Illegal state, unable to change variable when currently used.
[warning] IRemoteVariablesImpl.cpp:58 setRemoteVariable(): Unable to set remote variable on node 0.
Could you upload the latest code? I don't see the changes we had to introduce in this method. There is a possibility the CAN network has nothing to do here, perhaps a state variable is not being correctly cleared on the iPOS-YARP device side, or some tweaks need to be made in your app.
Please run your application twice, make sure this error happens on the second run, and paste here the complete logs of launchManipulation.
Here you have the output information of launchManipulation: launchManipulation-pvt-test-3005.txt.zip
And the output of my app: output.txt.
Identified at https://github.com/roboticslab-uc3m/yarp-devices/issues/211#issuecomment-497280755. Suggested workaround: edit launchManipulation.ini, remove head joints.
Current (undesired/faulty) behavior:
- Change modes IPositionControl -> IPositionDirect -> IPositionControl and try to positionMove(): need to send this command twice, no response at first try.
- Change modes and move to a certain position after each step, IPositionControl -> IPositionDirect -> IPositionControl -> IPositionDirect. We won't be able to start the PT mode since ptPointCounter is never set to zero: ref.