Fix overflow issue with ActuatorState timestamp_ (ros ticket #5002)

ahendrix commented 11 years ago

There is the timestamp_ value in ActuatorState, that is based on the timestamp coming from the MCB.

{{{ class ActuatorState{ double timestamp_; //!< The time at which this actuator state was measured (in seconds) }; }}}

However, the code for generating this value will overflow.

{{{ bool WG0X::unpackState(unsigned char this_buffer, unsigned char prevbuffer) { ... state.timestamp = thisstatus->timestamp / 1e+6; ... } }}}

The timestamp is a 32 bit number that counts microseconds. This will overflow every 72 minutes. The timestamp will jump by -72minutes once every 72 min.

I could augment the timestamp to 64 bits and then convert to the timestamp to fix this issue.

{{{ bool WG0X::unpackState(unsigned char this_buffer, unsigned char prevbuffer) { timestamp64 += (thisstatus->timestamp - prevstatus->timestamp); ... state.timestamp = double(timestamp64) / 1e+6; ... }

}}}

trac data:

Owner: dking
Reporter: dking
Reported at: Fri Feb 4 17:43:47 2011
URL: https://code.ros.org/trac/wg-ros-pkg/ticket/5002

ahendrix commented 11 years ago

[dking] Email comment from Gunter

Looking at the transmissions, it gets even more fun. It takes the double timestamp, converts it into a ros::Time and then calculates dt back into a double...\

void PR2BeltCompensatorTransmission::propagatePosition( std::vector<pr2_hardwareinterface::Actuator>& as, std::vector<JointState>& js) { ros::Time time(as[0]->state.timestamp_); double dt = (time - lasttime).toSec(); ...

So the good thing is the transmission is actually using the best time stamp. The bad thing, after 72 minutes we will get a massive spike! So I think we need to do something... Let me see what folks say.

ahendrix commented 11 years ago

[dking] The current plan is to fix the overflow issue {{{double timestamp}}} and also to provide a ros::Time timestamp value. The ros::Time value should the preferred value to use, and the double is being left in for backwards compatibility.

Any idea's on a variable name for the ros::Time timestamp? (timestamp_ is already taken)

ahendrix commented 11 years ago

[gunter] Awesome. I like the solution. As for names, how about something explicit: encoder_sampletime ?

BTW, does the same overflow issue occur when calculating the encoder_velocity:

state.encodervelocity = double(int(this_status->encodercount - prev_status->encodercount)) / (thisstatus->timestamp - prevstatus->timestamp) * 1e+6;

ahendrix commented 11 years ago

[dking] {{{ state.encodervelocity = double(int(this_status->encodercount - prev_status->encodercount)) / (thisstatus->timestamp - prevstatus->timestamp) * 1e+6; }}}

I believe these work because each subtraction is done as integers and then differential position/time are converted to doubles before division is performed. {{{ int(this_status->encodercount - prev_status->encodercount) }}} {{{ (thisstatus->timestamp - prevstatus->timestamp) }}}

However, I will put this formula up into a separate function that I can run a unit test on with known inputs and expected outputs.

ahendrix commented 11 years ago

[dking] Replying to [comment:3 gunter]:

Awesome. I like the solution. As for names, how about something explicit: encoder_sampletime ?

BTW, does the same overflow issue occur when calculating the encoder_velocity:

state.encodervelocity = double(int(this_status->encodercount - prev_status->encodercount)) / (thisstatus->timestamp - prevstatus->timestamp) * 1e+6;

Maybe something more generic sounding than {{{encoder_sampletime}}}? Most/all realtime data from MCB is sampled at exact same time.

ahendrix commented 11 years ago

[dking] Fixed timestamp internally (r48822).

ahendrix commented 11 years ago

[dking] r48823 added {{{sampletimestamp}}} to pr2_hardware_interface. Stu, can you re-release pr2_mechanism to unstable/diamondback?

ahendrix commented 11 years ago

[gunter] Derek, thanks - this is great! Just for me to understand, with the sampletimestamp being a ros::Time, what is the "zero time"? Is it in the same units as ros::Time::now() or does it start at zero when the ethercat process starts? Which (I think) is what the old time_stamp did? Thanks.

ahendrix commented 11 years ago

[sglaser] I've released pr2_mechanism into unstable. Still not sure if I have "permission" to release into diamondback yet, but it is in unstable.

ahendrix commented 11 years ago

[dking] Replying to [comment:9 gunter]:

Derek, thanks - this is great! Just for me to understand, with the sampletimestamp being a ros::Time, what is the "zero time"? Is it in the same units as ros::Time::now() or does it start at zero when the ethercat process starts? Which (I think) is what the old time_stamp did? Thanks.

The old timestamp_ value was directly related to timestamp from MCB board, which started at 0 when MCB was powered up.

The new timestamp is based on Sum(Diff(timestamp)). The new timestamp will start at zero when pr2_etherCAT is started.

Both new and old ActuatorState::timestamp_ values are not connected to the {{{ros::Time::now()}}} or {{{robot->getTime()}}} value.

ahendrix commented 11 years ago

[dking] Replying to [comment:10 sglaser]:

I've released pr2_mechanism into unstable. Still not sure if I have "permission" to release into diamondback yet, but it is in unstable. Permission from Ken?

ahendrix commented 11 years ago

[sglaser] As far as I know, Ken's policy is still that we can release to unstable, but that he is the one to push things to diamondback. It still means that this change will eventually wind up in diamondback, I'm just not sure precisely when. For now it is released into unstable.

ahendrix commented 11 years ago

[hsu] Looks like setting the new timestamp to ros::Time::now() in the propagate*Backwards() calls is sufficient for simulation, right?

ahendrix commented 11 years ago

[gunter] By construction, the only place that uses the new time stamp is the belt transmission. And it only calculates a dt. So YES that should be sufficient.

That said, it does mean than in simulation the time_stamp has a very different value than in the real system. I'm worried that somewhere this could lead to inconsistency (and an annoying bug to find). How would it be to either

(a) in simulation set the time stamp to time since the "ethercat" was started? So that in both cases they are relative times.

(b) initialize the new (real) sampletimestamp to the time the first ethercat package was received? So that it reads an (approximate) absolute time?

ahendrix commented 11 years ago

[dking] Replying to [comment:15 gunter]:

By construction, the only place that uses the new time stamp is the belt transmission. And it only calculates a dt. So YES that should be sufficient.

That said, it does mean than in simulation the time_stamp has a very different value than in the real system. I'm worried that somewhere this could lead to inconsistency (and an annoying bug to find). How would it be to either

(a) in simulation set the time stamp to time since the "ethercat" was started? So that in both cases they are relative times.

(b) initialize the new (real) sampletimestamp to the time the first ethercat package was received? So that it reads an (approximate) absolute time? I think '''(b)''' would allow people to mix {{{sampletimestamp}}} and {{{ros::Time::now()}}}.

{{{ // Interpolate actuator position at current time using velocity and time // difference between current time and time the actuator data was collected double sample_delay = ros::Time::now() - actuator->sampletimestamp; double currentposition = actuator->position + actuator->velocity_ * sample_delay; }}}

I would argue that you want to avoid allowing this to happen. If the {{{sampletimestamp}}} never matches {{{ros::Time::now()}}}, then the above code would suck immediately instead of slowly degrading as clocks in computer and MCB drifted apart.

ahendrix commented 11 years ago

[gunter] Fair enough. Part of me thinks that a big aspect of ROS is the fact that all sensor data is time stamped according to a global clock, so that all data can be related to each other. But I appreciate that dealing with different clocks, drift, and synchronization would be a big head-ache for something we aren't even interested in doing.

Though can I ask a final question. If we agree that the sample_timestamp is relative to the start of pr2_ethercat, then should it be a ros::Duration vs. ros::Time? Would that make it more obvious and prevent some inappropriate usage, for example setting sampletimestamp = ros::Time::now() in Gazebo? Is it sample_time_sincestart ?

ahendrix commented 11 years ago

[gunter] Okay. I think this is done. I checked in (into trunk) both the pr2_mechanism and pr2_ethercat_hardware) stacks with the following changes:

a) wg0x uses ros::Duration instead of ros::Time b) hardware_interface defines ActuatorState with a ros::Duration sampletimestamp c) the belt transmission uses this new timestamp d) all transmissions reset both the new and old timestamps.

I ran this on PRL - things seemed to be good. John, could you double-check Gazebo? Stu, could you do the release magic? Thanks all around!

-gunter

ahendrix commented 11 years ago

[sglaser] I just released pr2_mechanism 1.4.4, which contains the sample timestamp in the actuator message. Derek, you should be able to update and release pr2_ethercat_hardware now.

ahendrix commented 11 years ago

[hsu] Gunter, gazebo tested normal. Thanks for the updates.

ahendrix commented 11 years ago

[dking] Ported Gunter's change (r48939) from trunk to branches-1.4 (r48991)

ahendrix commented 11 years ago

[dking] Released pr2_ethercat_drivers 1.4.1 (2010-02-12).

PR2 / pr2_ethercat_drivers

Fix overflow issue with ActuatorState timestamp_ (ros ticket #5002) #51