magazino / move_base_flex

Move Base Flex: a backwards-compatible replacement for move_base
https://uos.github.io/mbf_docs/
BSD 3-Clause "New" or "Revised" License
424 stars 154 forks source link

Use ROS rates for sleeping to support simulation time #257

Closed Timple closed 3 years ago

Timple commented 3 years ago

Solves #167

At the moment non-functional. But if you like the approach and don't see any obstacles I'll finish this implementation.

corot commented 3 years ago

Yup, make sense to use ROS Rate, as long as we can interrupt the sleep stopping the thread (currently all calls to boost::this_thread::sleep_for are potential interruption points).

Timple commented 3 years ago

Under the hood, ROS sleeps use nanosleep. Not sure the interrupting part works the same.

But the sleep is the last statement in the loop. So no code is executed in case the sleep is not interrupted. And the next cycle of course catches the interruption. Worst-case this only ads a slight delay of once cycle of doing nothing right?

corot commented 3 years ago

yes,,, but that's already a bad thing :thinking: can u make a fast experiment to see if sleep gets interrupted?

On Wed, Jan 27, 2021 at 4:37 PM Tim Clephas notifications@github.com wrote:

Under the hood, ROS sleeps use nanosleep. Not sure the interrupting part works the same.

But the sleep is the last statement in the loop. So no code is executed in case the sleep is not interrupted. And the next cycle of course catches the interruption. Worst-case this only ads a slight delay of once cycle of doing nothing right?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/magazino/move_base_flex/pull/257#issuecomment-768097276, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACOYMRLNND664EYNAEXSMTS367BXANCNFSM4WTT2XOA .

Timple commented 3 years ago

can u make a fast experiment to see if sleep gets interrupted?

http://coliru.stacked-crooked.com/a/a8fc934550deca50

The nanosleeps that ROS uses are also interrupted :slightly_smiling_face:

dorezyuk commented 3 years ago

can u make a fast experiment to see if sleep gets interrupted?

http://coliru.stacked-crooked.com/a/a8fc934550deca50

The nanosleeps that ROS uses are also interrupted slightly_smiling_face

I think the code you've posted shows that nanosleep is not an interruption point. To make it better visible you should take the end_time after calling join and increase the sleep-duration passed to nanosleep.

Also this discussion on stackoverflow suggests that nanosleep is not an interruption point

dorezyuk commented 3 years ago

int main() {

    boost::thread t([]{
        try {
            // boost::this_thread::sleep_for(boost::chrono::milliseconds(500));
            nanosleep((const struct timespec[]){{2, 500000L}}, NULL);
        }

        catch (...) { std::cout << "Interrupted\n" << std::endl; }
    });
    auto start_time = boost::chrono::steady_clock::now();
    boost::this_thread::sleep_for(boost::chrono::milliseconds(100));
    t.interrupt();
    t.join();
    auto end_time = boost::chrono::steady_clock::now();
    std::cout << "Ran for: " << boost::chrono::duration_cast<boost::chrono::seconds>(end_time - start_time)<< "\n";
}

Returns Ran for: 2 seconds

Timple commented 3 years ago

Ah, obviously I needed to put the end_time after t.join(). My bad.


So the ROS time doesn't interrupt. Would it make sense to keep the original sleep logic (albeit a duplicate of ros::rate). And replace the instances of boost::chrono::thread_clock::now(); by ros::Time::now();?

spuetz commented 3 years ago

Would it make sense to keep the original sleep logic (albeit a duplicate of ros::rate). And replace the instances of boost::chrono::thread_clock::now(); by ros::Time::now();?

Sounds reasonable. You could also just use the internal time value from ros::Time::now(), or?

Timple commented 3 years ago

ros::Time::now() already adheres to walltime or simulation time depending on what's active.

dorezyuk commented 3 years ago

Would it make sense to keep the original sleep logic (albeit a duplicate of ros::rate). And replace the instances of boost::chrono::thread_clock::now(); by ros::Time::now();?

Sounds reasonable. You could also just use the internal time value from ros::Time::now(), or?

I think swapping boost::system_clock for ros::Time will not work, since boost::this_thread::sleep_for will always sleep in system time. You would need then to scale the sleep_time somehow.

We might do something ugly like

void mbf_sleep(const ros::Duration& _dur) {
 const ros::Time end = ros::Time::now() + _dur;
 while(ros::Time::now() < end) {
    // this is now an interruption point
    boost::this_thread::sleep_for(boost::chrono::milliseconds(10));
 }
}

But thats very close to busy waiting

spuetz commented 3 years ago

It should be possible to compute the time delta and estimate the sleep duration in the time system of choice.

Timple commented 3 years ago

What goes wrong if we do not interrupt?

  1. The publishZero() is delayed by max 1/control_freq

Anything else?

spuetz commented 3 years ago

The question should be what goes wrong if we interrupt and nothings happens, as in the most cases anyway, since there are most likely no interruption points in the respective controller plugins and the duration after executing the plugin to to the end of the loop is negligible short. The important thing here is more that we could interrupt plugins / beside cancelling, meaning that there could be interruption points in the plugin. So from my point of view it is fine if the sleep duration is small enough.

dorezyuk commented 3 years ago

It should be possible to compute the time delta and estimate the sleep duration in the time system of choice.

The issue I see here is that the time-scaling may vary from loop to loop. Imagine you run Gazebo and press pause for example. In this particular case the time "freezes" (which would make us sleep for a long time) and its impossible to predict when it will resume ticking

spuetz commented 3 years ago

The issue I see here is that the time-scaling may vary from loop to loop. Imagine you run Gazebo and press pause for example. In this particular case the time "freezes" (which would make us sleep for a long time) and its impossible to predict when it will resume ticking

Ok, sure the pause is a special case, but computing the sleep duration assuming a linear time scaling should be doable. The pause could be detected by a time delta which equals zero, right?

Anyway. I suggest to ignore the interruption here, as currently it is unused in all cases and the more important thing is interrupting plugins, which is still possible.

Timple commented 3 years ago

It was functional in simulation before the last commit. I assume it still is.

I can run some tests tomorrow!

Timple commented 3 years ago

The dynamic reconfigure of the frequency seems to be broken. When I change it from 20hz to 100hz I get the following message:

Calculation needs too much time to stay in the moving frequency! (0.0400s > 0.0100s)

So somehow the rate doesn't change but the reported expected rate does... I tried re-initialising the class with std::make_unique but that results in the same behavior. Any suggestions here?

spuetz commented 3 years ago

Hmmm 20Hz should be 0.05sec. How often does the message show up? Line 91 should to the trick loop_rate_ = ros::Rate(frequency); I'm wondering why it does not.

So maybe 0.04 sec is what the plugin needs?

Timple commented 3 years ago

Ah yes, that was it. I thought it was exactly the frequency.

0.04 is exactly the discretisation step used in my current simulation. So obviously everything takes at least that long...

I'm good, it's working now!

corot commented 3 years ago

thanks! :+1:

Timple commented 3 years ago

Thank you for the fast review cycles. Didn't expect to get so much traction :+1:

Timple commented 3 years ago

All is operational for two weeks, seems to work :+1: Would this warrant a new release? :angel:

spuetz commented 3 years ago

Sure, end of the week I'll make a new version. Cheers.