Closed mpowelson closed 3 years ago
Starting off with a callgrind calltree of a motion planning pipeline. For this particular run the processes were
Main takeaway: We need to get rid of the environment clone in WaypointInCollision
@mpowelson I could not find where you posted the performance for the Tesseract clone, but the functions it was calling did not look correct, so I took a look. If you replace the clone method with the one below, does it perform better?
Tesseract::Ptr Tesseract::clone() const
{
  auto clone = std::make_shared<Tesseract>();
  clone->environment_ = environment_->clone();
  clone->manipulator_manager_->clone(clone->environment_);
  clone->init_info_ = init_info_;
  clone->initialized_ = initialized_;
  clone->find_tcp_cb_ = find_tcp_cb_;
  return clone;
}
Sorry, I posted it in the other (closed) thread about cloning the environment. I'll post it here to keep it all in one place. Also, I made this screenshot before I realized I needed to turn up the minimum % visible. So some of the functions won't be here if they are below 5%. I'll have to do some more benchmarking to get absolute performance, but I just thought you'd be interested to see how the manipulator manager weighed in the init function. In general, I've also been somewhat surprised how much time is spent in the URDF parser.
Ah, thank you. I believe the change above should improve the performance of the clone.
Yeah that does look like it should be faster. Before we were cloning the environment twice.
It sure was.
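To make the "cloning the environment twice" point concrete, here is a minimal sketch that counts clones. The `Env`, `Manager`, and `Scene` types and the `cloneTwice`/`cloneOnce` names are hypothetical stand-ins, not the real tesseract classes, and the "before" shape is only an assumption about how a double clone could arise, since the original implementation is not shown in this thread.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for illustration only; these are NOT the real
// tesseract classes, just enough structure to count environment clones.
int g_env_clones = 0;

struct Env
{
  std::shared_ptr<Env> clone() const
  {
    ++g_env_clones;  // record every environment copy
    return std::make_shared<Env>(*this);
  }
};

struct Manager
{
  std::shared_ptr<Env> env;

  // Copies the manager and binds the copy to an environment that the
  // caller has already cloned, so no extra environment copy is made here.
  std::shared_ptr<Manager> clone(std::shared_ptr<Env> cloned_env) const
  {
    auto copy = std::make_shared<Manager>(*this);
    copy->env = std::move(cloned_env);
    return copy;
  }
};

struct Scene
{
  std::shared_ptr<Env> env;
  std::shared_ptr<Manager> manager;

  // Assumed "before" shape: the environment is cloned once for the scene
  // and a second time for the manager.
  std::shared_ptr<Scene> cloneTwice() const
  {
    assert(env && manager);
    auto copy = std::make_shared<Scene>();
    copy->env = env->clone();
    copy->manager = manager->clone(env->clone());  // redundant second clone
    return copy;
  }

  // "After" shape: clone once and hand the same copy to the manager.
  std::shared_ptr<Scene> cloneOnce() const
  {
    assert(env && manager);
    auto copy = std::make_shared<Scene>();
    copy->env = env->clone();
    copy->manager = manager->clone(copy->env);
    return copy;
  }
};
```

Besides halving the number of environment copies, the single-clone shape also leaves the cloned manager pointing at the same environment instance as its owner, which the double-clone shape does not.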
I added a benchmark for cloning. It is not comprehensive; it only clones a Tesseract initialized with the lbr iiwa SRDF. I changed the clone to this:
Tesseract::Ptr Tesseract::clone() const
{
  auto clone = std::make_shared<Tesseract>();
  if (environment_)
    clone->environment_ = environment_->clone();
  if (clone->environment_)
    if (manipulator_manager_)
      clone->manipulator_manager_ = manipulator_manager_->clone(clone->environment_);
  clone->init_info_ = init_info_;
  clone->initialized_ = initialized_;
  clone->find_tcp_cb_ = find_tcp_cb_;
  return clone;
}
Before:
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_TESSERACT_CLONE/real_time    3560 us         3560 us          193
After:
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_TESSERACT_CLONE/real_time     235 us          235 us         2778
I added it to #409
That looks a lot better.
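For anyone who wants a quick number without pulling in a benchmark library, a rough timing loop can be sketched with `std::chrono`. This is not the benchmark added in #409 (the output above looks like it came from a benchmark framework); `Payload`, `meanCloneMicros`, and `speedup` are hypothetical names used only for illustration, and the payload is a stand-in for an object that is expensive to clone.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical payload standing in for an object that is expensive to clone.
struct Payload
{
  std::vector<double> data;
  std::shared_ptr<Payload> clone() const { return std::make_shared<Payload>(*this); }
};

// Mean wall-clock time per clone, in microseconds. Returns 0 for 0 iterations.
double meanCloneMicros(const Payload& source, std::size_t iterations)
{
  if (iterations == 0)
    return 0.0;

  using Clock = std::chrono::steady_clock;
  volatile std::size_t sink = 0;  // keeps the compiler from discarding the clones
  const auto start = Clock::now();
  for (std::size_t i = 0; i < iterations; ++i)
    sink = source.clone()->data.size();
  const auto stop = Clock::now();
  (void)sink;

  const std::chrono::duration<double, std::micro> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}

// Ratio of the old time to the new time; both must use the same unit.
double speedup(double before, double after)
{
  assert(after > 0.0);
  return before / after;
}
```

Feeding the two timings reported above into `speedup(3560.0, 235.0)` gives a ratio of roughly 15, which matches the iteration counts rising from 193 to 2778 for a similar measurement budget. A hand-rolled loop like this has no warm-up or statistical handling, so it is only good for a sanity check.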
I was thinking about trying out https://github.com/yse/easy_profiler for getting real-world timings of a motion planning pipeline. Any thoughts? Have you used anything else? This one looks simple enough to use. It claims to slow the program down by only 1-2%.
Looks good to me. I did add the ability to turn on profiling for just the taskflow, which outputs a JSON file in the temp directory to load in the profTaskflow tool. I thought this could be used throughout the code. Is it actively being maintained?
I forgot about the taskflow profiling. That might be useful to look at too.
Code frequency has definitely fallen off, but it was last released at the end of 2019. It may just be mature enough that it doesn't need much more development these days. That said, I just picked the top Google hit for "lightweight C++ profiler github".
I think "Get rid of the environment clone in WaypointInCollision" may have been already done, but I did a bit of cleanup in #444.
Using this thread to keep track of improvements found while profiling