Code optimization - Githubissues

chaupc commented 5 years ago

Dear Davide:

Sorry to bother. I am doing some research on comparing the searching time on the shortest path problem base on grid map. I downloaded your codes and it can run without any problem. May I know that anything that I can do in order to optimize the code even more? And do you know that what is the fastest searching algorithm so far? So that I can test.

Thanks & B. Regards Patrick

chataign commented 5 years ago

Hi Patrick,

As it happens I was just looking at optimizations for this code today. The most obvious thing is to compile the library using the -O3 or -Ofast flags: set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Ofast -march=native -Wall")

In my tests I found that this alone improved the performance dramatically, I'd be very interested if you could test it also to get a benchmark. You can find some good information here: https://stackoverflow.com/questions/14492436/g-optimization-beyond-o3-ofast

I have also been looking at some code optimizations, but I haven't yet seen a significant improvement. I'll keep trying and I also want to test the effect of the optimization flags on Windows 10.

Regards, Francois

On Fri, Oct 4, 2019 at 3:53 PM chaupc notifications@github.com wrote:

Dear Davide:

Sorry to bother. I am doing some research on comparing the searching time on the shortest path problem base on grid map. I downloaded your codes and it can run without any problem. May I know that anything that I can do in order to optimize the code even more? And do you know that what is the fastest searching algorithm so far? So that I can test.

Thanks & B. Regards Patrick

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Eurecat/astar-gridmap-2d/issues/5?email_source=notifications&email_token=ADUKZ4QMMDX2DJGPE5RFK6TQM5DFZA5CNFSM4I5QDCVKYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4HPV255Q, or mute the thread https://github.com/notifications/unsubscribe-auth/ADUKZ4RRPWBQ7UZAPO2DKTLQM5DFZANCNFSM4I5QDCVA .

facontidavide commented 5 years ago

Optimizer flags are cheap tricks. No one can optimize my code, no one! I am the only one able to do it, buahahaha

(laughing maniacally)

But of course you can stop using A* and use something else ;)

chataign commented 5 years ago

I'm happy to admit that the code is nice as it is, so there isn't much room for optimization :) The optimizer flags seem to make a big difference though

chaupc commented 5 years ago

Hi Francois:

I just run the code again by setting the compiler flags that suggested, it really improves the performance a lot, please check the following results.

Before optimization: $ ./build/benchmark 2019-10-05 07:45:41 Running ./build/benchmark Run on (8 X 3800 MHz CPU s) CPU Caches: L1 Data 32K (x4) L1 Instruction 32K (x4) L2 Unified 256K (x4) L3 Unified 8192K (x1) Load Average: 0.04, 0.07, 0.14

Benchmark Time CPU Iterations

BM_AStar_Big 1911860219 ns 1909029534 ns 1 BM_AStar_Smooth_1000 1363989087 ns 1361915866 ns 1 BM_AStar_Small 5070272 ns 5063018 ns 133
After optimization: BM_AStar_Big 175500144 ns 175166215 ns 4 BM_AStar_Smooth_1000 123446411 ns 123190568 ns 6 BM_AStar_Small 382802 ns 382080 ns 1899

And in fact, I am writing my solution for the shortest path problem, pls check the following results:

Before optimization: BM_LS_Big 458208892 ns 457507908 ns 2 BM_LS_Smooth_1000 45481337 ns 45399325 ns 16 BM_LS_Small 2159268 ns 2153314 ns 330
After optimization BM_LS_Big 27097026 ns 26997710 ns 24 BM_LS_Smooth_1000 5558547 ns 5520143 ns 115 BM_LS_Small 192361 ns 191769 ns 3632

All the above tests are using the same image file.

Would you pls let me know where I can download some more files for testing, I tried google for the image files, some can run, while some can't. Interestingly, when it fails to run, both your code & mine are also failed.

Regards Patrick

chaupc commented 5 years ago

Hi Francois:

One thing forget to mention, my solution doesn't using the A* approach.

Regards Patrick

facontidavide commented 5 years ago

Tiny optimization here (15%), using the power of the CPU branch prediction instead of one level of indirection.

https://github.com/Eurecat/astar-gridmap-2d/pull/7

chaupc commented 5 years ago

Hi Francois & Davide:

I just found there is a competition about grid based path planning almost every year, different algorithms are used, details can be found from the website www.movingai.com, and results of 2014 can be from from the following URL:

https://www.cs.du.edu/~sturtevant/papers/GPPC-2014.pdf

Codes can be found from the following URLs:

https://github.com/maxrenke/gppc-2014.git https://github.com/nathansttt/hog2.git

I am integrating my algorithm to see the difference.

Regards Patrick

facontidavide commented 5 years ago

There is no way this vanilla A* can compare with those algorithms. It is not the purpose of this repository

chaupc commented 5 years ago

Hi Davide:

I totally understand that.

I am working for robot vacuum cleaner project, which using the STM32F0x series CPU, so I have to make sure my algorithm as effective as possible, that is why I am looking for some 3rd party's resources to compare with mine. Your repository is really great to me already, and I really appreciate for your works.

Regards Patrick

facontidavide commented 5 years ago

You probably need something MUCH more efficient than this, both in terms of memory usage (we use a lot of RAM) and CPU. I think you will eventually need something based on Jump Point Search (JPS). Anyway, I am happy if you find this useful. Let me know if you eventually compare the results of this implementation with GPPC-2014, I am curious :)

chaupc commented 5 years ago

Hi Davide:

Yes, you are right, the memory usage is another constrain.

I sent an email to Prof. Nathan and got the reply from him, he said he would provide a shell script to me which could generate the test result, I am waiting for that.

For what I tested so far, it looks like my algorithm is almost as efficient as BLJPS, but still much to optimize while compare will NSubgoal.

After I have the script from Prof. Nathan and have the results, I will let you know.

Regards Patrick

facontidavide commented 5 years ago

would you share your code? I will be happy to help you optimizing it

chaupc commented 5 years ago

Hi Davide:

Sure, after I receive the shell script from Prof. Nanthan, and have the preliminary test result.

Regards Patrick

facontidavide commented 5 years ago

Now that I read more about JPS algorithm, I think I will implement it for fun in a new repository. There is no way A* can compete and the fact that the path has less nodes is nice.

chaupc commented 5 years ago

Wah, maybe you will be one of the participant of the coming GPPC event. Where to find your repository, shared already? I think it is also worth to take a look at NSubgoal.

facontidavide commented 5 years ago

Since I am doing this for fun only, I guess I will focus on my own JPS++ only ;)

I have found this https://github.com/KumarRobotics/jps3d It is "only" twice as fast compared with my A* implementation, therefore I am pretty sure it can be optimized even further.

Once I start working on this, I will let you know.

For the record, I am not interested (personally) to any implementation that require a lot of pre-computing for each map.

Eurecat / astar-gridmap-2d

Code optimization #5

Before optimization: $ ./build/benchmark 2019-10-05 07:45:41 Running ./build/benchmark Run on (8 X 3800 MHz CPU s) CPU Caches: L1 Data 32K (x4) L1 Instruction 32K (x4) L2 Unified 256K (x4) L3 Unified 8192K (x1) Load Average: 0.04, 0.07, 0.14

Benchmark Time CPU Iterations