KavrakiLab / vamp

SIMD-Accelerated Sampling-based Motion Planning
Other
167 stars 22 forks source link

Implement RRT* #26

Open pranavjad opened 2 months ago

pranavjad commented 2 months ago

Implemented RRT* and added script "ao_benchmark.py" to test it.

claytonwramsey commented 2 months ago

This is very cool! I like it.

~/p/vamp (pranavjad/main|✚1) $ time taskset 0x1 python scripts/ao_benchmark.py --plot
    planning_time  planning_iterations  solved  planning_graph_size  \
0           13749                 3002    True                  244   
1           24544                 4002    True                  341   
2           36164                 5002    True                  431   
3           53677                 6002    True                  542   
4           74925                 7002    True                  645   
..            ...                  ...     ...                  ...   
92         325935                95002    True                12808   
93         437989                96002    True                12963   
94         504684                97002    True                13113   
95         595871                98002    True                13267   
96         688122                99002    True                13431   

    initial_path_vertices  initial_path_cost  simplification_time  \
0                       4           8.484445                    3   
1                       4           8.484445                    3   
2                       5           8.229109                  106   
3                       5           8.229109                  104   
4                       4           6.784612                    2   
..                    ...                ...                  ...   
92                      5           5.951281                    5   
93                      5           5.951281                    5   
94                      5           5.951281                    7   
95                      5           5.951281                    5   
96                      5           5.951281                    9   

    simplified_path_vertices  simplified_path_cost                total_time  \
0                          4              8.484445 0 days 00:00:00.013753515   
1                          4              8.484445 0 days 00:00:00.024547264   
2                         13              4.908672 0 days 00:00:00.036270648   
3                         13              4.908672 0 days 00:00:00.053782676   
4                          4              6.784612 0 days 00:00:00.074927377   
..                       ...                   ...                       ...   
92                         5              5.951281 0 days 00:00:07.325941004   
93                         5              5.951281 0 days 00:00:07.437994994   
94                         5              5.951281 0 days 00:00:07.504692931   
95                         5              5.951281 0 days 00:00:07.595876363   
96                         5              5.951281 0 days 00:00:07.688131493   

    iterations  
0         3001  
1         4001  
2         5001  
3         6001  
4         7001  
..         ...  
92       95001  
93       96001  
94       97001  
95       98001  
96       99001  

[97 rows x 11 columns]
QSocketNotifier: Can only be used with threads started with QThread

________________________________________________________
Executed in  311.24 secs    fish           external
   usr time  306.67 secs    0.00 micros  306.67 secs
   sys time    0.33 secs  451.00 micros    0.33 secs

plot

These are my timing results from running the AO benchmark (Intel i9-13900H × 20). It seems a little slow, but that might be a mirage (I just looked again and 3000 states in 13 ms is actually quite good for RRT*). I think it's possible to get some speedups somewhere and bring this into the microsecond range.

Maybe get a profile + flamegraph to see if there are any obvious hot spots.

Also, you may want to do Informed RRT* to reduce the number of unnecessary samples generated.

Otherwise, the code all looks good. I didn't see anything that seemed obviously slow or incorrect on a first pass, but it's likely possible to juice the timing numbers.

claytonwramsey commented 2 months ago

image It looks like planning times are being incorrectly reported in the final result due to truncation of seconds, so our planning times are much worse than previously thought.

In addition to informed sampling, we can also terminate on updating a child's cost once its cost + the heuristic cost exceeds the best cost so far (LPA* heuristic).

// loop through near neighbors, find the one that gives us min cost from root
std::vector<bool> collision_free(near.size());

maybe remove this possible allocation from the hot loop?

These are the only ideas I have so far for improving performance without any profiling data.

claytonwramsey commented 2 months ago

One last thing - since our runs of RRT* are deterministic, you might want to factor out a function for executing N more samples on an existing tree. Then we can call that instead of starting from scratch every time, which would at least make the benchmarking script more convenient.