tongtybj / edgetpu_roscpp

Use Edge TPU (Coral) with ROS based on C++

Co-compiled models #2

Open tongtybj opened 4 years ago

tongtybj commented 4 years ago

Following the instructions in "Co-compiling multiple models", we can run multiple models on a single Edge TPU device:

To speed up performance when you continuously run multiple models on the same Edge TPU, the compiler supports co-compilation. Essentially, co-compiling your models allows multiple models to share the Edge TPU RAM to cache their parameter data together, eliminating the need to clear the cache each time you run a different model.

If some of the model data cannot fit into the Edge TPU RAM, then it must instead be fetched from the external memory at run time.

In this way, we do not have to clear the previous model from the cache when executing another model. This is a great benefit for cascaded inference, e.g., running a second detection inside the bounding box found by the first detection.
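To make the cache-sharing idea concrete, here is a minimal sketch of how a shared cache might be split between co-compiled models. It assumes a simple greedy, in-order allocation, which is a simplification and not necessarily what the Edge TPU compiler does. The function name `cachedBytes` and the sizes in the usage note are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split each model's parameter data between the shared
// on-chip cache and external (host) memory.
// Assumption: models claim cache space greedily, in the order given.
// Returns, per model, how many units of its parameters end up cached on-chip.
std::vector<std::size_t> cachedBytes(std::size_t cache_capacity,
                                     const std::vector<std::size_t>& model_sizes) {
  std::vector<std::size_t> cached;
  cached.reserve(model_sizes.size());
  std::size_t remaining = cache_capacity;
  for (std::size_t size : model_sizes) {
    // A model gets as much cache as is still free; the rest must be
    // fetched from external memory at run time.
    const std::size_t on_chip = std::min(size, remaining);
    cached.push_back(on_chip);
    remaining -= on_chip;
  }
  assert(cached.size() == model_sizes.size());
  return cached;
}
```

For example, with a made-up capacity of 8 units and two models of 6 and 4 units, `cachedBytes(8, {6, 4})` gives the first model all 6 units on-chip and the second only 2 of its 4, so half of the second model would be streamed from the host on every inference.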

A simple sample that co-compiles two identical models is available at https://github.com/tongtybj/edgetpu_roscpp/tree/co_compile_model

Please follow the README.md:

option: single object detection with co-compiled models (two identical SSD models):

$ roslaunch edgetpu_roscpp detection_with_cocompiled_modes.launch
$ roslaunch video_stream_opencv camera.launch video_stream_provider:=`rospack find edgetpu_roscpp`/test/data/DJI_0004.MP4 loop_videofile:=true
$ rqt_image_view /deep_object_detection/detection_result

tongtybj commented 4 years ago

The log looks like this:

[ INFO] [1580579234.999534422]: deep detection1 result:
---------------------------
drone
Score: 0.386719
Box: [1078.21, 405.347, 1389.11, 522.55] 
[ INFO] [1580579235.008327409]: deep detection2 result:
---------------------------
drone
Score: 0.386719
Box: [1078.21, 405.347, 1389.11, 522.55] 
[ WARN] [1580579235.008536109]: t1: 0.004350, t2: 0.008837
[ INFO] [1580579235.034062019]: deep detection1 result:
---------------------------
drone
Score: 0.5
Box: [1025.21, 400.258, 1397.59, 517.678] 
[ INFO] [1580579235.042483237]: deep detection2 result:
---------------------------
drone
Score: 0.5
Box: [1025.21, 400.258, 1397.59, 517.678] 
[ WARN] [1580579235.042629603]: t1: 0.004442, t2: 0.008363
[ INFO] [1580579235.067446356]: deep detection1 result:
---------------------------
drone
Score: 0.5
Box: [1045.94, 397.667, 1397.63, 517.347] 
[ INFO] [1580579235.075819296]: deep detection2 result:
---------------------------
drone
Score: 0.5
Box: [1045.94, 397.667, 1397.63, 517.347] 
[ WARN] [1580579235.075973564]: t1: 0.004498, t2: 0.008356
[ INFO] [1580579235.100358836]: deep detection1 result:
---------------------------
drone
Score: 0.5
Box: [1045.94, 393.881, 1397.63, 518.211] 
[ INFO] [1580579235.108709887]: deep detection2 result:
---------------------------
drone
Score: 0.5
Box: [1045.94, 393.881, 1397.63, 518.211] 
[ WARN] [1580579235.108833754]: t1: 0.004431, t2: 0.008295
[ INFO] [1580579235.134263420]: deep detection1 result:
---------------------------
drone
Score: 0.5
Box: [1037.36, 399.531, 1395.82, 512.561] 
[ INFO] [1580579235.142668382]: deep detection2 result:
---------------------------
drone
Score: 0.5
Box: [1037.36, 399.531, 1395.82, 512.561] 
[ WARN] [1580579235.142794926]: t1: 0.004452, t2: 0.008351

The average detection time with model1 (cached entirely in the Edge TPU RAM) is ~0.0044 sec, while the average detection time with model2 (half of which is stored in external memory on the host PC) is ~0.0083 sec, which is less than twice the model1 detection time.

The average detection time with model2 (~0.0083 sec) is relatively constant regardless of the host PC's specs, since it is mainly determined by the bus speed (bandwidth) of USB3.
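As a quick check of these averages, the sketch below recomputes them from the five `t1`/`t2` pairs in the WARN lines of the log. The helper names (`mean`, `slowdown`, `kT1`, `kT2`) are illustrative and not part of edgetpu_roscpp.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of timings, in seconds.
double mean(const std::vector<double>& samples) {
  assert(!samples.empty());
  return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// t1 / t2 values copied from the five WARN lines in the log.
const std::vector<double> kT1 = {0.004350, 0.004442, 0.004498, 0.004431, 0.004452};
const std::vector<double> kT2 = {0.008837, 0.008363, 0.008356, 0.008295, 0.008351};

// Ratio of the mean model2 time to the mean model1 time.
// A value below 2.0 means model2 costs less than twice model1.
double slowdown() { return mean(kT2) / mean(kT1); }
```

Over these five samples the ratio comes out at about 1.9. The slower first `t2` sample (0.008837) pulls the model2 mean slightly above the other four values.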

tongtybj commented 4 years ago

@fanshi14

This is our new hope! I will use this for cascaded detection: a first detection to extract the drone+ball bounding box, and a second detection of the ball within that bounding box. The total time to execute these two detection steps is < 0.013 sec, which is well within the frame period of the camera (~0.033 sec at 30 Hz).
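A rough sketch of the bookkeeping such a cascade needs: mapping a box found inside the cropped region back to full-image coordinates, and checking that both stages fit in one frame period. It assumes the log's `Box` values are `[x_min, y_min, x_max, y_max]` in pixels. The `Box` struct and both function names are hypothetical, not the edgetpu_roscpp API.

```cpp
#include <cassert>

// Assumed layout of a detection box: pixel coordinates of two corners.
struct Box {
  double x_min, y_min, x_max, y_max;
};

// Shift a box detected inside the cropped first-stage region (roi)
// back into full-image coordinates.
Box toFullImage(const Box& roi, const Box& in_roi) {
  assert(roi.x_max > roi.x_min && roi.y_max > roi.y_min);
  return {roi.x_min + in_roi.x_min, roi.y_min + in_roi.y_min,
          roi.x_min + in_roi.x_max, roi.y_min + in_roi.y_max};
}

// True if the two detection stages together take less than one frame period.
bool fitsFrameBudget(double t_first, double t_second, double fps) {
  assert(fps > 0.0);
  return t_first + t_second < 1.0 / fps;
}
```

With the timings measured above, `fitsFrameBudget(0.0044, 0.0083, 30.0)` returns true. Note that this ignores cropping, resizing, and any other per-frame processing, which would also count against the budget.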

fanshi14 commented 4 years ago

oh my holy xxxx, that would be amazing man! let us follow it!