Owen-Liuyuxuan / visualDet3D

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection
https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/
Apache License 2.0
361 stars 76 forks

Question about inference time #49

Closed LeonChaoHi closed 2 years ago

LeonChaoHi commented 2 years ago

Hi, thanks for your code first!

I have a question about the running time. When I ran the test script and measured the inference time, it was about 0.18 s per frame, much slower than the 0.08 s reported in the YOLOStereo3D paper. The GPU I use is a TITAN X (CUDA version 9.0), which I guess has similar performance to a 1080 Ti.
The way I test is simply recording the start and end times and then computing the average per-frame time:

# visualDet3D/networks/pipelines/evaluators.py / evaluate_kitti_obj()
import time
from tqdm import tqdm

# inference on each sample in the validation dataset
time_start = time.time()
for index in tqdm(range(len(dataset_val))):
    test_one(cfg, index, dataset_val, model, test_func, backprojector, projector, result_path)
time_end = time.time()
print('total inference time: ', time_end - time_start, ' s.')
print('average time: ', (time_end - time_start) / len(dataset_val))

I wonder if the way I test is correct. How did you measure the running time? Could you please share the scripts you used? Or is it just that my device is slower?

Appreciate it!

Owen-Liuyuxuan commented 2 years ago

Taking an index from the dataset actually takes a lot of time. From what I have tested on multiple servers, CPU speed and memory speed really matter.
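One way to verify this is to time the data-loading step and the model forward pass separately instead of timing the whole loop. Below is a minimal sketch of that idea; `load_sample` and `run_model` are hypothetical stand-ins for `dataset[index]` and the network forward pass, and with a real GPU model you would also call `torch.cuda.synchronize()` before reading the clock so that asynchronous CUDA kernels are not miscounted:

```python
import time

def load_sample(index):
    # stand-in for dataset[index]: CPU/memory-bound preprocessing
    return [float(index)] * 1000

def run_model(sample):
    # stand-in for the network forward pass
    return sum(sample)

def profile(num_samples=100):
    """Accumulate data-loading and inference time separately."""
    load_time = 0.0
    infer_time = 0.0
    for i in range(num_samples):
        t0 = time.perf_counter()
        sample = load_sample(i)       # data loading
        t1 = time.perf_counter()
        run_model(sample)             # inference
        t2 = time.perf_counter()
        load_time += t1 - t0
        infer_time += t2 - t1
    return load_time / num_samples, infer_time / num_samples

avg_load, avg_infer = profile()
print(f"avg load: {avg_load:.6f} s, avg inference: {avg_infer:.6f} s")
```

If the load term dominates on your machine, the gap to the paper's 0.08 s per frame is explained by CPU/memory speed rather than by the GPU.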

The GIF in the stereo markdown was actually first recorded in real time with pygame visualization (at 10 Hz+).

LeonChaoHi commented 2 years ago

Thanks for answering! I think that's the point, since the CPU and memory of my server are not that good.