huixiancheng / CENet

[ICME 2022] CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving

MIT License

100 stars, 13 forks

fps #14

Closed fengluodb closed 1 year ago

fengluodb commented 1 year ago

I am using your model in my project, but the FPS I measure differs from the one reported in your paper. With an input size of 512x64, I get:

********************************************************************************
Cleaning point-clouds with kNN post-processing
kNN parameters:
knn: 7
search: 7
sigma: 1.0
cutoff: 2.0
nclasses: 20
********************************************************************************
Infering in device:  cuda
100%|███████████████████████████████████████| 4071/4071 [02:04<00:00, 32.75it/s]
Mean CNN inference time:0.01585113       std:0.01862994
Mean KNN inference time:0.00275165       std:0.00063380
Total Frames: 4071
Finished Infering

The FPS is 67, lower than the 84.9 reported in your paper.

I ran inference on the validation set on an RTX 3090.

huixiancheng commented 1 year ago

Sorry, a long time has passed and I may have forgotten some details. Because of hardware differences, speed tests can vary widely.

For your question, here are some points to note:

I used FIDNet's code for benchmarking rather than this codebase, which accounts for some of the difference.
When running inference with this codebase, the auxiliary segmentation heads are not removed, which reduces speed.
Please warm up the GPU for a while before measuring speed. For example, start timing after the first 10 or 25 samples.
I used fp16 for inference, which on an RTX 3080 should be faster than fp32.
I averaged the time over only 100 samples, not over the whole validation set.
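
The warm-up and 100-sample averaging described above can be sketched as a small timing helper. This is a minimal illustration, not the code used for the paper: the names `benchmark`, `step`, `warmup`, `measured` and `sync` are hypothetical, and the defaults (25 warm-up samples, 100 measured samples) are just one reading of the numbers mentioned in this reply.

```python
import time
from statistics import mean, stdev


def benchmark(step, inputs, warmup=25, measured=100, sync=None):
    """Time step(x) per sample, discarding the first `warmup` samples.

    `sync` is an optional zero-argument callable that blocks until pending
    device work has finished (hypothetical hook; see the note below).
    """
    times = []
    for i, x in enumerate(inputs):
        if i >= warmup + measured:
            break
        if sync is not None:
            sync()  # finish earlier work before starting the clock
        start = time.perf_counter()
        step(x)
        if sync is not None:
            sync()  # wait for this sample's work before stopping the clock
        elapsed = time.perf_counter() - start
        if i >= warmup:  # warm-up samples are run but not recorded
            times.append(elapsed)
    if not times:
        raise ValueError("no samples were measured; check warmup and inputs")
    avg = mean(times)
    return {
        "n": len(times),
        "mean_s": avg,
        "std_s": stdev(times) if len(times) > 1 else 0.0,
        "fps": 1.0 / avg,
    }
```

For a PyTorch model on a GPU, one would presumably pass `torch.cuda.synchronize` as `sync`, run `step` under `torch.no_grad()`, and convert the model and inputs to half precision to reproduce the fp16 setting. Those choices are assumptions about a typical setup, not details confirmed in this thread.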

huixiancheng commented 1 year ago

This is the original data I recorded at the time; the code may have been lost. You will notice that the KNN is fast because I miscalculated the NLA time back then...

fengluodb commented 1 year ago

Thank you for your reply. Removing the auxiliary segmentation heads may make it faster. I will try when my GPU is free.

shawnding commented 7 months ago

Hi @fengluodb @huixiancheng ,

Thanks for the work! A minor question about the FPS. I can see in the log it shows 32.75 it/s, but the Mean CNN inference time is 0.01585113, which means ~63 FPS. Why do they not match? Is there some other latency besides the CNN inference time?

huixiancheng commented 7 months ago

Hi~! @shawnding The KNN time should be added to the total time, since the labels predicted by the CNN alone do not give the 3D results. Moreover, this value may not be very reliable, since it depends on your hardware and the surrounding computation code.

shawnding commented 7 months ago

Thanks for your kind reply~ However, adding the CNN time and the KNN time together gives about 0.0186 s (~54 FPS), which still doesn't match 32.75 it/s. Is there any other overhead, such as data processing, or am I missing something here?

huixiancheng commented 7 months ago

Hi~ @shawnding Do you mean the tqdm value and the FPS results from Feng? I guess part of the difference comes from data processing, torch.cuda.synchronize(), and saving the results. Here is the code he may have used: https://github.com/huixiancheng/CENet/blob/9a84103d186a1f93637cae3d96426760deb04140/modules/user.py#L126-L220 By the way, do you think the tqdm iteration time is meaningful? It only depends on how much work your for loop does.
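
To make the gap concrete, the numbers from the log at the top of this thread can be worked through directly. The sketch below only does arithmetic on the logged values. Attributing the leftover time to data processing, synchronization and saving results is the guess made in this reply, not something measured here, and the variable names are illustrative.

```python
# Values copied from the log posted at the top of the thread.
cnn_s = 0.01585113      # mean CNN inference time per frame, in seconds
knn_s = 0.00275165      # mean KNN post-processing time per frame, in seconds
tqdm_it_per_s = 32.75   # loop rate reported by tqdm


def fps(seconds_per_frame):
    """Convert a per-frame time in seconds to frames per second."""
    return 1.0 / seconds_per_frame


cnn_fps = fps(cnn_s)                    # network forward pass only
cnn_knn_fps = fps(cnn_s + knn_s)        # network plus KNN post-processing
loop_s = 1.0 / tqdm_it_per_s            # wall-clock time of one loop iteration
overhead_s = loop_s - (cnn_s + knn_s)   # time spent outside the timed sections

print(f"CNN only:       {cnn_fps:.1f} FPS")
print(f"CNN + KNN:      {cnn_knn_fps:.1f} FPS")
print(f"Loop iteration: {loop_s * 1000:.1f} ms")
print(f"Untimed part:   {overhead_s * 1000:.1f} ms per frame")
```

So roughly 12 ms of each 30.5 ms iteration falls outside the two timed sections, which is why the tqdm rate is so much lower than either FPS figure.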

shawnding commented 7 months ago

That makes sense now. Thanks for the elaboration!