Closed ModarD closed 5 years ago
Would you mind providing more details? How did you test the speed? Could you provide a small script to reproduce this?
I wrap the network forward pass with time.time(); for example, in SAN I wrap line 60 https://github.com/D-X-Y/landmark-detection/blob/4cd4531d1088044a80a22e9d7e5c9f91d21df988/SAN/san_eval.py#L60 like this:
t1 = time.time()
batch_heatmaps, batch_locs, batch_scos, _ = net(inputs)
t2 = time.time()
t = t2 - t1
print('{:0.5f}s'.format(t))
It could be caused by the GPU warmup procedure. Try running the forward pass 100 times as warmup, then run another 50 forward passes and average the time.
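A minimal sketch of that warmup-then-measure pattern. `FakeNet` below is a hypothetical stand-in for `net(inputs)` whose first call is artificially slow, mimicking one-time GPU/CUDA initialization cost; with a real PyTorch model on CUDA you would also want to call torch.cuda.synchronize() before reading the clock, since GPU kernels launch asynchronously:

```python
import time

def benchmark(fn, warmup=100, runs=50):
    """Call fn() `warmup` times untimed, then return the average of `runs` timed calls."""
    for _ in range(warmup):
        fn()
    start = time.time()
    for _ in range(runs):
        fn()
    return (time.time() - start) / runs

# Hypothetical stand-in for net(inputs): the first call pays a one-time
# setup cost, later calls are fast -- like CUDA context initialization.
class FakeNet:
    def __init__(self):
        self.initialized = False
    def __call__(self):
        if not self.initialized:
            time.sleep(0.2)  # simulated one-time warmup cost
            self.initialized = True

avg = benchmark(FakeNet())
print('{:0.5f}s'.format(avg))
```

Because the slow first call is absorbed by the warmup loop, the reported average reflects only steady-state speed.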
Thank you! That was indeed the issue.
for i in range(100):
    t1 = time.time()
    batch_heatmaps, batch_locs, batch_scos, _ = net(inputs)
    t2 = time.time()
    t = t2 - t1
    print('{:0.5f}s'.format(t))
The output is:
2.48314s
0.05905s
0.05862s
0.05740s
0.05713s
...
Thanks!
No worries.
Which project are you using?
SAN or SBR
I ran SAN and SBR on Google Colab with a Tesla K80 GPU and CUDA V10.0.130, but the execution time is always longer on GPU than on CPU.
SAN: GPU = 2.49453s, CPU = 1.21520s
SBR: GPU = 6.39389s, CPU = 1.90000s
Any idea what could cause this issue?
Thanks!