Hi, thanks for your great work!
When I tested the latency on a V100, the results confused me.
I used the following code to measure the entries of the latency table:
    torch.cuda.empty_cache()
    img_L = img_L.cuda()
    start.record()
    out = ofa_network(img_L)
    end.record()
    torch.cuda.synchronize()
    run_time.update(start.elapsed_time(end))
Here img_L is a single image (batch size 1).
Is this the correct way to measure the latency?
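The snippet does not show how run_time is defined. For context, here is a minimal sketch of the kind of running-average meter it is presumably meant to be. The class name AverageMeter, its fields, and the sample latencies are assumptions for illustration, not taken from the repository.

```python
class AverageMeter:
    """Hypothetical helper: tracks the latest value and the running mean."""

    def __init__(self):
        self.val = 0.0    # most recent measurement
        self.sum = 0.0    # weighted sum of all measurements
        self.count = 0    # number of measurements seen
        self.avg = 0.0    # running mean

    def update(self, val, n=1):
        # Record one measurement (optionally weighted by n) and refresh the mean.
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


run_time = AverageMeter()
for ms in (12.0, 10.0, 11.0):  # made-up latencies in milliseconds
    run_time.update(ms)
print(run_time.avg)  # 11.0
```

With a meter like this, each call to run_time.update(start.elapsed_time(end)) adds one measurement in milliseconds, and run_time.avg is the value that would go into the table.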