jeakidd opened this issue 3 years ago
@jeakidd When testing the speed of a network, we measure only its network inference time, excluding data loading, data preparation, and result saving, because those operations depend on the CPU, disk, etc., not the GPU. Hence, please test only the network inference time. You can use code like this:
import numpy as np
import torch
import time

def computeTime(model, device='cuda'):
    inputs = torch.randn(30, 3, 336, 336)
    if device == 'cuda':
        model = model.cuda()
        inputs = inputs.cuda()
    model.eval()

    time_spent = []
    for idx in range(100):
        start_time = time.time()
        with torch.no_grad():
            _ = model(inputs)
        if device == 'cuda':
            torch.cuda.synchronize()  # wait for cuda to finish (cuda is asynchronous!)
        if idx > 10:
            time_spent.append(time.time() - start_time)
    print('Average speed: {:.4f} fps'.format(30 / np.mean(time_spent)))

torch.backends.cudnn.benchmark = True
from Models.SAMNet import FastSal
model = FastSal()
computeTime(model)
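A brief note on the script above (editor's observation, not from the repo author): the first ten or so iterations are discarded as warm-up, which absorbs GPU initialization and the cuDNN autotuning triggered by torch.backends.cudnn.benchmark = True, and torch.cuda.synchronize() is called before reading the clock because CUDA kernels launch asynchronously; without it the measured time would be misleadingly small.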
Thank you very much for your reply. I see that the batch size of your input is 30. With batch size 30 I get an average speed of 667.8785 FPS, but with batch size 1 the average speed is 21.9719 FPS. Did you use a batch size of 30 when testing FPS in your paper?
@jeakidd Yes, I use a batch size of 30 to make better use of the GPU.
I see. Thank you very much for your answer.
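For anyone trying to reproduce both figures discussed above, a minimal sketch of the author's script with a configurable batch size could look like the following; the compute_fps name and the batch_size, iters, and warmup parameters are mine, not from the repo.

import time
import numpy as np
import torch

def compute_fps(model, batch_size=30, device='cuda', iters=100, warmup=10):
    # Same idea as the snippet above, but with a configurable batch size.
    inputs = torch.randn(batch_size, 3, 336, 336)
    if device == 'cuda':
        model = model.cuda()
        inputs = inputs.cuda()
    model.eval()

    time_spent = []
    for idx in range(iters):
        start_time = time.time()
        with torch.no_grad():
            _ = model(inputs)
        if device == 'cuda':
            torch.cuda.synchronize()  # cuda is asynchronous; wait before stopping the clock
        if idx >= warmup:  # skip warm-up iterations
            time_spent.append(time.time() - start_time)
    fps = batch_size / np.mean(time_spent)
    print('batch size {}: {:.4f} fps'.format(batch_size, fps))
    return fps

# e.g. compare the two settings discussed above:
# compute_fps(FastSal(), batch_size=30)
# compute_fps(FastSal(), batch_size=1)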
This is a valuable paper. I tried to run your code, but the FPS on a 3090 GPU is low. The log shows:
"Saliency prediction for ECSSD dataset [1000/1000] takes 0.050s per image"
Is there something wrong with my configuration? I look forward to your reply.
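As a rough cross-check (editor's note, not the repo author's reply): 0.050 s per image corresponds to 1 / 0.050 ≈ 20 FPS, which is close to the batch-size-1 figure of 21.97 FPS reported earlier in this thread. The per-image time printed by the test script also includes result saving, whereas the paper's FPS is measured on inference only with a batch size of 30, so the two numbers are not directly comparable.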