yun-liu / FastSaliency

Code for "SAMNet: Stereoscopically Attentive Multi-scale Network for Lightweight Salient Object Detection" and "Lightweight Salient Object Detection via Hierarchical Visual Perception Learning"

Confusion about FPS #1

Open jeakidd opened 3 years ago

jeakidd commented 3 years ago

This is a valuable paper. I tried to run your code, but the FPS on a 3090 GPU is low, as shown below:

“Saliency prediction for ECSSD dataset [1000/1000] takes 0.050s per image”

Is there something wrong with my configuration? I look forward to your reply.

yun-liu commented 3 years ago

@jeakidd When testing the speed of a network, we measure only the network inference time and exclude data loading, data preparation, and result saving, because those operations depend on the CPU, disk, etc., not the GPU. So please time only the network inference. You can use code like this:

import time

import numpy as np
import torch

from Models.SAMNet import FastSal


def computeTime(model, device='cuda', batch_size=30):
    # Random input: only the forward pass is timed, no data loading or saving.
    inputs = torch.randn(batch_size, 3, 336, 336)
    if device == 'cuda':
        model = model.cuda()
        inputs = inputs.cuda()
    model.eval()  # use inference behaviour for layers such as BatchNorm

    time_spent = []
    for idx in range(100):
        start_time = time.perf_counter()
        with torch.no_grad():
            _ = model(inputs)

        if device == 'cuda':
            torch.cuda.synchronize()  # wait for cuda to finish (cuda is asynchronous!)
        if idx > 10:  # treat the first iterations as warm-up and skip them
            time_spent.append(time.perf_counter() - start_time)
    print('Average speed: {:.4f} fps'.format(batch_size / np.mean(time_spent)))


torch.backends.cudnn.benchmark = True

model = FastSal()
computeTime(model)
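
To see how much of a per-image figure such as the 0.050s reported above comes from the network itself, one option is to time each stage of the evaluation loop separately. The following is a minimal sketch under stated assumptions, not the repository's test script: `profile_stages` and the `fake_*` stages are hypothetical stand-ins that only sleep, and for a real CUDA model the `infer` callable would need to call `torch.cuda.synchronize()` before returning so that the timing is meaningful.

```python
import time


def profile_stages(items, load, infer, save):
    """Accumulate wall-clock time separately for each stage of an evaluation loop."""
    totals = {"load": 0.0, "infer": 0.0, "save": 0.0}
    for item in items:
        t0 = time.perf_counter()
        data = load(item)
        t1 = time.perf_counter()
        pred = infer(data)
        t2 = time.perf_counter()
        save(item, pred)
        t3 = time.perf_counter()
        totals["load"] += t1 - t0
        totals["infer"] += t2 - t1
        totals["save"] += t3 - t2
    return totals


# Hypothetical stand-in stages; the sleep durations are arbitrary.
def fake_load(index):
    time.sleep(0.002)  # pretend to read and preprocess an image
    return index


def fake_infer(data):
    time.sleep(0.001)  # pretend to run the network
    return data * 2


def fake_save(index, pred):
    time.sleep(0.002)  # pretend to write the saliency map to disk


num_images = 20
totals = profile_stages(range(num_images), fake_load, fake_infer, fake_save)

inference_fps = num_images / totals["infer"]
end_to_end_fps = num_images / sum(totals.values())
for stage, seconds in totals.items():
    print("{:>5}: {:.4f}s per image".format(stage, seconds / num_images))
print("inference-only: {:.1f} fps, end-to-end: {:.1f} fps".format(inference_fps, end_to_end_fps))
```

With these stand-ins the end-to-end rate is always lower than the inference-only rate, which is the gap between a whole-loop "per image" log line and a pure network speed measurement.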

jeakidd commented 3 years ago

Thank you very much for your reply. I see that the batch size of your input is 30. With that setting I get an average speed of 667.8785 FPS, but with a batch size of 1 the average speed drops to 21.9719 FPS. Did you use a batch size of 30 when testing the FPS reported in your paper?

yun-liu commented 3 years ago

@jeakidd Yes, I use a batch size of 30 to make better use of the GPU.
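
As a rough check on the two figures quoted in this thread, FPS can be converted back into the time taken by one forward pass. This is only an arithmetic sketch; `batch_latency` is a helper name introduced here for illustration, and the interpretation is one plausible reading rather than a measured result.

```python
def batch_latency(batch_size, fps):
    """Seconds per forward pass, given throughput in images per second."""
    return batch_size / fps


# Throughput figures reported earlier in this thread.
fps_batch_30 = 667.8785
fps_batch_1 = 21.9719

lat_30 = batch_latency(30, fps_batch_30)
lat_1 = batch_latency(1, fps_batch_1)
speedup = fps_batch_30 / fps_batch_1

print("batch 30: {:.1f} ms per forward pass".format(lat_30 * 1000))
print("batch  1: {:.1f} ms per forward pass".format(lat_1 * 1000))
print("throughput ratio: {:.1f}x".format(speedup))
```

Both forward passes come out at roughly 45 ms, so processing 30 images costs about the same wall-clock time as processing one. That is consistent with the GPU being far from saturated at a batch size of 1, where fixed per-pass overhead likely dominates.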

jeakidd commented 3 years ago

I see. Thank you very much for your answer.